Programming Throwdown - 176: MLOps at SwampUp
Episode Date: September 24, 2024

James Morse: Software Engineer at Cisco
  System Administrator to DevOps
  Difference between DevOps and MLOps
  Getting Started with DevOps
Luke Marsden: CEO of HelixML
  How to start a business at 15 years old
  BTRFS vs ZFS
  MLOps: the intersection of software, DevOps, and AI
  Fine-tuning AI on the Cloud
  Some advice for folks interested in MLOps
Yuval Fernbach: CTO MLOps @ JFrog
  Starting Qwak
  Going from a Jupyter notebook to production
  ML Supply Chain
  Getting started in Machine Learning
Stephen Chin: VP of DevRel at Neo4j
  Developer Relations: The Job
  What is a Large Language Model?
  Knowledge graphs and the Linkage Model
  How to Use Graph Databases in Enterprise
  How to get into MLOps

★ Support this podcast on Patreon ★
Transcript
Programming Throwdown, Episode 176: MLOps at SwampUp. Take it away, Jason.
all right folks so this is going to be a bit of a different
episode than folks are used to. I'm actually at the SwampUp conference, which is hosted by JFrog.
And, you know, probably the number one email or message that we get is:
can you explain DevOps and MLOps? Can I get a job in these areas? People see it as a really good gateway to getting into tech.
And we get tons of requests for it.
And so I had this amazing opportunity to talk to four different people with very different backgrounds
who all kind of ended up in this discipline and can share their stories with us
and also explain the technology, explain what DevOps is, explain what MLOps is. And so, you know, we're kicking this off with James Morse from Cisco. So
thanks for coming on the show, James. No, thanks for having me. Cool. So why don't you kind of tell
us, actually, let's start off with, you know, what do you do at Cisco? What is your title? And what
are you responsible for? Sure. So probably my title
is
technically DevOps engineer
categorized right now under
software engineer, which
is a perfect segue because of the blurred lines
versus everybody's, you know, constant
battle like you're referring to.
But yeah, so DevOps engineer
for a team currently called
Enterprise DevOps as a Service.
So we actually provide DevOps services to other teams within Cisco, try to ease that segue for them, whether they aren't quite on a DevOps path yet, or just need specific services that fall under that umbrella.
And just supporting those systems and helping those teams.
Cool, that makes sense. So give us a little bit of your background.
Like, how did you get into where you are now?
Sure. Yeah.
So I went to Guilford Technical Community College,
which is in North Carolina,
and got a degree in both computer information systems,
sort of general IT degree,
all kinds of stuff under that umbrella,
server administration, Windows server, Linux systems, and then also decided on the parallel path for getting the networking technologies
degree, which coincidentally was most focused on Cisco hardware.
So routing, switching, kind of think of the path for getting like a CCNA certification.
Now, what is CCNA?
Cisco Certified Network Associate.
They've kind of floated around. It's just Associate, or Admin; there are a few different ones, and then there's CCNP for Professional. Different layers, and, you know, it kind of starts at the bottom. I think now they've split it up again, but I've long since been in that world, so I'll embarrass myself as an actual Cisco employee now. But yeah, it's the entry level, at least at the time for me, and I did get that degree. It was, again, your basic, like, setting up routers, which is under Cisco's umbrella. So, you know, you learn how to do subnetting by hand, and do it literally with a Sharpie on a dry erase board.
Oh, wow.
Yeah.
Oh, wow.
Yeah.
I was pretty intimidated,
but you learn a lot
and it's a lot
that still helps me today,
you know,
configuring systems and having that understanding
of IP communications and setting up different subnets
and gateways and stuff.
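(For anyone following along at home, the kind of subnetting math James describes can be sanity-checked with Python's standard ipaddress module; the addresses below are just made-up examples.)

```python
import ipaddress

# Carve a /24 into four /26 subnets, the kind of exercise you'd work out by hand.
network = ipaddress.ip_network("192.168.10.0/24")
for subnet in network.subnets(new_prefix=26):
    hosts = list(subnet.hosts())
    print(subnet, "first usable:", hosts[0], "usable hosts:", len(hosts))

# Check which subnet a given address falls into.
addr = ipaddress.ip_address("192.168.10.130")
print([s for s in network.subnets(new_prefix=26) if addr in s])
```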
Cool.
And so at some point you were like a system administrator
and then you transitioned to DevOps.
So what is the difference there?
What did that transition look like?
Like what's the... Yeah, so when I first started,
I graduated around 2009
and then quickly was placed at a company
my title was VoIP support, something along those lines. And it ran the gamut. It was pretty much somewhere between support technician, taking first-level calls, and, because it was a smaller company, doing what traditional sysadmin work was then. The systems were all Linux based, they were CentOS at the time, and, you know, on top of that you had different applications installed for running various things, mostly voice over IP software. So all kinds of system administration
at that time,
like mostly at that company,
it was,
they were wanting somebody
with a Linux background.
So Linux sysadmin,
typically,
and probably still to some degree today,
sysadmin work is usually split
somewhere between like
you're specializing maybe in Linux,
Unix,
all those types of systems
to the more Windows side, Microsoft side, Microsoft Server,
those types of things.
Obviously now it kind of runs the gamut
because you have cloud services and things like that.
So you could still technically maybe be system admin
or some type of cloud engineer
and not really be a true DevOps or DevOps engineer.
The transition there was, I think, about a year into that company
was when I started hearing the buzzwords about cloud.
Like the CEO I worked under
was very much trying to, you know,
keep his finger on the pulse of things
and like to be, you know, very bleeding edge
as far as any kind of new trends
and try to, you know, capitalize on that,
help people that were trying to maybe get to the cloud.
What did the cloud even mean at the time? It was sort of, you know, almost the DevOps of now. People who are in the industry, they're in cloud. Now you might hit somebody who's not in the industry who kind of gets what that means, but I rarely run into somebody who's not in the industry who's just like, oh, I know what DevOps is. Right? So cloud then was sort of what the DevOps term is now. It was the buzzword.
So that very quickly led into
some blurred lines
of what I would consider CICD,
continuous integration and delivery or deployment.
And that still wasn't truly what I would call DevOps,
but it was starting to go the route of more automation.
So less like, okay, I need to deploy something.
That might involve opening up something like WinSCP
and copying some files over at the right time.
Okay, we're going to deploy on this Sunday
at some weird hour, 11 p.m.
And then copy these files over,
maybe restart some services.
Hopefully you don't have a lot of those.
If you're a smaller company, maybe you have to do that, you know,
in some for loop for, you know, hours if you do have a lot.
So that transitioned to things like, you know, a lot more scripted stuff.
Things like Ansible were starting to come up where you can just, you know,
write some playbooks, run those playbooks,
and those things would happen automatically.
And so that sort of blurred the line for me, at least, in getting into DevOps,
where it was more, okay, you've got your code and hopefully a good SCM,
something most famously like GitHub, and maybe I have
a branch, something like production or main, something along those lines.
I'm working in my feature branch, and I get
merged into dev and that
dev eventually maybe gets
reviewed, hopefully gets
reviewed by one or two people
and so on until it maybe is
slated to be merged for production.
And then in my
idea in true DevOps,
there's
at least at some point where
you're going to merge that into your production branch, whatever that's called.
And a lot of things are going to start kicking off automatically.
So instead of this guy who's just waiting for whatever time, frantically logging into systems and clicking and dragging if it's UI based, or running for loops in something like a Linux CLI or terminal, it should be a lot more automated, and maybe have some checks and balances. So if this thing fails, don't do this step, and kick off an email, or, a lot of people use Slack or things like that, at Cisco we use WebEx, and maybe it posts into there that something succeeded or failed. But there shouldn't be a whole lot of manual intervention, if any in a really ideal situation, or what some people have deemed "click ops," where you're just in there manually fixing or changing things.
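(As a rough sketch of the kind of check-and-notify automation James is describing, here's a toy Python example; the webhook URL and deploy script are hypothetical placeholders, not anything Cisco actually uses.)

```python
import json
import subprocess
import urllib.request

WEBHOOK_URL = "https://example.invalid/notify"  # hypothetical chat webhook (Slack/WebEx style)

def notify(message: str) -> None:
    # Post a short status message to the team channel instead of relying on someone watching a terminal.
    payload = json.dumps({"text": message}).encode()
    req = urllib.request.Request(WEBHOOK_URL, data=payload,
                                 headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req)

def deploy() -> None:
    # Placeholder deploy step; in a real pipeline this might be a playbook run or a container rollout.
    result = subprocess.run(["./deploy.sh", "production"], capture_output=True, text=True)
    if result.returncode != 0:
        notify(f"Deploy FAILED: {result.stderr[:200]}")
        raise SystemExit(1)  # block the rest of the pipeline
    notify("Deploy succeeded")

if __name__ == "__main__":
    deploy()
```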
That makes sense. So would it be fair to say that system administration is the part that is left over,
because it's either not worth automating or you just haven't automated it yet.
And DevOps is this automatic tool chain that covers the rest.
Is that like the right way to frame it?
Yeah.
So that's why a lot of people, you know, kind of get into this philosophical argument: there were so many things that came about around the same time. When I first started, all the things now that sort of fit together, like DevOps, CI/CD, even agile development, were barely even being talked about. I mean, again, there were some light buzzwords like cloud, and some of those ideas were being pursued, but not with that phrasing or those terms.
And literally within my career,
watch it actually evolve into those things
where people are using those terms daily.
So a lot of those things in my mind get blurred
because they sort of all came out at the same time.
Although technically they're completely separate.
You could be doing Agile for example,
but not actually be
using any DevOps services.
I can't imagine that world.
I've never seen it in person.
I've seen maybe DevOps not being used
ideally or have some pieces
missing that would make it better.
Similar to
CICD, you could be technically doing
infrastructure as code,
like something like Ansible,
and maybe even doing it with CICD
or like the Git flow that I was talking about,
but not really still be doing
all the other parts that come with DevOps.
So they don't have to go together,
but sort of the complete package in my mind
is doing something like using an agile development
and then kind of pairing that with CI CD and DevOps
workflows. But in general, the idea is, of course, that you're using the most common tools; for DevOps that would be something like GitHub and GitHub Actions, right? So now I've got that merge I was talking about, and that's getting kicked off maybe by an Actions workflow. And that Actions workflow
just means, like, for example, to use one of the examples from earlier, you know, at SwampUp I think in one of the trainings they were using something pretty common, Node, Node.js, NPM. So that NPM build might actually be kicked off by an Action, instead of somebody sitting there running all these NPM commands and then, you know, pushing that somewhere, maybe like JFrog Artifactory, and then some system pulling that down again to be in production.
You could do all that manually
and be still using those tools.
So DevOps isn't just using those tools, right?
CICD isn't just having that flow.
Again, there's this chart, or visualization rather, that usually comes up, where it almost looks like an infinity symbol: as you have that deployment, you want that immediate feedback, so as soon as you've got a deployment you're already sort of on the path to the next release, because it's this smooth transition, right? So you can have lots of minor iterations and sort of break the barrier that existed when I started as a traditional sysadmin. I rarely talked to any developers unless there was a problem, right? Systems down or whatever. That was sort of the typical MO, versus, oh, I just need maybe some change or some help on Actions.
There's a problem.
I need new permissions or something. But if they're given all the right permissions, and they've got, you know, management and checks and balances on things like PR reviews to get into that automated production workflow, then that's where that DevOps term, in my mind, kind of comes out, because the developer side, the development side, and operations, ops, are sort of blended,
because as a developer,
even though maybe I don't know too much about
the kind of stuff I used to do,
Linux administration, Linux OS,
and how these files get where they need to be.
Do they have executable bits?
I'm not worried about all that.
I just know that if I merge this PR to main production,
whatever it is, that all these things kick off.
And if there's a problem, I don't
understand, maybe I call, you know, DevOps support
or whoever it is, maybe
it's even just a technical lead or somebody
that knows, like, oh, that error just means this,
you know, start a new
feature branch and fix
this and try it this way and that.
So it kind of takes that
out of the aspect, at least in that
flow. It doesn't mean that nobody's setting up those systems
that are running it,
because that's kind of where my current role as a sysadmin comes in. Like, we're deploying things; we're here at JFrog SwampUp.
My main role is managing, deploying,
and maintaining Artifactory itself.
And that's still, in some cases, on a server.
We both have a server in the cloud
and all kinds of different deployments.
Those still have to get deployed by somebody.
But once all those things are kind of interconnected,
they allow that continuous workflow
for that developer to not have to worry about those things.
Yeah, that totally makes sense.
I remember, like, I've seen this through my own career. I mean, I've mostly been kind of on the research side, but my wax wings have put me a little bit too close to the sun every now and then; I have been burned by the production gods every now and then as we go. And, you know, in the earlier days of my career they would build a new build every week, and it would be a manual process, and there was an IRC channel and you would just get pinged, like, hey, you made this change and this file doesn't compile, go take care of it. And then, as you said, it's become more and more automated, to where most companies are just pushing new versions all the time, and they're even automating the failures, so the contingencies are all automated. And so now what you're doing is just orchestrating this self-healing machine, which is really impressive to watch and to be a part of. So now, as you said, it's more like you get a Slack notification: hey, this file that you changed is actually being used in this other system you might not have even known about, and it broke it, and so you need to revert it, that kind of thing. But it's very automated now, which is really cool.
Absolutely. I think automation is a huge part of it, for sure.
And then it's also the sort of the interlinking
between those automations.
Because automation's predated DevOps
and the terminology there.
But I think the true thing is,
kind of going back to that infinity loop diagram
or visualization, whatever you want to call it,
that a lot of people bring up.
And I've seen it a couple times even here this year
and last year.
I was lucky enough to go to SwampUp last year.
And similar,
it's that how they're all intertwined now
when they're all playing together well
and then they've been configured right
and you're taking advantage
of all the modern features.
You really shouldn't have to be doing much
unless there's, again,
there's a problem.
In a lot of cases now, a lot of what has been showcased, both at last year's SwampUp and this year, is a big focus on security. So there's the term now DevSecOps, and several similar-type terms, but the idea is that somewhere in that flow we were just talking about, something should be scanning, right? Some failure may not actually be that somebody technically did something wrong; they just did something they shouldn't have. You know, maybe there was a secret in something, maybe there was an old version used for something. Going back to that NPM example, maybe they used a version that has a vulnerability in it, so you want to see that fail. So somewhere in that pipeline you either have a plug-in or some version of, you know, if it's GitHub Actions, maybe there's an Action for it. Since we're here at JFrog, they have their own tool called Frogbot that has a lot of integration with Artifactory and Xray, where it's making sure those vulnerabilities don't exist. And you can even have it block the pipeline, or maybe just email somebody if it's the type where maybe it's just a development flow and you don't want it to block, but you do want somebody to know, hey, this should not go to production.
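(Here's a toy Python sketch of that kind of pipeline gate, warn-or-block on a bad dependency; the vulnerability list and package versions are made up, and this is not Frogbot's or Xray's actual behavior or API.)

```python
# Toy dependency gate: fail the build on known-bad versions, or just warn in development flows.
KNOWN_VULNERABLE = {("lodash", "4.17.20"), ("requests", "2.5.0")}  # made-up example data

def check_dependencies(deps: dict, block: bool = True) -> bool:
    """deps maps package name -> pinned version; returns True if the build may proceed."""
    findings = [(name, ver) for name, ver in deps.items() if (name, ver) in KNOWN_VULNERABLE]
    for name, ver in findings:
        print(f"WARNING: {name}=={ver} has a known vulnerability")
    if findings and block:
        raise SystemExit("Blocking pipeline: vulnerable dependencies found")
    return not findings

# In a development flow you might warn but not block:
check_dependencies({"lodash": "4.17.20", "left-pad": "1.3.0"}, block=False)
```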
That makes sense.
What about like, you have this really expansive background in DevOps.
And so how do you see MLOps from your perspective?
What is the sort of delta there?
What is the part,
like,
what's the leap
from one to the other?
Yeah,
it's a really interesting one
because I have,
I've been learning a lot.
There's been a lot of focus on that at this SwampUp.
So,
I've learned a lot.
The keynote was very good
in that.
I think they did a good job
with their diagrams
on literally showing
those two differences.
So,
for my background, since I was more sysadmin, like a lot of people, it's interesting, especially with the term. Some people come in more from the development side, want to do more operations, and they get into DevOps from that. Obviously, I'm in the other boat, which is people with sysadmin backgrounds, or some version of operations, getting in, and then, you know, the development side kind of comes in there. So, not having much background on the ML side or AI, I can at least say that what I've learned already, even within today's talks, is that there are going to be a lot of different workflows. There are a lot more parts and pieces. With everything I was just talking about, you're not just going to have, like, oh, I've got my Git repo, I've got my code, I push it here, it runs some commands, it builds stuff, and eventually, as long as it's got the right Actions, or if you're using Jenkins and you've got the right Jenkins plugins, it's going to get my app deployed.
Not to say that's simple, or to downplay some of the more advanced apps, but when you're talking MLOps, there's a lot more at play. Like, am I using a public model from something like Hugging Face, or something being developed in-house completely from scratch, if you have the resources for that? Obviously there's so much out there that the latter is a lot less common, but still plenty of people do it, especially depending on how sensitive the work is. So there are so many more moving pieces, but I think the idea is still the same. You're just going to see, especially as JFrog is already integrating with some of that, that it shouldn't be as awkward as it was in its infancy, where a lot of that was going to be customized stuff. Now it should be, again, a very similar workflow. But my short answer would certainly be that there are going to be a lot more pieces there that you're going to have to integrate. Even just from a storage perspective, the models are a lot larger than just pulling maybe some Python libraries. A lot of people have worked with Python; they've run an import, it doesn't crash your machine, doesn't fill up their hard drive, right? It doesn't take many models to fill up at least a modern slim notebook. You might just have, you know, 500 gig or something on your hard drive; maybe three or four models could easily fill that, if not just one large one. So now you're thinking, okay, we're going to have these properly set up to be stored correctly and optimized. So there's a lot more at play, at the very least, than just standard DevOps work.
That makes sense.
And so we have a lot of folks listening
who are just starting their careers.
What advice would you give for people?
Let's say folks are in high school
or maybe folks are thinking of going back to college
and they're deciding between that and a coding bootcamp
or learning on the fly.
What advice would you give for young folks
who are just getting started
if their goal is to get into a DevOps career path?
Yeah, I've seen that question a lot,
even a couple of friends personally
have asked that question, so I've given it a lot of thought.
I think depending on what your background is,
it's going to really sort of change that answer, at least to some degree.
But I think the one common thing is certainly to just not take for granted the wealth of information that's out there, whether it's free resources like YouTube, tons of great videos that teach you all the different pieces of things like GitHub, GitHub Actions, Jenkins, all the different tools. I wouldn't necessarily get too caught up in, well, I need to master a specific language, whether it's Python or whatever, but it's going to take at least a common core understanding of some programming language, having the basics of that, because no matter what you're doing, even on the side that I'm on, you know, maintaining something like Artifactory, you're going to have some language, whether it's for scripting or for automation. So have some good basics of that, and then also, again, take advantage of all the information that's out there on things like YouTube and Udemy that are very affordable, very accessible. When I started, you would go find the closest relevant O'Reilly book and have it by your desk. It wasn't uncommon to see a few O'Reilly books, or a whole shelf of them, in whatever team's room you worked in, and you'd go grab one off the shelf. Not to say that the information wasn't available, but that was still sort of the habit, instead of searching and retrieving from various posts on things like Stack Exchange. At least early on, it may have been more difficult than now, especially with things like ChatGPT or anything GenAI.
So don't take for granted the amount that is out there.
Just use that to your advantage
and at least learn
those common tools,
GitHub and some type of version,
something like Actions or Jenkins
that help you automate those things
and just start playing around.
A lot of those things right now
are free for open source and
students and things like that.
I should be able to at least break in
at minimal cost or
even free in some cases.
Yeah, totally.
I use GitHub Actions extensively for a lot
of my open source stuff and they haven't charged
me yet unless there's some mounting bill somewhere.
Right, yeah.
Unless I get a collections call tomorrow, we'll see.
Yeah, so that's the great thing.
When you're learning, don't be frustrated when you break things.
Part of it is see how you can break it.
Break it different ways and learn how those broken things are solved
and how you can figure out, change ways to keep the automation going
even if it hits something that's not critical; maybe it was meant to continue on anyway. And that's sort of the idea: don't get frustrated at things failing. Figure it out and use that as a learning opportunity, basically.
Yeah, definitely. This is great. So I think one common thread that we've talked about on the show, that we're just double clicking on here, is, you know, build cool stuff. That's basically the short of it. If you build awesome stuff, you're going to have to maintain it. I guess one thing with DevOps is, you'll have to build something for other people, or build something where you need to have a process. But even if you don't, I mean, you're still going to get
a great experience as a developer
and that will catapult you
into a DevOps career somewhere.
Yeah, absolutely.
Cool. Hey, James, I know you have to rush.
Thank you so much for taking the time
to talk to us.
I really appreciate it.
And if I get any questions for you
from the audience,
I'll shoot you an email.
Yeah, always happy to help. And it was great being here; I appreciate you having me.
Cool.
Thanks a lot.
Hey, everybody.
So we are here at SwampUp with Luke Marsden, who's the CEO of HelixML.
Thanks for coming on the show, Luke.
It's great to be here.
Cool. So why don't you give people a little bit of a background into how you kind of arrived
into starting Helix ML and what's your kind of backstory?
Yeah, absolutely. So I'm a startup guy. I've been doing startups my whole life. This is startup
number three. Back when I was 15, I started a web hosting company. Wow. And I then went and did computer science at Oxford.
And out of the back of that experience,
I was really inspired to try and solve some of the practical problems
we had in the web hosting company.
And I did that by building a distributed web cluster.
And then that evolved into,
we ended up pivoting that business into solving storage for Docker
because when Docker exploded, we were like, we already have all of this tech for dealing with stateful containers, because we were using FreeBSD jails, and so we just applied that technology to Docker.
Well, let's dive into that a little bit, because we have a lot of high school folks listening in. So, 15 years old, starting a company: how does that work? How do you go to somebody and say, you know, I will build this website for you and you should write me a check? Did your parents help you with that? How do you do that as a high school student?
I mean, I had a co-founder who I met online on IRC, and that's like the old-school way that we used to communicate back in the day.
I used to love IRC.
It was great.
I mean, I guess now it would be Discord, right?
Yes, exactly.
That was the Discord of our era.
Yeah, exactly.
And yeah, we just decided together
to put together a web hosting company.
And the amazing thing about that
was that we were just able to put it online,
put the website up there,
and start telling people about it.
And people showed up.
And one really fun thing happened
that helped us a lot with that business was,
I've forgotten what it's called,
but there's like this old Perl-based blogging framework.
And the guy who created that blogging framework
found our web hosting service.
And then he left his company and then he said,
I'm really happy with my hosting service
in front of like all of his audience.
And so we got a nice boost in traffic from that.
That's amazing.
Yeah, it was very lucky.
I don't know if I've ever told this on the show, but I did a lot in high school as well, and one of the things I built was a little isometric game engine.
Oh yeah.
At the time there weren't a lot of game engines. This was, like, 1997 or something, and so there wasn't a lot in Java, really anything.
Yeah.
And so it got picked up in a book and it got popular.
And I wonder if, I wonder if people can just build things and get them noticed or if that
time has passed.
I feel like now you probably need to do a little bit more promotion.
There's not, there's just so much content out there now.
I mean, I think that's true, but I think if you have a passion for solving a problem, then give it a go. And if anything, there are probably more ways of getting the word out these days. I mean, we didn't have YouTube back then, we didn't have Discord communities, and more people are online now than were then.
So yeah. Another thing is, like, the vanguard has moved. In other words, if you want to make a Java game engine, I'm sorry, there are just too many of them. But there's a Hugging Face LLM leaderboard, and, you know, there's a variety of them, so if you end up with some passion and some talent, you might win the Greek LLM leaderboard.
Absolutely.
So shall I continue my story?
So yeah, I mean
we did the
storage for, we pivoted that
business
hybrid cluster into Cluster hq which was solving
um storage for docker and then we got involved um in the very early docker and kubernetes days back
before um people really knew how to make um containers work with like databases and other stateful services.
So at that point, yeah, we raised $15 million just by walking up and down Sand Hill Road
and saying Docker and storage in the same sentence.
Now, what is Sand Hill Road?
Oh, so it's a road in Palo Alto
where a lot of the VCs live or work.
Yeah.
So, yeah, there's a large concentration of venture capitalists.
Very cool.
I mean, was that intimidating?
I mean, that sounds just like an extraordinary amount of money and responsibility.
It was intimidating.
It was interesting.
And I think I learned a lot from that experience.
I think the biggest thing I learned was that even if you have a large Series A,
you need to be really, really thoughtful about not growing the company too quickly before you
really truly have product market fit. And so, yeah, that was one of the lessons learned. Like
we grew the team quite quickly and it was challenging. But then, beyond that first company, I had another go. We started out with this idea of versioning for development environments.
Okay.
So a lot of my career I've spent trying to find commercial applications of ZFS, which is a very clever bit of file system technology that came out of Sun Microsystems and then got ported to Linux.
And so that attempt to commercializing ZFS was, well, if you've got a development environment
and you manage to reproduce an interesting bug, let's say, on your laptop, then shouldn't you be able to
not just do a git commit of the code at the point at which you can find the bug,
but also take a snapshot of the local development databases
that you have running, so that you could maybe attach a runnable snapshot of that thing
to a GitHub issue, and then another developer could just pull it down and reproduce the bug immediately rather than having to
click around in the UI to
get the database into a certain state.
Turns out
that AI and
machine learning had a much bigger data
versioning problem than DevOps
did and software engineering. And so we
ended up pivoting to
building out this end-to-end MLOps platform
but with that same idea of
versioning your workspace. And so what data scientists, like AI and ML people, often do when they're developing ideas is they use Jupyter notebooks. But when you're using a Jupyter notebook, it's very hard to keep track of your work very accurately, and even the order in which you run cells can affect the output and things like that. So what we did was to add this snapshotting of your state before and after you did a run. If you did anything in the Jupyter notebook that would, like, train a model, for example, then we'd snapshot before and snapshot after. And then we'd also automatically build up this provenance graph, so you could say, oh, I created this model from this data, but this data was transformed from this other data using this process, and so you can kind of recursively build up this tree structure of how you got to that point.
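(A rough illustration of that kind of provenance graph; this is just a toy sketch, not DotScience's actual data model.)

```python
# Toy provenance graph: each artifact records the process that produced it and its inputs.
provenance = {
    "model.pkl":    {"process": "train",     "inputs": ["features.csv"]},
    "features.csv": {"process": "transform", "inputs": ["raw_data.csv"]},
    "raw_data.csv": {"process": "ingest",    "inputs": []},
}

def lineage(artifact: str, depth: int = 0) -> None:
    """Recursively print how an artifact was derived, walking back through its inputs."""
    node = provenance.get(artifact)
    print("  " * depth + artifact + (f"  <- {node['process']}" if node else ""))
    if node:
        for parent in node["inputs"]:
            lineage(parent, depth + 1)

lineage("model.pkl")
# model.pkl  <- train
#   features.csv  <- transform
#     raw_data.csv  <- ingest
```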
So yeah, that company was called DotScience.
I love that name. Is it like the configs were in the .science folder in home directories? Is that how the name came about?
It actually got really confusing, because people would always put, like, a period and then "science," but it was actually "dot" spelled out. So that was a lesson in naming.
Yeah, Department of Transportation science. You can see other ways that could get misconstrued. So, a lot of people might not know this, but ZFS and BTRFS, yeah, "BitRot FS," have this amazing property called copy-on-write. And the way it works is, you've probably done this even for school projects, you might say, I have a document... I mean, nowadays probably everything's in the cloud so it's kind of transparent, but you might say, I have a set of artwork that I'm working on, some digital art, and I'm iterating on it, but I don't want to lose my history, and it doesn't really make sense to create a Git repository. I might not know how to do that. The simplest thing would be to just create a folder for each day and have my Blender files or my Photoshop files copied into each folder. And then that starts to become really expensive, because you might have all these other assets, and they're just all getting copied every single day, and you end up using up all your disk space.
Exactly.
ZFS has a genius idea: it's basically reference counting. So when you copy a file, ZFS doesn't actually copy it, it just creates a shared pointer between those two files. Now, as soon as you change even one bit of either of those files, then ZFS has to make a copy. But if you don't do that, you can have 100, 1,000 copies of the file, and it's not going to increase your storage costs linearly like a regular file system. And so I've always been fascinated by that.
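(A toy Python sketch of that reference-counting, copy-on-write idea; this is a conceptual illustration, not how ZFS is actually implemented.)

```python
# Toy copy-on-write store: "copies" share the same underlying block until one of them is written.
class CowStore:
    def __init__(self):
        self.blocks = {}   # block_id -> bytes
        self.files = {}    # filename -> block_id
        self.refs = {}     # block_id -> reference count

    def write_new(self, name, data):
        block_id = len(self.blocks)
        self.blocks[block_id] = data
        self.files[name] = block_id
        self.refs[block_id] = 1

    def copy(self, src, dst):
        # A "copy" is just a new pointer to the same block; no data is duplicated.
        block_id = self.files[src]
        self.files[dst] = block_id
        self.refs[block_id] += 1

    def modify(self, name, data):
        block_id = self.files[name]
        if self.refs[block_id] > 1:
            # Someone else still references the old block: copy on write.
            self.refs[block_id] -= 1
            self.write_new(name, data)
        else:
            self.blocks[block_id] = data

store = CowStore()
store.write_new("art_day1.blend", b"...big binary...")
store.copy("art_day1.blend", "art_day2.blend")        # instant, no extra storage
store.modify("art_day2.blend", b"...new version...")  # only now is a second block allocated
```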
I really wanted to install BitRotFS on my latest computer that I got about a week ago, but it wasn't supported by Grub, or some other issues. And so, to your point, it's still not totally smooth yet. And it's been around for a long time, unfortunately.
Yeah, well, I mean, Ubuntu embraced ZFS on Linux, fortunately, and so you can install Ubuntu with a ZFS root.
Right.
So, yeah, I encourage anyone who's brave enough to give that a go, and you can do all these cool things. But, I mean, my newest company, HelixML, we run all of the infrastructure for that in a data center in my basement.
Really?
Yeah, because we're bootstrapping the business, so we didn't want to incur a ton of cloud costs. I mean, it's got a fiber link. And yeah, we use ZFS on Linux for all of the production storage for that. And then, actually, rsync.net, this is an interesting piece of ZFS trivia: rsync.net is a backup provider, but they have support for ZFS send into rsync.net, and then they provide that as a service. So what that does is it means that you can take a snapshot every day
of your production system
and it's an atomic, reliable snapshot
that doesn't take up any extra
disk space like you were saying. And then
you can just send the difference between that snapshot
and the previous day's snapshot over
to rsync.net and it will automatically
keep up to date.
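(For the curious, that daily snapshot-and-send flow looks roughly like the sketch below; the dataset name and remote target are placeholders, not Luke's actual setup.)

```python
import datetime
import subprocess

DATASET = "tank/production"           # placeholder ZFS dataset name
REMOTE = ["ssh", "backup@rsync.net"]  # placeholder backup destination

def daily_snapshot_and_send(previous=None):
    """previous is yesterday's snapshot name (e.g. 'tank/production@2024-09-23'), or None."""
    snap = f"{DATASET}@{datetime.date.today().isoformat()}"
    # Atomic and nearly free, thanks to copy-on-write.
    subprocess.run(["zfs", "snapshot", snap], check=True)
    if previous:
        # Send only the blocks that changed between yesterday's snapshot and today's.
        send = subprocess.Popen(["zfs", "send", "-i", previous, snap], stdout=subprocess.PIPE)
        subprocess.run(REMOTE + ["zfs", "recv", "backups/production"],
                       stdin=send.stdout, check=True)
        send.wait()
    return snap
```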
Anyway, we've ended up falling down a file system rabbit hole.
No, that's fine. Okay, one last file system question, because it's just burning in my mind, and feel free to just cut me off and take a turn out of this rabbit hole. What's the difference between ZFS and BitRot FS? Are they totally different things, or are they two people trying to do the same thing?
Well, I like how you describe it as Bitch Rot FS.
I think that's what it is, right?
Or no?
I think that's a joke.
Oh, really?
I thought that was the official...
I think it's called Butter FS.
Oh, you are totally calling it...
But I think you fell for it.
You know, I have to admit,
I've said this on the show a lot of times,
I am known to be the most gullible person.
You know, different people have their
superhero strengths and weaknesses.
Like I fall for everything all the time.
So I'm not surprised.
Well, no, I mean, I don't mean it in a negative way.
I just think it's, I can imagine someone on the internet calling it
bitch rot FS as a joke, because it's like bit rot.
It's actually, I looked it up.
It's better FS.
Yes.
That is the official name.
Or BtreeFS, which probably makes the most sense, right?
Yeah.
Well, I think BitrotFS is actually quite apt because it's, yeah, I mean, that project has
been plagued with data loss issues and I've never really trusted it
to actually be reliable.
Whereas ZFS was engineered at Sun Microsystems very nicely.
Interesting.
And I was grateful when I stopped having to run Solaris
to use it.
Because it got ported to FreeBSD
and this is like back in the day
when we were doing this web hosting company.
But yeah, anyway.
Well, that is fascinating.
There were licensing issues, right?
But I guess it's all cleared up.
Yes.
And I think the, yeah,
I'm glad that Mark Shuttleworth at Canonical
kind of took a stand on that.
And he said, like, we're going to go ahead with this, our lawyers have cleared it.
That really cleared the way for a lot of other people to say,
like, okay, if it's good enough for Ubuntu,
then we can use it.
Nice. That is great.
That is Ubuntu's real big contribution
is just getting everybody to have faith
in the whole ecosystem.
So, okay, we talked in previous episodes about DevOps.
We had a dedicated show about it.
We've interviewed some folks about DevOps.
For folks who are listening, if you missed the DevOps episode,
hit the pause button on this episode.
Go back, listen to the DevOps episode
because what we're going to talk about now
is the delta between
DevOps and MLOps. MLOps is much newer. You know, we haven't spent a whole bunch of time on it on
the show. But as folks know, AI is becoming, you know, really, really important. And so MLOps
itself is also becoming, you know, important by proxy. So what is MLOps and how does it differ from DevOps?
So if you think of three disciplines,
software engineering,
just like how you write and develop software,
how you test it, how you version it,
and you think about DevOps,
which is how you do CICD for deployment,
how you operate that,
how you do immutable infrastructure in the cloud and things like that.
And then you layer in this third discipline, which is AI ML, which is like this world of
using data to generate models that pull the patterns out of that data and are able to
make predictions, right?
ML Ops is the intersection of those three disciplines.
So it's the intersection of software, DevOps, and AI ML.
That makes sense.
And so if somebody wants to get started in the field of MLOps,
how do they not get overwhelmed with all three? Because with all three of those, you could spend a lifetime; they're all crafts of their own. I think data science and data engineering is a craft, software engineering is a craft, and, you know, AI and making the loss go down, it's a craft I've spent many, many years on.
So how do people get in that intersection?
Is it the kind of job where it's really more of a mid-senior level
or can folks kind of get into that and what would that look like?
I would say that in order to become an MLOps practitioner,
you only need a little bit of all three.
So for example, if on the software side, you learn a bit of Python and you're comfortable with Git, on the DevOps side, you get comfortable with Docker and containerizing things and
maybe deploying Docker Compose, maybe push out into Kubernetes.
And although I'd say Kubernetes is optional.
And then on the AIML side,
if you just like train a linear regression model
or something in PyTorch or like XGBoost,
I mean, that's enough to get you started.
And then you can start looking at tools like MLflow,
which allow you to keep track of model artifacts
and track runs.
That was something we were big on in my second company,
DotScience, was this idea of run tracking.
And yeah, deploy one of these models into production
using Git, Docker, PyTorch or whatever,
and now you're an MLOps engineer.
And then you can take it from there
and there's lots more sophistication
and how you scale the systems and so on.
But basically, that's what you need.
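(As a concrete starting point along those lines, here's a minimal sketch: train a tiny linear regression in PyTorch and track the run with MLflow. It assumes both packages are installed, and the data and hyperparameters are toy values.)

```python
import torch
import mlflow

# Toy data: y = 2x + 1 with a bit of noise.
x = torch.linspace(0, 1, 100).unsqueeze(1)
y = 2 * x + 1 + 0.05 * torch.randn_like(x)

model = torch.nn.Linear(1, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = torch.nn.MSELoss()

with mlflow.start_run():
    mlflow.log_param("lr", 0.1)
    for epoch in range(200):
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
    mlflow.log_metric("final_mse", loss.item())
    mlflow.pytorch.log_model(model, "model")  # saved as a run artifact you can later deploy
```

Containerizing a script like this with Docker and wiring it into a Git-based workflow is essentially the "little bit of all three" Luke is describing.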
So it might sound intimidating
because it's like three different disciplines.
But yeah, if you just take a little bit of each one, then you can get up and running. And I would shout out to the MLOps community: mlops.community is actually a community that we started at the end of DotScience, actually right when the pandemic started. Our sales pipeline dried up, and we were like, what are we going to do? So, let's start an open community around MLOps.
It's like, I'd always wanted to do that.
We bootstrapped that community from nothing with another tech community called BrizTech
in Bristol, where I live in England.
And then my colleague Dimitrios took the MLOps community and ran with it.
And we're now like over 20,000 people on Slack.
Wow.
We've got a meetup in San Francisco this Thursday, which we're hosting.
And there's meetups all over the planet.
So it's amazing how these things can evolve.
Wow, that is remarkable.
But there's tons of really great material on MLOps, on the MLOps community.
There's a good YouTube.
And yeah, I'd recommend that as a
resource.
Yeah, that is great. I love how you took such a kinesthetic approach to it. I'm right on the same page. I think, you know, I'll read the book when I have something half built and it doesn't work, and it's like, okay, now it's time to read the manual. I know some people are the opposite. My manager used to be Peter Norvig at Google, and he would read the book first. So it's like, oh, we're going to use Python? Okay, I'm going to just start reading the Python manual from page one.
Yeah, my wife is like that.
Yeah. I mean, but I think that, for me, what works is starting with some
kind of problem. And what I've learned over time is actually even better than that, and this is where I'm not there yet, but I'm trying, is actually starting with a customer.
Yes.
You know, I think starting with, you know, a person who has a problem. But before that, I think using yourself and saying, okay, what is a problem in the real world for me? Starting with that, and working backwards to, you know, what AI thing do I need to build, and then what language do I need to build the AI thing in, and go from there.
Well, if I may tell the story of the most recent company, HelixML.
Yeah, definitely, dive into that.
So, yeah, after DotScience I did consulting for a few years,
and worked with clients all over the world, which was great. I really recommend it, actually; being a one-man, or one-person, consulting company can be really amazing. But then what happened: I was watching this kind of open source AI space
kind of towards the middle to end of last year,
like August, September, 2023.
And I saw these two really interesting things happening.
The first one was that Mistral 7B came out.
And now suddenly, suddenly you could have a good quality LLM,
like a chat GPT level-ish LLM
that you could run locally on your own machine.
And the other piece was that it became possible to both run
but also fine-tune those models on consumer hardware.
So you could now have an almost ChatGPT-level model that you could fine-tune on your own private data on, like, a single 3090, the kind of gaming GPU that you might have in your home PC.
Now, let's dive into that a little bit for folks, so,
you know, a lot of people have heard the word fine-tuning, but, like, how do people actually do that? I'll say what I think it is, and you can correct me and fill in the gaps. So when they trained Mistral, they had some PyTorch or TensorFlow or MXNet, whatever it is, they had some code to train Mistral. They give people that code, and so you can basically run that code on the current model and basically continue training on your own data. Is that pretty much how it works?
Exactly. So fine-tuning is just more training. You take the weights of an existing model, and you train it more on training data that is your own private training data. It is a little bit more complicated than that, because there's a technique
that's often used called low rank adaptation, which I don't fully understand the math, but
it's some sort of matrix decomposition where you end up just having to train like a much smaller
set of weights than the whole set of weights. And that's really cool because it makes it tractable to train or
to do more training on this model
but without needing huge
memory requirements. And that's what I mean by
it became possible to fine-tune
Mistral yourself on a
single GPU. And that
depends on that kind of low-rank...
So is it holding the original
model in the CPU memory and
then the GPU memory has the low-rank version?
That's a really good question.
I mean, I think everything fits in GPU memory,
but I think it only needs to do backpropagation
on the smaller matrix.
Oh, that makes sense.
Yeah, so it's able to do that with fewer resources
and it also just takes less time.
Right, that makes sense. Yeah, because for folks who don't know, the way backpropagation works, and we talked about this a little bit in the AI episodes, is you keep this gradient matrix, so it effectively doubles the amount of memory you need, because for every matrix you need this sort of shadow.
Right, yeah.
And you might need to up the resolution too; like, maybe the model can run in 8-bit, but for training it has to be 16-bit, and so now you're talking about a 4x multiplier. And so if you can just do inference on the big model, it can sit in 8-bit, and then do the training on the smaller 32-bit or 16-bit model. That is really cool.
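(For a sense of what LoRA fine-tuning looks like in code, here's a rough sketch using the Hugging Face transformers and peft libraries; the model ID, target modules, and hyperparameters are illustrative choices, not the settings Helix uses.)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "mistralai/Mistral-7B-Instruct-v0.2"  # illustrative model id
tokenizer = AutoTokenizer.from_pretrained(base)
# Optionally quantize the base weights (e.g. 8-bit via bitsandbytes) to save memory.
model = AutoModelForCausalLM.from_pretrained(base, device_map="auto")

# LoRA: train small low-rank adapter matrices instead of all 7B weights.
config = LoraConfig(
    r=8,                  # rank of the adapter matrices
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of the total parameters

# From here you'd run a normal training loop (or the transformers Trainer)
# over your own question/answer pairs, backpropagating only through the adapters.
```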
Yeah, so I noticed these two really interesting things
happening in the world.
And I turned to my friend and co-founder Kai in Bristol
and I said, it's time to have another go.
So was he with you at DotScience?
Yeah, we kind of got the band back together.
So for Helix, yeah, we saw this opportunity, and we went in and then spent two, three months furiously hacking together the stack. I mean, we joked that it took us 10 years to know what to build in 10 days.
Yeah, that's how it goes. But yeah, we put together this stack that
allowed you to deploy these open source models like Mistral 7b and also fine tune them. And to
make that easy for people with like a nice web interface where you just drag and drop in some
PDFs or some documents. And then we did this interesting piece around the fine-tuning where we would take the source documents, chunk them down into little pieces, and then we would use an LLM to generate training data from those source documents.
Because what you want to do when you're fine-tuning is train on data that's similar to the kind of questions that a user would ask of it,
assuming it's an instruct style model,
like a question answering model.
So you can't just train it on the raw text
because then it will just be good at completing the raw text.
But what you need is you need to train it on things
that are like the questions that users are going to ask.
So we actually use another LLM
to transform the source documents into
questions and answers about the source documents. And then we use those question answer pairs to
fine tune the model. Our most popular blog post was how we got fine tuning Mistral 7b to not suck.
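(A toy sketch of that question-generation step; the chunk size, prompt wording, and the generate() helper standing in for an LLM call are all hypothetical.)

```python
def chunk(text: str, size: int = 1000) -> list:
    # Naive fixed-size chunking of the source document.
    return [text[i:i + size] for i in range(0, len(text), size)]

def make_qa_pairs(document: str, generate) -> list:
    """generate(prompt) is a placeholder for a call to whatever LLM you use."""
    pairs = []
    for piece in chunk(document):
        prompt = (
            "Write three question/answer pairs that a user might ask about the "
            "following text, one per line as 'Q: ... A: ...':\n\n" + piece
        )
        for line in generate(prompt).splitlines():
            if line.startswith("Q:") and " A:" in line:
                q, a = line.split(" A:", 1)
                pairs.append({"question": q[2:].strip(), "answer": a.strip()})
    return pairs

# The resulting pairs, not the raw text, become the fine-tuning dataset,
# so the model learns to answer questions rather than just continue the document.
```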
That blog post goes into a bit more detail. So maybe this is a dumb question, but these models are known to be trained
on enormous data sets.
And I think there's one called the pile.
It's like 10 terabytes of text or something.
Some enormous amount.
How does your fine tuning have any statistical significance
when you compare it to just this enormous data set?
Like, how are you able to move the needle on the model?
Yeah, so I think it has to do with the learning rate that you choose when you're doing the fine-tuning. And what has been found kind of empirically is that even just using a very small number of samples in fine-tuning, with, I guess, a relatively high learning rate, makes the model change quite a lot, but not too much. Because if you go too far, then it just goes haywire and starts spouting garbage. So you've got to find this middle ground.
But even with a small number of samples, you can get the model to start generating things that are similar to the fine-tuning dataset
or start generating responses in a different style
or with a different structure really quite quickly.
And then it's about finding that trade-off.
You don't want to over-bake the thing
so that it just kind of memorizes that
and forgets all of its prior learning from the pre-training.
Makes sense.
But yeah, we tuned those parameters and we found something that worked. But then the interesting thing, I guess commercially, from a business perspective, is that we went into the market in December last year with two hypotheses. The first hypothesis was: people will
care about running models on their own infrastructure.
People will care about local LLMs.
And the second hypothesis was, people will care about fine-tuning, and in particular, fine-tuning for knowledge, which was that piece that we developed.
And what we heard back from the market was a resounding yes to running models locally. So we launched on December 21, and on December 24 this company shows up on our Discord and they're like, we're based in Germany, we're really interested in self-hosting these models, we're not ML experts, but we see the opportunity. And by January 1 they'd already integrated Helix into their stack.
Wow, this is incredible.
And then they were going after big enterprise customers
in partnership with us.
And so that was really exciting.
It's like the universe was telling us there is an appetite
for running LLMs locally and building the capabilities to make that easy to do.
Now, in terms of the second or the first, well, the second hypothesis about fine tuning,
we actually got a big meh from the market. Like we put all this effort into fine tuning stuff.
Everyone just said like, oh, we just want to do RAG, like retrieval augmented generation.
Yeah, interesting.
Why do you think that is in hindsight?
I think because fine-tuning for knowledge
is slower than RAG,
and the results are not as good.
So it was kind of an experiment to see,
oh, well, maybe people will care about it.
But yeah, I mean, and so what we did was we extended the stack.
So we added RAG to it.
And the other thing we did was we added API calling,
which is where you can give the model an open API,
which is confusing because it's not open AI,
it's open API spec.
It used to be called Swagger.
A Swagger spec.
It's easier to just describe it as a Swagger spec.
So you give the model a Swagger spec,
and then you give it a query from the user,
and you basically say,
a bit like how function calling works,
you say to the model,
you've got these three APIs you can call,
and please class...
So it comes in three parts, the API calling.
The first part is classify the user's question
and then tell you whether the user's query requires an API call.
Yes or no, and if so, which one?
And then the second step is constructing the API call.
So assuming that you want to make an API call based on the user's question, like, can I rent a crane that can handle three tons in Hamburg next Thursday,
like construct an API call that will query the product catalog with the correct question. And
then the third part is taking the response from the API. So the system will execute the API call
for you. And then taking the response from that API and summarizing it back to the user.
So from the user's perspective, they're just saying, like,
hey, have you got a crane?
And the model quickly says, yes, I've got one.
I've got these three available.
They cost this much.
But what's actually happening underneath is that classification,
API call construction, and summarization steps.
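(Sketching those three steps in Python, with the llm() helper, the catalog endpoint, and the prompts as hypothetical placeholders rather than Helix's actual implementation:)

```python
import json
import urllib.request

def llm(prompt: str) -> str:
    raise NotImplementedError("placeholder for a call to whatever model you run")

def answer(user_query: str, api_spec: str) -> str:
    # Step 1: classify whether an API call is needed, and which operation.
    decision = llm(
        "Given this OpenAPI spec:\n" + api_spec + "\n"
        "Does the question below need an API call, and if so which operation? "
        'Reply as JSON like {"call": true, "operation": "searchProducts"}.\n'
        "Question: " + user_query
    )
    plan = json.loads(decision)
    if not plan["call"]:
        return llm(user_query)

    # Step 2: construct the API call (request body) from the user's question.
    request_json = llm("Build the JSON request body for operation "
                       + plan["operation"] + " that answers: " + user_query)
    req = urllib.request.Request("https://example.invalid/catalog",  # hypothetical endpoint
                                 data=request_json.encode(),
                                 headers={"Content-Type": "application/json"})
    api_response = urllib.request.urlopen(req).read().decode()

    # Step 3: summarize the raw API response back into a natural-language answer.
    return llm("User asked: " + user_query + "\nAPI returned: " + api_response +
               "\nAnswer the user in one or two sentences.")
```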
Interesting.
So that API calling feature has been really commercially successful for us
and tons of people
are interested in that.
And so I'll just shout out quickly to
the fact that we did our 1.0 launch
last week. So yeah,
if you're interested in running models locally,
we now run
on Windows,
Mac, and Linux. You can run it alongside
Ollama. We have a nice application editor,
so you can click buttons to set these things up
and add knowledge with RAG,
add API integrations, and so on.
So if people don't have, let's say, a GPU,
could they still run Helix on, let's say, an EC2 instance?
Yes, and you can even just run it on CPU
because Ollama runs on CPU as well.
Oh, okay. Got it.
However, if you have a Mac,
like one of the more recent M1, M2, M3 Macs,
Ollama also works with the GPUs on those machines.
So you actually get really quite good performance on a Mac.
On a CPU Linux or Windows machine, it will be a bit slower. But if you've
got like a gaming rig at home running Windows, then you can install WSL2, like Windows subsystem
for Linux and Docker, and then the whole thing does work. Yeah, it's pretty cool. That is awesome.
So yeah. Or you can dual boot and have a Linux machine with ZFS on it. Yes. That is another option.
That's really cool.
So how does that work?
It's like a desktop or a library.
I guess I'm trying to figure out.
So in the case of OpenAI, they give you a key
and you're calling into their server with their key
and that's how they do billing and all of that.
In your case, how does that work?
Is it, I guess, also a key? You get the key from helix.ml?
So we use a Docker Desktop-style license, so if you have less than 10 million dollars of annual revenue, then you can use us for free.
Oh, perfect.
You just download it from the website, and it's an install script, so you just run our installer
that creates a Docker Compose file
you run Docker Compose up and then you
look at localhost in the browser
and the whole thing is running there
but that means of course you can then also
deploy that if you're a company
you can deploy that on your own internal infrastructure
and that's where the DevOps piece comes in
we also support Kubernetes
so you can run the whole stack on Kubernetes
and yeah we use Kubernetes We also support Kubernetes, so you can run the whole stack on Kubernetes.
And yeah, we might use Kubernetes with a bunch of GPUs in production for our own service.
Very cool.
Yeah, this is fascinating.
Okay, so maybe we'll wrap up with like, what's one last piece of advice for, let's say someone's listening right now and they are completely infatuated with this.
They want to get into MLOps.
We talked a little bit about things to study and all that,
but in terms of maybe more of a mindset
or kind of lessons you've learned on the soft skills side,
what are some advice that you can give to folks out there
who want to get into this field?
Yeah, I mean, I would recommend
joining the MLOps community
or other communities
to have peers to talk to about it.
And then, yeah, I would say
start playing with the technology.
Like I said earlier,
if you want to get into ML Ops,
like play with Python, Git, Docker, PyTorch,
that kind of thing.
If you want to get into this newer field of LLM Ops,
like how you manage RAG and API integrations
kind of on the other side of the LLM API boundary,
if that makes sense,
then download Ollama and spin up an LLM locally.
If you like, like play with Helix
and set up like an API integration.
And then that'll set you up well
to be able to go into maybe like a job interview or something
and say like, I've got this experience
or to build a site project.
Yeah, totally. Amazing advice. I think we actually covered Docker in the Kubernetes episode; in hindsight we should have maybe had a dedicated Docker episode, but check out the Kubernetes episode if you haven't already. We talk about Minikube, and there are probably better things out there; that episode is a little dated now.
That's okay, Minikube's solid. Is this still the thing, or is there kind as well?
That's right. Yeah, I used kind recently. But yeah, get set up on your machine and check out HelixML, totally free for folks out there. If you're going to install
it on your work computer and you work for like Google or some company
that has a lot of revenue,
you probably need to check with your boss first.
Or just come talk to us on Discord.
Yeah, or just talk.
Yeah, exactly.
If you're a hobbyist,
just install it and get started and try things out.
Yeah, awesome.
Cool.
All right.
Thank you, Luke.
Thank you so much for your time.
It's been awesome.
Thanks for having me on.
Cool.
All right. So as part of this
three-parter, we just had Luke on to talk about what MLOps is,
to kind of get us started on that and explain his story. And now I'm really excited: we have Yuval
Fernbach here, who's the CTO of MLOps at JFrog, and he's going to explain to us more about the whole supply chain of AI software.
So thanks so much for coming on the show, Yuval.
Thank you. Thank you. And really nice to be here.
Cool.
Yeah.
It's a pleasure to be here.
Yeah. And just to recap, we are at the SwampUp conference,
which is, I think, hosted by JFrog.
Yeah, it's hosted by JFrog, and it's actually my first SwampUp, so I'm excited to be here.
That is awesome.
So before we dive into MLOps, why don't you tell us a little bit about your story?
What kind of led you to JFrog?
Sure.
So previously, before joining JFrog, actually only two months ago,
I was part of Qwak.
Qwak is an MLOps platform, and I was one of the co-founders
and the CTO of Qwak, which was acquired by JFrog at the end of June
and is now basically part of JFrog ML.
So before being part of Qwak and founding Qwak,
I was working for Amazon Web Services for five years.
I was part of the AWS machine learning services team,
basically working with AWS customers
on their challenges around machine learning.
So I've seen hundreds of customers,
hundreds of companies trying to solve that,
trying to understand how they can start iterating on machine learning,
building models, experimenting, and eventually impacting the business based on machine learning models
and AI applications. And that's basically one of the main reasons why, together with my other
co-founders, we decided to found Qwak and help those companies actually achieve that, to make
sure that they impact their business and build models, not just for the research or for the development, but actually
making that supply chain work and affecting their production as well.
Cool. So, you know, I think one really unique story I'd love to dive into is you're at AWS,
which is a very large organization, right?
Roughly how many people are there?
How many engineers at AWS?
Maybe tens of thousands?
I've been there for five years.
And in that time, it changed from a few thousands
to probably tens of thousands
by the time I actually founded Qwak.
So it changed quite a lot.
I'm not sure what are the numbers right now,
but by the way, it was amazing to see
how a company that was relatively small,
at least versus Amazon,
actually grew and became such an amazing business.
Yeah, definitely.
Yeah, I feel like the documentation,
the quality of service of AWS,
I haven't used other ones recently,
but I remember, this is years and years ago,
looking at all the different options
and just seeing just a higher level of quality for AWS.
Things just tended to work
and the documentation was really well done.
And there's an extraordinary amount of effort
and user study and things that go into that.
Yeah, I must say that I learned a lot during that time
because I've seen the company that basically founded the cloud
or the first company that created the cloud
that understood that their customers are not IT,
their customers are developers that need to build software.
And they build that as a self-service,
basically product-led development
before anyone talked about it.
And I think that's one of the reasons
why eventually we founded Qwak
is that we saw how that company grew.
There were so many services
and it became really, really difficult
for companies to actually utilize the different services.
And not just utilize different services,
but the challenges have grown. You cannot
just use EC2 and S3 anymore.
You need different high-level
solutions
to make your
product work.
And I believe that it's amazing what
AWS built.
But nowadays
I think that in many cases,
using those building blocks directly
and trying to build on top of that
is actually too much for many of the companies.
And actually, the amount of investment
that you need to do to actually do that,
the amount of engineering work that you need to do
to build, for example, an ML solution,
is just too much for many of the companies.
And it's really difficult to do
that. Yeah, that makes sense. I kind of experienced this firsthand where you can stand up things with
the web UI, you know, and so it's like, I want an EC2 instance, I want a database. But then at some
point, you need to programmatically do that. It's like, oh, I need a beta database, and the beta database
needs to have all the same things, and you just forget what you clicked on a month ago.
So you need this infrastructure as code, and then that's a rabbit hole, and it's hard to hire people
who have that talent, and so it becomes difficult.
Yeah, yeah. It's actually amazing to see
how companies have grown to a place where they understand that it's not single-point solutions.
You need to create platforms.
So, for example, going to the software supply chain,
understanding how a software supply chain looks like,
it's not just a CI, CD.
It's your source code and how you manage your binaries
and then how the security infrastructure looks like on top
and then why software engineering is different than data science
and how it's all connected to the same supply chain.
And eventually, this challenge is not just, you know,
spin-up instances and databases and writing code in Git.
It's way more than that.
It's having a system that can actually work in scale
and allow your company to be efficient
both in, you know, once you have like two, three, five,
10 developers, but the same efficiency
once you have thousands of developers
and you want to actually make sure
that you deliver on time
and not just deliver on time,
but you delivered a trust for the products,
products that you can trust both in terms of the features
but also in terms of the security
and make sure that
they work in production
exactly the way that they should
based on the real product
spec, for example.
That makes sense. You went from
giant company to
Qwak. I'm assuming
maybe a handful of people when it started.
Yeah, sure.
Yeah, and so...
When it started, you know, it was four people,
but of course we grew over time.
Yeah, so was there sort of a shock,
you know, the moment you went to Qwak
and you're like, wait, there's no IT department.
There's just me, you know?
Like, how did you navigate that?
Because I know a lot of people who go from big to small company.
And there is a risk.
Like, there's some people where it just doesn't work out.
And I mean, that's okay.
They go back to Facebook or whatever.
But, you know, what was that experience like for you?
You know, there's always a risk.
And I think that founding a company, being one of the
co-founders or even one of the first employees
is not for everyone and that's great
because I think that the challenges
that you have doing that are different
challenges, different experiences
than working with a giant
company like Amazon.
So I personally
love it. I think it's
an amazing experience.
I love the fact that it was on me.
Like, if I had a problem with my computer,
I needed to fix it.
I needed to talk with, you know,
the person that we bought the computer from.
I needed, so, you know,
it's part of, like, the experience
of founding a company,
being part of a small organization,
a startup.
So I really love it, but, of course, it's not for everyone.
And I think that one of the challenges is that you need to make sure
that you focus on the right things, the right challenges,
because there are many things to do,
but your main goal is always to build a successful company
and make sure that your customers actually get the benefits of your product.
The customers are actually happy with the solution that they get
and want to use you and, of course, recommend you to their friends and colleagues.
Yeah, that makes sense.
So maybe before we jump into MLOps, what was Qwak?
What did Qwak do, and what does Qwak continue to do through the acquisition?
So, as I said, Qwak was founded four years ago,
and we started from day one to focus on
this challenge called MLOps. So basically allowing companies
to start from the development to the
production without the need to manage multiple
solutions, multiple platforms, and
even utilizing or actually handshaking the code between different stakeholders during
that process.
So from day one, our main focus was production, how to make sure that models will work in
production, and of course, going left or going back from production
on how you can make sure that the supply chain actually works
from the development to the production
and smoothly as possible.
So we started with that.
And I think that one of the challenges with MLOps
is that it's not just around the models.
It's around the data as well.
So you need to both have that solution for data that
allows you to, again, have
data for production, have data for inference,
but also data for training and have the way
to manage that data, those features.
The same way as you
do that for models, and of course,
nowadays, the same way that you do it for
Gen AI applications,
LLMs, things that are
a bit different by nature, but eventually are based on the same infrastructure.
Yeah, and if you keep the same held-out questions and answers to evaluate the model, well, it's going to be pretty, pretty good. That might sound obvious, but it's actually really difficult to not accidentally leak the label. It's so difficult to keep
your evaluation set sacrosanct
and keep it from accidentally,
especially when you're doing aggregations
at so many levels,
from accidentally sort of poisoning the well.
I fully agree.
And I think that like having the proper data platform
for AI and machine learning applications
is actually crucial
because it's not just making sure that you manage the
data set right and understand what's the difference between a training data set and
evaluation data set. And I agree that's crucial because that's the only way for you
to understand the metric of a model. It's also
understanding that the data that you build the model on is actually available
during prediction, during inference.
So, and by the way, this is
I think one of the reasons that
many ML projects fail is
because their scientists are doing research
on the data. They get the data
from the data warehouse.
They train the model, but eventually
that data doesn't exist during
prediction. It doesn't exist during inference.
And many projects fail because they created an amazing model,
but they don't have this data when the application needs it,
when the application calls the model.
The application doesn't know, I don't know,
the history about that specific persona during that time
and cannot get that because maybe it wasn't calculated yet.
And one of the things that we started with in Qwak
is to create a feature platform
that allowed to create both an offline feature store
and an online feature store
that are basically based on the same calculation.
The online feature store lets you get a low-latency,
current value of your features,
while the offline feature store lets you get
all that training data, calculated in exactly the same way.
So you know that there is no basically drift
between your offline or training data
and your inference data.
It's exactly the same.
It was calculated the same.
And the data that you are trained on
is actually available during inference.
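To make the "write the feature logic once" idea concrete, here is a hypothetical sketch in plain Python, not the actual Qwak or JFrog ML API: a single transformation function feeds both the offline training set and the online, low-latency lookup, so the two cannot drift apart.

```python
# Hypothetical sketch of "write the feature logic once": the same function
# produces the offline training rows and the online value served at inference.
def compute_features(raw: dict) -> dict:
    # The single place where the feature logic lives.
    orders = raw["recent_orders"]
    return {
        "order_count": len(orders),
        "avg_order_value": sum(orders) / len(orders) if orders else 0.0,
    }

def build_offline_training_set(history: list[dict]) -> list[dict]:
    # Offline path: batch over historical records to build training data.
    return [compute_features(rec) | {"label": rec["label"]} for rec in history]

def online_features(raw: dict) -> dict:
    # Online path: compute the same features for a single live request,
    # so training data and inference data stay consistent by construction.
    return compute_features(raw)
```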
Yeah, I mean, the story I always tell folks was
there was this product called Google+.
It died along with a whole bunch of other Google products,
but we were doing the friend suggest
and a bunch of ML stuff for Google+, a long time ago.
And there was an issue where the day-of-week feature was zero-based at training and one-based
at serving, and so Sunday didn't exist, and Saturday, I guess, also didn't exist. I don't remember exactly
what happened with Saturday. I think it was a one-hot encoding. So Sunday and Saturday both
didn't exist, and it caused all kinds of chaos, and it was extremely difficult to find, because you just
would give bad suggestions on the weekend. And there's no compiler error, there's no
easy way to find it; they're written in two different languages, you know, the training and the serving
system. So yeah, I think there was a product, Feast, which was a very early attempt at this.
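A tiny illustration of that kind of skew, using only Python's standard library (the original Google+ code isn't public, so this is purely a made-up example): training uses a zero-based day index while serving uses a one-based one, and one day silently falls out of the one-hot vector.

```python
# Illustrative only: a zero-based day index at training time versus a
# one-based index at serving time silently shifts the one-hot encoding.
from datetime import date

def one_hot_day(index: int) -> list[int]:
    vec = [0] * 7
    if 0 <= index < 7:          # out-of-range indexes are silently dropped
        vec[index] = 1
    return vec

d = date(2024, 9, 22)           # a Sunday
training_feature = one_hot_day(d.weekday())      # weekday(): Mon=0 .. Sun=6
serving_feature = one_hot_day(d.isoweekday())    # isoweekday(): Mon=1 .. Sun=7

print(training_feature)  # [0, 0, 0, 0, 0, 0, 1]
print(serving_feature)   # [0, 0, 0, 0, 0, 0, 0]  -> Sunday vanishes at serving time
```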
But yeah, how has that evolved?
Is it to the point now where you can write it once
and it works on both?
Again, I believe that the feature storage
is a crucial part of an ML platform.
And Feast, by the way, is an open source that still exists.
But I think that one of the main challenges
with such solutions like Feast,
like the cloud vendor solutions
for feature storage,
that it's not just about the storage.
It's about the data pipeline as well.
You need to have a data pipeline
that eventually can connect both the offline
and the online on a single process.
So you write your code once in the same language,
and that code will be used to create the data for both. And again, it's really connected to what you just said
because in the past, many models were, for example, written in Python, but then
the production implementation was, let's say, in C or Java or
whatever, and you needed to recreate the features in a different language, and
there is no actual way to compare the two and understand that the features
were created the same.
So many companies had the same challenge
that you just talked about of having that,
you know, that difference between the production
and the training.
And again, it's the same challenge with a feature store
that are only the storage level, like Feast and like others,
because you need to have that same data processing layer that will be used to create both features.
And I must say that part of it is that Python now actually
grew enough to be good enough for production, because
in many cases, it's not really Python production. You build a model in Python, but then
eventually it's a C object that's running in production,
although the code is in Python.
So you can actually use the same code
and get pretty good performance.
So that's, of course, part of the technology enhancement
that happened during those years.
Oh, very cool.
So a lot of projects start with Jupyter Notebook,
which is kind of itself kind of this extension
or descendant of Mathematica,
right? Mathematica notebook, which is this beautiful, like, you know,
interactive thing.
And it just becomes really hard then to
productionize that, because it's not .py files, it's all these cells.
And so how do you recommend for folks
to go from that notebook to something that can run at scale?
So it's a great question because I needed to talk about supply chain. And that question eventually
pivoted me back to supply chain. And I love notebooks. I think that notebooks are great
for some things. For example, if I want to visualize data, if I want to have an interactive
environment, notebooks are amazing for that.
By the way, IDs are amazing for other things.
For example, debugging code, IDs are way better for that than notebooks.
There are advantages for each one of them,
but both are not
built for production.
Eventually, when you want to build for production,
you need to have a proper CI.
You need to have a proper supply chain
that moves that code to be an artifact
and from being an artifact, eventually deploy that.
And of course, making sure that you have a lineage
between the deployment, the artifact, the source code,
all that should be connected in a way.
So what I've seen is that notebooks are great, are amazing,
but eventually once companies
understand or graduate to the place
that they want to be to production,
they need to add more structure
to the way they build the models.
And for example, one of the things
that we've done in Quark and now with J4ML
is to have an opinionated way
of how to build models in terms of structure
that allows every new data scientist to look
at a model that they've never looked at before
and understand, okay, this is the training
of the model. This is the code for the inference.
This is the code that fetches the data
from the feature store. Like, immediately
understand the structure of a model,
immediately understand how that model was built
and be able to help
or train
a model that they haven't seen before
just by looking at the code and looking at
that structure. So I
believe that notebook,
it's part of the research,
but in some phase you need to move from
the notebook to have that
code managed, for example,
in a source version control
solution, so let's say in a Git
solution, in a way that can actually be automated.
So you need to have some kind of structure
of what the training job looks like,
what the prediction looks like,
what the dependencies are,
well, the dependencies are which packages
you need to run that model,
because packages, especially in Python,
a new version can come out and break everything
so you need to actually freeze those
dependencies to make sure that
you have that model actually reproducible
and not just
train once on a notebook on someone's computer
and will never work again
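As a rough sketch of the kind of opinionated structure Yuval describes (hypothetical, not the actual Qwak or JFrog ML SDK), every model in a repository might expose the same hooks, so anyone can find the training, inference, and feature-fetching code at a glance:

```python
# Hypothetical example of an opinionated model layout: every model in the
# repository implements the same hooks, so a new data scientist can find the
# training code, the inference code, and the feature-fetching code immediately.
class ChurnModel:
    def fetch_features(self, entity_id: str) -> dict:
        # Where the model pulls its inputs from the feature store.
        raise NotImplementedError

    def train(self, training_data) -> None:
        # Where the training logic lives; called by the automated build.
        raise NotImplementedError

    def predict(self, features: dict) -> float:
        # Where the inference logic lives; called by the serving layer.
        raise NotImplementedError

# Dependencies are pinned alongside the code (for example in requirements.txt)
# so a new package release can't silently break a rebuild:
#   scikit-learn==1.5.1
#   pandas==2.2.2
```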
Yeah, definitely.
And you have to be able to go back as well.
Like, we did these really hacky things,
but there's got to be a principled way. Like, once you have it in source control and it's
in production, you're always going to want to do more data science, because you are going to change
the product. The product changes without you, because new customers adopt and some
churn and everything.
And so being able to run a Jupyter notebook on your production code and then make changes,
that whole integration is extremely complicated
and so valuable.
Yeah, yeah.
Models degrade over time.
That's, I think, just the way a model behaves,
because data changes over time.
So part of having
the structure, so by the way, one of
the things, for example, that we do with
JFrog ML now is that
whenever I build a model, I
automatically basically
copy or freeze the
source code that I used to build that model,
I freeze the dependencies that were used to build
that model, I create the model artifact,
the trained model, but I can
reproduce that model at any point in time.
I know which configuration I used,
I know what was the source code I used, I know
what is the data set I used, and I
think this is a practice that companies
must adopt. Companies must
make sure that
they have the ability to reproduce a model
because they will need to train that model
again. They will need to
fine-tune it. They will need to
understand, even without
model monitoring, even without
really looking at the data and
seeing the drift, it probably
happens. And if it happens,
it means that they will need to
retrain that, and they will need to have the structure
that allow them to retrain that
even if the data scientist that built that
is not available anymore,
is not in the company
or just your team grew
and you have more projects.
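As one hedged illustration of that kind of snapshot, the sketch below (plain Python standard library, not a specific vendor API, and with made-up file and field names) records the git commit, the installed package versions, a hash of the training data, and the config next to the trained artifact, so the build can be reproduced later:

```python
# Sketch: capture enough metadata at build time to reproduce a model later.
# Plain Python standard library; the file layout and field names are made up.
import hashlib
import json
import subprocess
from importlib import metadata

def snapshot_build(config: dict, dataset_path: str, out_path: str = "model_lineage.json") -> None:
    # Which source code produced this model.
    commit = subprocess.run(
        ["git", "rev-parse", "HEAD"], capture_output=True, text=True, check=True
    ).stdout.strip()
    # Which data it was trained on.
    with open(dataset_path, "rb") as f:
        dataset_hash = hashlib.sha256(f.read()).hexdigest()
    # Which dependency versions were installed at build time.
    packages = {dist.metadata["Name"]: dist.version for dist in metadata.distributions()}
    lineage = {
        "git_commit": commit,
        "dataset_sha256": dataset_hash,
        "config": config,
        "dependencies": packages,
    }
    with open(out_path, "w") as f:
        json.dump(lineage, f, indent=2)
```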
Yeah, that makes sense.
So on episode 158,
we actually had Bill Manning from JFrog
come on the show and talk about software supply chains.
And so folks should definitely listen to that episode if you haven't already.
So what's the delta between that and AI supply chains?
How does AI kind of make that problem different?
So, first of all, the process itself looks different.
Like, an ML
project starts from research,
it starts from experiments, you need
to track those experiments. Those are phases
that do not exist with
software. Usually, when you build
software, it happens because of two reasons.
First, you have a new feature, and
second, you have a bug. So, those are the reasons
why you start working on a software project.
But with ML, it's also because your data has changed.
It's also because you have more features
and you want to make that model better.
So there are many reasons why to work on an ML project,
and it always starts with some kind of research experiment.
And those phases look entirely different
in terms of ops, of MLOps.
Same for the deployed model
because the monitoring that you do for software
is only the infrastructure monitoring,
maybe logs analysis,
maybe understanding if there are bugs
and what are the latencies of different processes
and those kinds of things.
But with models, you actually need to monitor
the data as well.
You need to monitor the model and make sure that this model
actually gives the right impact on the business.
Make sure that this model doesn't degrade over time.
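A minimal, hand-rolled sketch of what "monitor the data" can mean in practice (real systems use more robust statistics, and the thresholds here are arbitrary examples): compare a feature's recent production values against its training distribution and flag a shift.

```python
# Minimal drift check: compare the mean of a feature in production against its
# training distribution. The 3-sigma threshold is an arbitrary example.
from statistics import mean, pstdev

def feature_drifted(training_values: list[float], live_values: list[float],
                    z_threshold: float = 3.0) -> bool:
    mu, sigma = mean(training_values), pstdev(training_values)
    if sigma == 0:
        return mean(live_values) != mu
    # Flag drift when the live mean sits far outside the training distribution.
    return abs(mean(live_values) - mu) / sigma > z_threshold

# Example: alert (or trigger retraining) when the check fires.
if feature_drifted([10.0, 12.0, 11.5, 9.8], [25.0, 27.1, 26.4]):
    print("Feature drift detected: consider retraining the model.")
```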
So the process itself looks a bit different,
but eventually what happens in the software supply chain process
is that you have code, then you create artifacts,
and then you deploy those artifacts.
And those parts are the same with AI, with ML.
Maybe some configurations are different.
Maybe, for example, nowadays with AI,
you have also prompts as part of your artifact.
Maybe you have an external model like, I don't know,
ChatGPT or something like that, and there is a new version.
And it's not something that you manage,
but it's still an API that you call,
it's still an application that you need to manage.
So eventually it's the same artifacts behind the scenes,
same idea, but different processes.
And even the security posture,
even the understanding if a model is trusted enough,
is trustworthy, that's pretty much the same
as any other kind of software.
So if you build software and you scan that software,
and you scan the dependencies,
and you scan your runtime environment,
and you want to make sure that the same software
that you build doesn't have vulnerabilities at runtime,
and you scan it from the source code
and up to the runtime
and understand what your security posture looks like,
it should look the same with machine learning
because eventually those are packages, those are artifacts and third-party APIs that need to be
secured, and you need to make sure that you manage that lifecycle in a way that you trust.
Yeah, that makes sense. This was awesome. Thank you so much for your time. This was really great. One
last thing: for students who want to get
into this field, what's some advice you could
give them?
So, first of all,
it's an amazing field, so do it.
It's great advice. But
second is that I think
nowadays with all
the Gen AI models and LMs,
the entrance point
is actually easier than
in the past. Like in the past, you needed
to understand the statistics around the data
and you needed to add this data to
start training models. And there was quite
a lot of understanding, quite a lot of
work you needed to do to actually have a model
ready. And nowadays you can
start understanding that
just by using, for example, ChatGPT.
I think that as an example,
my father only now understands what I do
because he understands ChatGPT.
He saw what it looks like and what the effects of models are.
So I believe that this is a great starting point
for every student that wants to start learning
about machine learning, about AI: play with those tools.
Understand how a specific prompt change
changes the way the model reacts,
and eventually, those are models
that are maybe more complicated
than the models that we've done five years ago
that we use for specific point solutions.
But eventually, those are models
that use the same technologies behind the scenes.
And understanding how those models behave
will give you quite a lot of
knowledge about
what the development looks like
and how you can utilize machine learning
for every kind of task.
Very cool. Thank you so much, Yuval.
I really appreciate your time.
Thank you very much.
All right, everyone.
So we have been talking in this episode
about MLOps. And I'm really lucky that we have Stephen Chin on the phone, who's done a lot of really interesting work with this. And we're going to focus on LLMs and GraphRAG as kind of two case studies of getting ML and AI implemented, and then what that whole process is like,
and then getting all of that into the hands of customers. So welcome, Stephen. Thanks for coming
on the show. No, very glad to be here and excited that I'm speaking at SwampUp actually on the same
topic. So I'm going to be talking a bit about knowledge graphs plus LLMs, and in
particular, how you can apply them to your DevOps pipeline. Cool. So before we jump into all of that,
why don't you give us a quick background? What was your path that kind of led you to where you are
now? And what are you doing right now? Yeah, so maybe we'll start with where I am and go backwards.
So I'm VP of Developer Relations at Neo4j.
Neo4j is a graph database
company, but also does a lot
with
generative AI and
machine learning and
is building out architectures
for GraphRAG that a lot of enterprises
are using. Before
this, I was working at JFrog
for basically the same role, VP of developer relations.
And I did this at Oracle, slightly different role, but basically the same role.
I was running the developer marketing team. And the way I kind of got into developer relations
in general, because I was also a developer advocate for a bunch of years, is I made the
mistake, okay, of writing a book.
Oh, okay.
Now, I'm not discouraging anybody from writing books.
It's a wonderful way to increase your reputation, to teach and explain things which you're passionate about to a larger audience.
And when I wrote the book, I was fortunate enough to have some great co-authors
who I collaborated with on the title. It's a lot of work. You're basically giving away
six months to a year of your life, where you have no weekends, no evenings, especially if you have a
day job that you also have to keep going. And then when you finally finish the book, the publisher
says, oh, that's a wonderful book, we've released it on, you know, Amazon, bookstores, etc., but
can you help us promote it?
Oh, you become the promotion arm for the book.
Like, if you're not submitting to conferences and talking on a topic,
if you're not on social media,
if you're not like out there being a vocal advocate for the technologies you
care about.
And, you know, obviously,
pointing out that there might be a good book that folks can read for more
information,
then you're not doing your job.
Now that,
that first book I wrote,
Oh my God, it must have been
I'm dating myself, like 15 years ago.
So you knew a technology, and a publisher approached
you and said, you know, Stephen, you are really gifted in this technology, why don't you write
a book on it? That's kind of how that went down?
Yeah. And actually, when I was in college, I assumed it was the opposite.
I assumed, like, you wrote a book and you went to publishers and you said, look at this great
book that I wrote, would you want to put this onto your
brand and publish it? But no, publishers, especially tech publishers, work exactly the
opposite. They have a roadmap: these are the
titles we want to publish, these are the topics that matter to us. Occasionally you can influence that,
but not until you're an established author and you have really good relationships with the editorial
group. And they then say, okay, for these titles we want
to author, who would be a good
candidate? Like, who has the expertise, who can authoritatively write a book and help us
market it later on?
So you had the footprint then that they were able to find you. So you had
done some kind of promotion to get to that point?
Yeah. So, I mean, it was for JavaFX technology,
and I got in really early, like beta days.
I was already building applications with it,
had a lot of good connections in the JavaFX community,
and I got invited with somebody else who'd written another book
with the same publisher, and now this was the first real JavaFX book.
Now, fast forward 15 years later, and we got asked to update basically the same title
for Java 21 and 23
with all the latest features and capabilities.
So it's become, for this small niche market, JavaFX,
which I don't actually do professionally anymore,
but I still keep up on it,
and I'm very close with the community.
This has basically become the authoritative guide
for client developers in Java.
Wow. I'm totally dating myself here,
but the last time I professionally wrote Java,
it was this thing called Google Web Toolkit.
Oh, yeah.
And you would create JavaScript with Java.
This was one of these, in hindsight, in my opinion, really bad ideas.
I mean, maybe not.
And I was writing, I think, in Java 6 or something,
and it was compiled to JavaScript.
It's kind of a really strange...
I mean, that was my first introduction to Java.
Yeah, GWT was kind of an interesting approach.
I think we've come a long
way there with JavaScript frameworks
which do all the
heavy lifting. So basically you can
have a very feature-rich
application without a heavyweight
backend. So I think that's
become the modern development framework.
GWT was an attempt to
do that,
have the heavy front ends,
but then have you write it entirely in Java and then deploy all that JavaScript magically
and have the web application just appear.
It's technically difficult to do that perfectly,
and therefore it couldn't keep up
with modern JavaScript frameworks.
Yeah, and then on the other end, it really got
squeezed by Node because the advantage
of GWT was, for example,
you could write validators
in Java and validate the client
side and the server side.
But now you can do that with JavaScript.
And again, for reference,
the Google team
accepted and endorsed the use of both "G-W-T" and "gwit" for the
same acronym, so they would use both.
Ah, yeah, they never agreed upon what it was supposed to be. That makes sense.
Cool. So, okay, fast-forwarding all the way to current time. Before we
dive into LLMs, let's talk about this a little bit. So what is a developer relations advocate, and someone who leads a team of advocates?
What is that job?
Because we're going to dive into a lot of technical content here.
And when people think of public relations, they think of speeches and writing speeches
for candidates, these kind of things.
But developer
relations, you know, you're actually building a lot. So why don't you kind of give people a little
bit of scaffold there? Yeah, okay. So for those folks who don't know the job or role of a developer
advocate: basically, a developer advocate is somebody who is advocating for developers, both in the product,
so if you work for a company, it's like saying,
hey, we have these users
and they're trying to do things with their product,
like let's actually help them out and build features
which are going to be beneficial to them.
But then also educating developers
about new ways of doing things, new techniques,
and kind of upskilling, helping them upskill.
And the way I would describe a developer advocate,
the most simple description is a developer advocate
is a geek with social skills.
So you have to be technical, you have to be able to write code,
be up with the latest technologies and trends,
and constantly learning, like picking up new
technologies. But you also need to be able to present, able to do interviews like this, to
kind of be very fluent. So strong English skills, strong presentation skills, those are all
really important. And you can actually come up to be a developer advocate
from either a highly technical role,
like a lot of developer advocates start their career
as programmers and architects,
and they get tired of just building things.
They want to move up and actually be the change agent
for industry and for folks who are adopting technology.
So it's a great career path.
Like, what do you
do when you're tired of just being the most technical person at
your company?
On the other hand,
you can also become a developer advocate by having great social skills and
great kind of English language presentation skills.
And one of the folks who I just hired,
their name is Naya Macklin,
also a speaker at SwampUp,
got accepted before even joining my company.
And they came up through a background of journalism
and politics, kind of being involved in political campaigns,
being involved in kind of all of that writing,
outreach, campaigning, and wanted to move
to a more technical role, to something which was more technical. So they went
to a boot camp, kind of learned Python, JavaScript, all the basic skills, you know, all
the way up to, you know, building web applications, like deploying technologies. They were developer
advocate at Couchbase,
now a developer advocate at Neo4j,
and very early in their career.
I would say that to be a developer advocate is not something that you have to be old and gray
to be a good developer advocate.
Some of the best developer advocates,
actually most of the best developer advocates,
mirror the audience.
So if you're talking to a technical audience, you want to be
able to just talk to them as peers. If you're going to a university and speaking to them,
you want to be able to talk as a recent graduate. Kind of like being very close to the audience,
I think, makes it more credible and makes you more effective in the role.
Yeah, that makes sense. Very cool. Yeah, so folks out there, this is one of many
really interesting professions that we're going to learn about here in these sessions.
So if you have any questions about this or any other professions, you know, don't hesitate to shoot emails to us, post in the Discord.
Feel free to keep that conversation going. There's a lot going on in the Discord, and feel free to join and be a part of that. Okay, so let's dive into LLMs. So LLM
stands for large language model. I think now they're starting to call them foundation models,
because you have vision and all these other modalities. But, you know, what is a large language model? How would you describe that to
somebody?
Yes, I think that this, in the past, has kind of required a technical explanation
of, like, you know, how you train the models, and then how you can actually
build the models to learn by feeding them extremely large data sets and then having them kind of iteratively complete
the next word vector, or kind of the next idea in a chain of commands. Now, actually, it's much
easier to explain this now, because everyone's using it.
That's true, yeah.
So if you're using ChatGPT, if you're using Copilot, if you're using any of these tools which kind of give you
a language interface to talk to and interact with your code,
with the web, with an enterprise dataset,
then you are using an LLM behind the scenes.
And you can see that it's a very effective tool for a lot of tasks,
which for humans are time-intensive and require a lot of knowledge entry,
which require a lot of knowledge gaining.
So it's great for summarizing information,
great for writing emails.
Excellent.
Don't recommend this at home
for writing research papers.
Not research papers,
like class papers.
Things which the professors say,
oh, research this subject
and then write a two or three page paper.
It's amazing at that.
But of course,
completely against most schools' rules.
That's right.
Don't violate your school policy.
It's really good at portmanteaus.
You can say, give me a portmanteau of these two concepts,
and you will come up with some incredibly brilliant names of companies.
It's really good at that.
Now, I would say what LLMs are poor at, or don't have a strong ability for,
is, in general, they don't reason like we do. So if the source material, if the context,
is rich enough to kind of piece together and give clues on what the answer is, it can both
kind of pull from that body of knowledge, but then it can
also synthesize information, which maybe is not clear from the very large, like they feed these
systems with hundreds of billions of words and like huge unstructured document sets. And so it
does kind of these amazing leaps, which seem like
reasoning, but they're not actually reasoning as humans think and reason. And also, a related
thing which they're poor at is math. So in general, you know, they work as
simple calculators and can answer basic math questions, because, again,
that all exists in the source material, or they can specifically train the models and add in things
for common questions which get asked. So they do a good job of calculating and returning a result
via going to an agent or some other system which is specialized, but in general they're not designed
for math. And if you give them a complex problem, like, I was playing around with one of
these kind of online
LLM games where you're supposed to trick and
hack the LLM.
Oh, that's a thing?
And it was kind of fun
like they set it up as
the LLM thought it was
a wizard,
and like it was protecting
some secrets, and you were supposed
to convince
it to give you the secrets. And basically they had multiple levels of difficulty, where they add
additional prompts which would prevent you from doing attacks which will allow you to circumvent
the LLM. But basically the way to hack it is you did a combination of reasoning
and hard math problems.
So you'd ask it to,
for example, do like a ROT13
algorithm or something complex,
moderately complex,
and
basically you ask it to give you an answer,
apply a math algorithm to it,
and then the system which is checking the answer
to make sure it's not revealing secrets now can't check the answer properly, because you've encrypted it.
Ah, I see.
But what you learn, when you keep giving it harder and harder and harder math
challenges, is it actually falls apart. Like, for example, with rotation ciphers, it'll consistently
get the first few letters right, and it gets worse and worse and worse as it goes
along because it gets lazy and it doesn't
really care about the answer.
I noticed if you tell it it's wrong,
it'll get better. It's almost kind of
like reinforcing
an animal or something. It's like,
oh, I'm so sorry, you're right, and it'll get a little
bit better.
And then the ultimate answer was to
actually give it small snippets of code
to generate or compile.
Because again, the web is full of so much code.
And those systems are also tuned for doing a certain amount of code challenges and problems.
So the hardest reasoning problem you can give an LLM actually is to do coding.
So you combine that with, like, asking for some information, or like trying to hide some information,
and you can actually trick it to do quite amazing things.
Oh, interesting. Okay. So, you know, I think when people
think of LLMs, a lot of people have used ChatGPT, Perplexity, these other things, and that's a pretty
straightforward, from like a product standpoint, use case, where you go to chatgpt.com
and it's literally just a blank screen
with a text box.
You type what you want.
And so it's the simplest product you can make.
And it's because their technology is really impressive
and they just want to focus on that.
What are some other places,
you know, maybe less obvious
where LLMs are being used in the real world?
Yeah, so I was mentioning things which LLMs aren't good at, and an additional one is, since they're
trained on mostly public data, they really know nothing about enterprise systems.
Right, nothing about, ask it about your email, things in your email, for example.
Yeah, yeah. And similarly, if you're
in a company which does, you know, supply chain management of parts and things for
aircraft, you can't ask it about, like, what's the part number, or what parts do you
need for a certain maintenance operation on a plane, because that's just not something that's
generally available to LLMs to be trained on.
So one of the techniques you can do with LLMs is you can do something called retrieval augmented generation,
where you feed in a body of this additional knowledge, additional information from,
could be from a database, from a bunch of documents, from some other source which is not public.
You embed it in a vector database.
And then when you query the LLM,
you first query the vector database, the vector store.
You ask it for information which relates to this. You pass that on to the LLM as context.
And now the LLM,
which is very, very good
at answering abstract questions,
now becomes an expert
in this new data set
and this new knowledge set.
But with kind of the same limitations
where it's not good at reasoning,
it's not good at a whole bunch of things.
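Here is a hedged, minimal sketch of that retrieval-augmented generation flow. The embed and ask_llm functions are stand-ins for whatever embedding model and LLM endpoint you actually use; only the overall shape (embed, retrieve the closest documents, stuff them into the prompt) is the point.

```python
# Minimal RAG shape: embed private documents, retrieve the closest ones for a
# question, and pass them to the LLM as context. `embed` and `ask_llm` are
# placeholders for a real embedding model and a real LLM call.
from math import sqrt

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(question: str, docs: list[str], embed, top_k: int = 3) -> list[str]:
    # Rank documents by similarity to the question and keep the best matches.
    q_vec = embed(question)
    scored = sorted(docs, key=lambda d: cosine(embed(d), q_vec), reverse=True)
    return scored[:top_k]

def answer_with_rag(question: str, docs: list[str], embed, ask_llm) -> str:
    # Build the prompt from the retrieved context and let the LLM answer.
    context = "\n\n".join(retrieve(question, docs, embed))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return ask_llm(prompt)
```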
And this produces another problem,
which is,
So getting back to the aircraft
example, let's say I'm a technician, I need to perform maintenance on the fuselage
of an aircraft, and I want to know, for this particular aircraft model, what part
number do I need to do this repair which I'm trying to execute. Now, if it's not exactly in the source
material, if
maybe the relationship
between the maintenance operations
and the parts isn't clearly spelled out,
the LLM won't say,
oh, I don't have that information.
You should maybe talk to somebody
who's done this before. What it's going to say is
it's going to say, oh,
well, if you're doing this sort of operation,
there's a number of different things which you could use for this.
And I'll give you a list of different parts which are totally plausible.
And I'll tell you any of these would work.
Just go ahead and...
Oh, man.
I think you're explaining a lot about like Boeing doors falling off airplanes.
I hope they're not using LLMs for their maintenance on Boeing planes.
But you can see that
it's for enterprise use.
You can actually get pretty far
with an LLM, but
it has problems with
what we're currently referring to as
hallucinations.
Basically, the model will
extrapolate information, which
may or may not be true.
And depending upon how good the source information is,
how good the encoding is,
you can get to some customer support systems,
they get to maybe 60%, 70% accuracy
or accepted answers
when a customer support representative looks at this
and says, is this the right answer to give a customer?
But the question is, when you get to that accuracy,
let's say you get to 80% or 90% accuracy,
is that good enough for aircraft maintenance,
for supply chain management, for fraud detection?
There's a whole bunch of critical use cases.
Is that good enough? Is that accuracy good enough? And then the second question is, when it's wrong, how do I know it's wrong? How do I
check the system? How do I explain how it got the results? So this is actually the
subject of my talk, which is, there's another way which you can encode enterprise information, which has been in use for a while.
It's called knowledge graphs or property graphs, basically the same thing.
And a lot of expert systems will use knowledge graphs as the system of truth for capturing
data, kind of building expert systems.
But the typical problem with those is to build an expert system
on top of a knowledge graph, you need an interface.
So you need to build an application,
you have to have a bunch of queries and drop-down menus
and things for people to find information.
So wouldn't it be nice if you could ask the LLM,
but then have it give you information from a knowledge graph
instead of just randomly pulling
information out of a vector store? And so this technique for pairing knowledge graphs and
LLMs together is called GraphRAG. It's a really effective way of improving the accuracy of your
results. There was a study recently done, I believe by Gartner, where they
showed a 54% increase in accuracy just by switching to knowledge graphs versus, like, traditional vector
stores.
Oh, okay.
It makes it easier to explain the results, because part of the knowledge graph
gets passed into the LLM's context, and knowledge graphs, unlike vector databases, you
can actually reason about them.
You can see what are the nodes, what are the relationships.
You can actually start to understand why the LLM was giving a correct or incorrect answer, and then go back to the source data and fix it.
So is the LLM translating your English request into a knowledge graph query?
Is that how it's working?
Or is it just more fundamentally integrated with the knowledge graph?
Yeah, so it's doing
a couple things.
So when
you take the source material and you put it
inside of a graph
database that also supports vector search,
it's
doing a standard vector encoding
using word vectors of the
embeddings. It's also
using an LLM to create a
graph. And you can hand-make a knowledge graph, you can also tweak the
resulting knowledge graph, but a really quick way of doing this is to actually use an LLM to
generate the knowledge graph off the source material as well. And once you have a knowledge graph plus the vector database and linkages between
them, now you can
basically feed
the question into the vector store,
get back some
embeddings,
see what nodes they're
associated with and pull the related
nodes, and then feed that
all as context into the LLM.
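A rough sketch of that retrieval flow, with the caveat that the Cypher, the node labels, and the vector-search step below are illustrative assumptions rather than Neo4j's actual GraphRAG tooling: find seed nodes for the question, expand their neighbors in the graph, and hand both to the LLM as context.

```python
# Illustrative GraphRAG retrieval: seed nodes come from a vector search
# (represented here as a plain function), then the graph is expanded around
# them and the resulting facts become LLM context. The property names and
# Cypher are hypothetical, not a specific product schema.
from neo4j import GraphDatabase

def graph_context(driver, seed_ids: list[str], limit: int = 50) -> list[str]:
    query = (
        "MATCH (n) WHERE n.id IN $ids "
        "MATCH (n)-[r]-(m) "
        "RETURN n.name AS source, type(r) AS rel, m.name AS target LIMIT $limit"
    )
    with driver.session() as session:
        records = session.run(query, ids=seed_ids, limit=limit)
        return [f"{rec['source']} -{rec['rel']}-> {rec['target']}" for rec in records]

def answer_with_graph_rag(question, vector_search, driver, ask_llm) -> str:
    seed_ids = vector_search(question)            # stand-in for the embedding lookup
    facts = "\n".join(graph_context(driver, seed_ids))
    return ask_llm(f"Context:\n{facts}\n\nQuestion: {question}")

# driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
```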
And so basically it's a better
vector search, because it's vector search which includes knowledge graphs and real data coming
from the LLM.
Yeah, that is super cool. Okay, so you've had folks deploy GraphRAG in, you know,
different industries. Tell us maybe a good story,
bad story kind of thing.
What's something
maybe hilarious that came out of it?
And what's also something where at scale
you had an aha moment or a eureka
moment?
I think a lot of our customers
are using GraphRAG and
graph databases specifically to solve the problem
of getting higher accuracy on RAG.
Some of our customers are using it for,
in particular, customer service systems.
That's one of the common scenarios.
A second one in general is recommendation engines.
Oh, that makes sense.
So trying to give back better
search results, better recommendations
to end users.
And actually, we had an interesting
use case
where
this is a different
use of LLMs. It still
generally falls under GraphRAG, but it's
more of a research system than it is
a query system
where one of our customers maintains oil fields and kind of that oil infrastructure,
and, you know, making sure that the supply chain is uninterrupted is important for them,
but there's so many data points in terms of
conditions and maintenance issues and weather conditions that it becomes very hard to even
understand root cause analysis on why things are getting delayed or slowed down. So what they did
is they fed huge amounts of data into an LLM, using a graph representation of this,
using a graph store,
and then had the LLM start to reason
and give some potential answers
about where the issues were with outages
or supply chain issues or things in there,
and got some interesting insights.
Now, it's a humongous model, very slow to load up the massive quantities of data and
things which they did.
But from a research standpoint, they got some really valuable insights, which would have
required a humongous amount of human manpower and research to actually go through the data
and build those insights.
So I think there's a variety of different use cases for it.
And I'd say the biggest challenge,
and those are all successful production-ready systems
I talked about, but the biggest challenge
and the elephant in the room,
related to LLMs and, in general,
Gen AI architectures,
is it's all startups and very little is in production.
So when you actually get down to it and you're like,
okay, well, how many people are using LangChain or Ollama
or these models in production?
And they're like, oh, we have a really promising system
and we're getting the accuracy up and we're
almost there. But I think a lot
of folks are almost there and
are kind of looking for
the technology to mature.
And also
for the cost to be reasonable
as well. I think that
it doesn't...
Doing research and doing development
in LLMs, it makes sense, because
the technology is not that expensive to prototype with. When you're doing it at scale, with large volumes of
data, there's a huge amount of resources and processing and GPUs which you need to then execute
the building of knowledge graphs, or in general the building of RAG and search on RAG,
that can be quite expensive.
So I think also enterprise use cases are a great place
where the benefit-cost trade-off makes a lot of sense.
If you're just doing like this for a general consumer system,
you don't want everyone just firing off queries randomly
unless you have a huge amount of capital like OpenAI.
But if you're in a corporate system where it's customers doing queries
and they're solving business problems, then it makes a lot of sense
that I can pay the token fees and everything,
but then it's saving me and my customers time and money
because we're able to get answers faster and easier
than we would if we were going
through a human workforce.
That makes sense. And so just kind of one last question on the theme
of MLOps: if someone wants to go into the MLOps kind of profession,
what is your best advice for them? Let's say they're
just finishing high school, and they have a choice between
getting a four-year degree or going into a boot camp. Should they try and get a degree really
specific on, you know, IT ops, or just a general computer science or even a general math
degree? How do you feel folks should navigate that?
Yeah, no, I think there are
a lot of great options for technology degrees. And despite what folks say,
including Shlomi Ben Haim in the keynote at SwampUp, developer jobs aren't at risk.
I think that AI technology continues to build
and create new opportunities and new technical challenges
that you need really smart people to solve.
And my best advice for folks would be
the real skill set you need to learn in college
is how to solve hard problems.
And so
if you feel challenged, if you're in a
degree program or if you're taking
something where you feel like
you're learning new material,
you're able to
use
your expertise to solve problems,
to reason about things, to make
a difference. That skill set, the ability to pick up a new challenge, reason about
it, LLMs don't reason, humans do, right, and actually come up with some creative solutions,
that's what's going to be the valuable skill set going forward. And maybe we're not going to be sitting there
and coding basic algorithms for doing list sorting
and all that stuff in the future.
I mean, hopefully people aren't doing that
other than, like, Programming 101,
because machines are better at that
and they're better at optimizing those things than we are.
But hopefully we're the ones
who are actually taking the real world problems
figuring out how to solve the hard problems. And actually, MLOps is a great example of this, because
figuring out how to observe, how to secure, and how to make sure that machine learning models
are being taken from the developer all the way through to production is a hard problem, and a
machine isn't going to solve this for us. This is something for which you need people who understand the
problem, understand the space, and can do it. Actually, even my talk here at SwampUp is about
using one of the tools in that tool chain, Artifactory, which you can use as a model repository,
pulling information out of it, feeding it into a knowledge graph
using the Neo4j Knowledge Graph Builder.
And in a couple hours on a weekend or an afternoon,
you can basically have your own LLM
with enterprise information
from your machine learning pipeline
and then start asking questions like,
oh, which dependencies have an MIT license on them?
Or what's the latest version of this library?
You can start asking your own little system
these questions.
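For instance, the kind of query such a system might end up running under the hood could look like the sketch below, with the caveat that the node labels and relationship names are a hypothetical schema made up for illustration, not the actual Knowledge Graph Builder output:

```python
# Hypothetical: once artifact metadata lives in a graph, "which dependencies
# have an MIT license?" becomes a simple Cypher query. The schema (labels and
# relationship names) is made up for illustration.
from neo4j import GraphDatabase

MIT_DEPS_QUERY = """
MATCH (a:Artifact)-[:DEPENDS_ON]->(d:Dependency)-[:HAS_LICENSE]->(l:License {name: 'MIT'})
RETURN a.name AS artifact, d.name AS dependency, d.version AS version
"""

def mit_licensed_dependencies(uri: str, user: str, password: str) -> list[dict]:
    # Connect, run the query, and return each row as a plain dict.
    with GraphDatabase.driver(uri, auth=(user, password)) as driver:
        with driver.session() as session:
            return [record.data() for record in session.run(MIT_DEPS_QUERY)]
```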
Cool.
So I think that it's an exciting time
for folks who are in technology professions
because if you're able to learn
to reason and to create and understand new problems and new challenges, then you'll have pretty much work no matter what field or degree program you go into.
Totally agree. Cool. It's awesome having you here at SwampUp here in Austin, and
I look forward to chatting with you later on. Thank you so much for entertaining the folks.
Yeah, no, thanks for having me on the show, and I'd love to join the discussion on Discord as well.
Cool, that'd be great. We'll look forward to it. Thank you.