Programming Throwdown - 176: MLOps at SwampUp
Episode Date: September 24, 2024

James Morse: Software Engineer at Cisco
  System Administrator to DevOps
  Difference between DevOps and MLOps
  Getting Started with DevOps
Luke Marsden: CEO of HelixML
  How to start a business at 15 years old
  BTRFS vs ZFS
  MLOps: the intersection of software, DevOps, and AI
  Fine-tuning AI on the Cloud
  Some advice for folks interested in MLOps
Yuval Fernbach: CTO MLOps @ JFrog
  Starting Qwak
  Going from a Jupyter notebook to production
  ML Supply Chain
  Getting started in Machine Learning
Stephen Chin: VP of DevRel at Neo4j
  Developer Relations: The Job
  What is a Large Language Model?
  Knowledge graphs and the Linkage Model
  How to Use Graph Databases in Enterprise
  How to get into MLOps

★ Support this podcast on Patreon ★
Transcript
Programming Throwdown, Episode 176: MLOps at SwampUp. Take it away, Jason.
all right folks so this is going to be a bit of a different
episode than folks are used to. I'm actually at the SwampUp conference, which is hosted by JFrog.
And, you know, probably the number one email or message that we get is:
can you explain DevOps and MLOps? Can I get a job in these areas? People see it as a really good gateway to getting into tech.
And we get tons of requests for it.
And so I had this amazing opportunity to talk to four different people with very different backgrounds
who all kind of ended up in this discipline and can share their stories with us
and also explain the technology, explain what DevOps is, explain what MLOps is. And so, you know, we're kicking this off with James Morse from Cisco. So
thanks for coming on the show, James. No, thanks for having me. Cool. So why don't you kind of tell
us, actually, let's start off with, you know, what do you do at Cisco? What is your title? And what
are you responsible for? Sure. So probably my title
is
technically DevOps engineer
categorized right now under
software engineer, which
is a perfect segue because of the blurred lines
versus everybody's, you know, constant
battle like you're referring to.
But yeah, so DevOps engineer
for a team currently called
Enterprise DevOps as a Service.
So we actually provide DevOps services to other teams within Cisco, try to ease that segue for them, whether they aren't quite on a DevOps path yet, or just need specific services that fall under that umbrella.
And just supporting those systems and helping those teams.
Cool, that makes sense. So give us a little bit of your background.
Like, how did you get into where you are now?
Sure. Yeah.
So I went to Guilford Technical Community College,
which is in North Carolina,
and got a degree in both computer information systems,
sort of general IT degree,
all kinds of stuff under that umbrella,
server administration, Windows server, Linux systems, and then also decided on the parallel path for getting the networking technologies
degree, which coincidentally was most focused on Cisco hardware.
So routing, switching, kind of think of the path for getting like a CCNA certification.
Now, what is CCNA?
Cisco Certified Network Associate.
They've kind of floated around. It's just Associate, or Admin; there are a few different ones, and then there's CCNP for Professional. Different layers, and, you know, it kind of starts at the bottom. I think now they've split it up again, but I've long since been in that world, so I'll embarrass myself as an actual Cisco employee now. But yeah, it's the entry level, at least at the time for me, and I did get that degree. It was, again, your basic, like, setting up routers, which is under Cisco's umbrella. So, you know, you learn how to do subnetting by hand, and do it literally with a Sharpie on a dry erase board.
Oh, wow.
Yeah.
Oh, wow.
Yeah.
I was pretty intimidated,
but you learn a lot
and it's a lot
that still helps me today,
you know,
configuring systems and having that understanding
of IP communications and setting up different subnets
and gateways and stuff.
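(For anyone following along at home, the kind of subnetting math James describes can be sanity-checked with Python's standard ipaddress module; the addresses below are just made-up examples.)

```python
import ipaddress

# Carve a /24 into four /26 subnets, the kind of exercise you'd work out by hand.
network = ipaddress.ip_network("192.168.10.0/24")
for subnet in network.subnets(new_prefix=26):
    hosts = list(subnet.hosts())
    print(subnet, "first usable:", hosts[0], "usable hosts:", len(hosts))

# Check which subnet a given address falls into.
addr = ipaddress.ip_address("192.168.10.130")
print([s for s in network.subnets(new_prefix=26) if addr in s])
```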
Cool.
And so at some point you were like a system administrator
and then you transitioned to DevOps.
So what is the difference there?
What did that transition look like?
Like what's the... Yeah, so when I first started,
I graduated around 2009
and then quickly was placed at a company
my title was VoIP support, something along those lines. And it ran the gamut. It was pretty much somewhere between support technician, taking first-level calls, and, because it was a smaller company, doing what traditional sysadmin work was then. The systems were all Linux based, they were CentOS at the time, and, you know, on top of that you had different applications installed for running various things, mostly voice over IP software. So all kinds of system administration
at that time,
like mostly at that company,
it was,
they were wanting somebody
with a Linux background.
So Linux sysadmin,
typically,
and probably still to some degree today,
sysadmin work is usually split
somewhere between like
you're specializing maybe in Linux,
Unix,
all those types of systems
to the more Windows side, Microsoft side, Microsoft Server,
those types of things.
Obviously now it kind of runs the gamut
because you have cloud services and things like that.
So you could still technically maybe be system admin
or some type of cloud engineer
and not really be a true DevOps or DevOps engineer.
The transition there was, I think, about a year into that company
was when I started hearing the buzzwords about cloud.
Like the CEO I worked under
was very much trying to, you know,
keep his finger on the pulse of things
and like to be, you know, very bleeding edge
as far as any kind of new trends
and try to, you know, capitalize on that,
help people that were trying to maybe get to the cloud.
What did the cloud even mean at the time? It was sort of, you know, almost the DevOps of now. People who are in the industry, they're in cloud. Now you might hit somebody who's not in the industry who kind of gets what that means, but I rarely run into somebody who's not in the industry who's just like, oh, I know what DevOps is. Right? So cloud then was sort of what the DevOps term is now. It was the buzzword.
So that very quickly led into
some blurred lines
of what I would consider CICD,
continuous integration and delivery or deployment.
And that still wasn't truly what I would call DevOps,
but it was starting to go the route of more automation.
So less like, okay, I need to deploy something.
That might involve opening up something like WinSCP
and copying some files over at the right time.
Okay, we're going to deploy on this Sunday
at some weird hour, 11 p.m.
And then copy these files over,
maybe restart some services.
Hopefully you don't have a lot of those.
If you're a smaller company, maybe you have to do that, you know,
in some for loop for, you know, hours if you do have a lot.
So that transitioned to things like, you know, a lot more scripted stuff.
Things like Ansible were starting to come up where you can just, you know,
write some playbooks, run those playbooks,
and those things would happen automatically.
And so that sort of blurred the line for me, at least, in getting into DevOps,
where it was more, okay, you've got your code and hopefully a good SCM,
something most famously like GitHub, and maybe I have
a branch, something like production or main, something along those lines.
I'm working in my feature branch, and I get
merged into dev and that
dev eventually maybe gets
reviewed, hopefully gets
reviewed by one or two people
and so on until it maybe is
slated to be merged for production.
And then in my
idea in true DevOps,
there's
at least at some point where
you're going to merge that into your production branch, whatever that's called.
And a lot of things are going to start kicking off automatically.
So instead of this guy who's just waiting for whatever time, frantically logging into systems and clicking and dragging if it's UI based, or running for loops in something like a Linux CLI or terminal, it should be a lot more automated, and maybe have some checks and balances. So if this thing fails, don't do this step, and kick off an email, or, a lot of people use Slack or things like that, at Cisco we use WebEx, and maybe it posts into there that something succeeded or failed. But there shouldn't be a whole lot of manual intervention, if any in a really ideal situation, or what some people have deemed "click ops," where you're just in there manually fixing or changing things.
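(As a rough sketch of the kind of check-and-notify automation James is describing, here's a toy Python example; the webhook URL and deploy script are hypothetical placeholders, not anything Cisco actually uses.)

```python
import json
import subprocess
import urllib.request

WEBHOOK_URL = "https://example.invalid/notify"  # hypothetical chat webhook (Slack/WebEx style)

def notify(message: str) -> None:
    # Post a short status message to the team channel instead of relying on someone watching a terminal.
    payload = json.dumps({"text": message}).encode()
    req = urllib.request.Request(WEBHOOK_URL, data=payload,
                                 headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req)

def deploy() -> None:
    # Placeholder deploy step; in a real pipeline this might be a playbook run or a container rollout.
    result = subprocess.run(["./deploy.sh", "production"], capture_output=True, text=True)
    if result.returncode != 0:
        notify(f"Deploy FAILED: {result.stderr[:200]}")
        raise SystemExit(1)  # block the rest of the pipeline
    notify("Deploy succeeded")

if __name__ == "__main__":
    deploy()
```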
That makes sense. So would it be fair to say that system administration is the part that is left over,
because it's either not worth automating or you just haven't automated it yet.
And DevOps is this automatic tool chain that covers the rest.
Is that like the right way to frame it?
Yeah.
So that's why a lot of people, you know, kind of get into this philosophical argument: there were so many things that came about around the same time. When I first started, all the things now that sort of fit together, like DevOps, CI/CD, even agile development, were barely even being talked about. I mean, again, there were some light buzzwords like cloud, and some of those ideas were being pursued, but not with that phrasing or those terms.
And literally within my career,
watch it actually evolve into those things
where people are using those terms daily.
So a lot of those things in my mind get blurred
because they sort of all came out at the same time.
Although technically they're completely separate.
You could be doing Agile for example,
but not actually be
using any DevOps services.
I can't imagine that world.
I've never seen it in person.
I've seen maybe DevOps not being used
ideally or have some pieces
missing that would make it better.
Similar to
CICD, you could be technically doing
infrastructure as code,
like something like Ansible,
and maybe even doing it with CICD
or like the Git flow that I was talking about,
but not really still be doing
all the other parts that come with DevOps.
So they don't have to go together,
but sort of the complete package in my mind
is doing something like using an agile development
and then kind of pairing that with CI CD and DevOps
workflows. But in general, the idea is, of course, that you're using the most common tools; for DevOps that would be something like GitHub and GitHub Actions, right? So now I've got that merge I was talking about, and that's getting kicked off maybe by an Actions workflow. And that Actions workflow
just means, like, for example, to use one of the examples from earlier, you know, at SwampUp I think in one of the trainings they were using something pretty common, Node, Node.js, NPM. So that NPM build might actually be kicked off by an Action, instead of somebody sitting there running all these NPM commands and then, you know, pushing that somewhere, maybe like JFrog Artifactory, and then some system pulling that down again to be in production.
You could do all that manually
and be still using those tools.
So DevOps isn't just using those tools, right?
CICD isn't just having that flow.
Again, there's this chart, or visualization rather, that usually comes up, where it almost looks like an infinity symbol: as you have that deployment, you want that immediate feedback, so as soon as you've got a deployment you're already sort of on the path to the next release, because it's this smooth transition, right? So you can have lots of minor iterations and sort of break the barrier that existed when I started as a traditional sysadmin. I rarely talked to any developers unless there was a problem, right? Systems down or whatever. That was sort of the typical MO, versus, oh, I just need maybe some change or some help on Actions.
There's a problem.
I need new permissions or something. But if they're given all the right permissions, and they've got, you know, management and checks and balances on things like PR reviews to get into that automated production workflow, then that's where that DevOps term, in my mind, kind of comes out, because the developer side, the development side, and operations, ops, are sort of blended,
because as a developer,
even though maybe I don't know too much about
the kind of stuff I used to do,
Linux administration, Linux OS,
and how these files get where they need to be.
Do they have executable bits?
I'm not worried about all that.
I just know that if I merge this PR to main production,
whatever it is, that all these things kick off.
And if there's a problem, I don't
understand, maybe I call, you know, DevOps support
or whoever it is, maybe
it's even just a technical lead or somebody
that knows, like, oh, that error just means this,
you know, start a new
feature branch and fix
this and try it this way and that.
So it kind of takes that
out of the aspect, at least in that
flow. It doesn't mean that nobody's setting up those systems
that are running it,
because that's kind of where my current role as a sysadmin comes in. Like, we're deploying things; we're here at JFrog SwampUp.
My main role is managing, deploying,
and maintaining Artifactory itself.
And that's still, in some cases, on a server.
We both have a server in the cloud
and all kinds of different deployments.
Those still have to get deployed by somebody.
But once all those things are kind of interconnected,
they allow that continuous workflow
for that developer to not have to worry about those things.
Yeah, that totally makes sense.
I remember, like, I've seen this through my own career. I mean, I've mostly been kind of on the research side, but my wax wings have put me a little bit too close to the sun every now and then; I have been burned by the production gods every now and then as we go. And, you know, in the earlier days of my career they would build a new build every week, and it would be a manual process, and there was an IRC channel and you would just get pinged, like, hey, you made this change and this file doesn't compile, go take care of it. And then, as you said, it's become more and more automated, to where most companies are just pushing new versions all the time, and they're even automating the failures, so the contingencies are all automated. And so now what you're doing is just orchestrating this self-healing machine, which is really impressive to watch and to be a part of. So now, as you said, it's more like you get a Slack notification: hey, this file that you changed is actually being used in this other system you might not have even known about, and it broke it, and so you need to revert it, that kind of thing. But it's very automated now, which is really cool.
Absolutely. I think automation is a huge part of it, for sure.
And then it's also the sort of the interlinking
between those automations.
Because automation's predated DevOps
and the terminology there.
But I think the true thing is,
kind of going back to that infinity loop diagram
or visualization, whatever you want to call it,
that a lot of people bring up.
And I've seen it a couple times even here this year
and last year.
I was lucky enough to go to SwampUp last year.
And similar,
it's that how they're all intertwined now
when they're all playing together well
and then they've been configured right
and you're taking advantage
of all the modern features.
You really shouldn't have to be doing much
unless there's, again,
there's a problem.
In a lot of cases now, a lot of what has been showcased, both at last year's SwampUp and this year, is a big focus on security. So there's the term now DevSecOps, and several similar-type terms, but the idea is that somewhere in that flow we were just talking about, something should be scanning, right? Some failure may not actually be that somebody technically did something wrong; they just did something they shouldn't have. You know, maybe there was a secret in something, maybe there was an old version used for something. Going back to that NPM example, maybe they used a version that has a vulnerability in it, so you want to see that fail. So somewhere in that pipeline you either have a plug-in or some version of, you know, if it's GitHub Actions, maybe there's an Action for it. Since we're here at JFrog, they have their own tool called Frogbot that has a lot of integration with Artifactory and Xray, where it's making sure those vulnerabilities don't exist. And you can even have it block the pipeline, or maybe just email somebody if it's the type where maybe it's just a development flow and you don't want it to block, but you do want somebody to know, hey, this should not go to production.
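(Here's a toy Python sketch of that kind of pipeline gate, warn-or-block on a bad dependency; the vulnerability list and package versions are made up, and this is not Frogbot's or Xray's actual behavior or API.)

```python
# Toy dependency gate: fail the build on known-bad versions, or just warn in development flows.
KNOWN_VULNERABLE = {("lodash", "4.17.20"), ("requests", "2.5.0")}  # made-up example data

def check_dependencies(deps: dict, block: bool = True) -> bool:
    """deps maps package name -> pinned version; returns True if the build may proceed."""
    findings = [(name, ver) for name, ver in deps.items() if (name, ver) in KNOWN_VULNERABLE]
    for name, ver in findings:
        print(f"WARNING: {name}=={ver} has a known vulnerability")
    if findings and block:
        raise SystemExit("Blocking pipeline: vulnerable dependencies found")
    return not findings

# In a development flow you might warn but not block:
check_dependencies({"lodash": "4.17.20", "left-pad": "1.3.0"}, block=False)
```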
That makes sense.
What about like, you have this really expansive background in DevOps.
And so how do you see MLOps from your perspective?
What is the sort of delta there?
What is the part,
like,
what's the leap
from one to the other?
Yeah,
it's a really interesting one
because I have,
I've been learning a lot.
There's been a lot of focus on that at this SwampUp.
So,
I've learned a lot.
The keynote was very good
in that.
I think they did a good job
with their diagrams
on literally showing
those two differences.
So,
for my background, since I was more sysadmin, like a lot of people, it's interesting, especially with the term. Some people come in more from the development side, want to do more operations, and they get into DevOps from that. Obviously, I'm in the other boat, which is people with sysadmin backgrounds, or some version of operations, getting in, and then, you know, the development side kind of comes in there. So, not having much background on the ML side or AI, I can at least say that what I've learned already, even within today's talks, is that there are going to be a lot of different workflows. There are a lot more parts and pieces. With everything I was just talking about, you're not just going to have, like, oh, I've got my Git repo, I've got my code, I push it here, it runs some commands, it builds stuff, and eventually, as long as it's got the right Actions, or if you're using Jenkins and you've got the right Jenkins plugins, it's going to get my app deployed.
Not to say that's simple, or to downplay some of the more advanced apps, but when you're talking MLOps, there's a lot more at play. Like, am I using a public model from something like Hugging Face, or something being developed in-house completely from scratch, if you have the resources for that? Obviously there's so much out there that the latter is a lot less common, but still plenty of people do it, especially depending on how sensitive the work is. So there are so many more moving pieces, but I think the idea is still the same. You're just going to see, especially as JFrog is already integrating with some of that, that it shouldn't be as awkward as it was in its infancy, where a lot of that was going to be customized stuff. Now it should be, again, a very similar workflow. But my short answer would certainly be that there are going to be a lot more pieces there that you're going to have to integrate. Even just from a storage perspective, the models are a lot larger than just pulling maybe some Python libraries. A lot of people have worked with Python; they've run an import, it doesn't crash your machine, doesn't fill up their hard drive, right? It doesn't take many models to fill up at least a modern slim notebook. You might just have, you know, 500 gig or something on your hard drive; maybe three or four models could easily fill that, if not just one large one. So now you're thinking, okay, we're going to have these properly set up to be stored correctly and optimized. So there's a lot more at play, at the very least, than just standard DevOps work.
That makes sense.
And so we have a lot of folks listening
who are just starting their careers.
What advice would you give for people?
Let's say folks are in high school
or maybe folks are thinking of going back to college
and they're deciding between that and a coding bootcamp
or learning on the fly.
What advice would you give for young folks
who are just getting started
if their goal is to get into a DevOps career path?
Yeah, I've seen that question a lot,
even a couple of friends personally
have asked that question, so I've given it a lot of thought.
I think depending on what your background is,
it's going to really sort of change that answer, at least to some degree.
But I think the one common thing is certainly to just not take for granted the wealth of information that's out there, whether it's free resources like YouTube, tons of great videos that teach you all the different pieces of things like GitHub, GitHub Actions, Jenkins, all the different tools. I wouldn't necessarily get too caught up in, well, I need to master a specific language, whether it's Python or whatever, but it's going to take at least a common core understanding of some programming language, having the basics of that, because no matter what you're doing, even on the side that I'm on, you know, maintaining something like Artifactory, you're going to have some language, whether it's for scripting or for automation. So have some good basics of that, and then also, again, take advantage of all the information that's out there on things like YouTube and Udemy that are very affordable, very accessible. When I started, you would go find the closest relevant O'Reilly book and have it by your desk. It wasn't uncommon to see a few O'Reilly books, or a whole shelf of them, in whatever team's room you worked in, and you'd go grab one off the shelf. Not to say that the information wasn't available, but that was still sort of the habit, instead of searching and retrieving from various posts on things like Stack Exchange. At least early on, it may have been more difficult than now, especially with things like ChatGPT or anything GenAI.
So don't take for granted the amount that is out there.
Just use that to your advantage
and at least learn
those common tools,
GitHub and some type of version,
something like Actions or Jenkins
that help you automate those things
and just start playing around.
A lot of those things right now
are free for open source and
students and things like that.
I should be able to at least break in
at minimal cost or
even free in some cases.
Yeah, totally.
I use GitHub Actions extensively for a lot
of my open source stuff and they haven't charged
me yet unless there's some mounting bill somewhere.
Right, yeah.
Unless I get a collections call tomorrow, we'll see.
Yeah, so that's the great thing.
When you're learning, don't be frustrated when you break things.
Part of it is see how you can break it.
Break it different ways and learn how those broken things are solved
and how you can figure out, change ways to keep the automation going
even if it hits something that's not critical; maybe it was meant to continue on anyway. And that's sort of the idea: don't get frustrated at things failing. Figure it out and use that as a learning opportunity, basically.
Yeah, definitely. This is great. So I think one common thread that we've talked about on the show, that we're just double clicking on here, is, you know, build cool stuff. That's basically the short of it. If you build awesome stuff, you're going to have to maintain it. I guess one thing with DevOps is, you'll have to build something for other people, or build something where you need to have a process. But even if you don't, I mean, you're still going to get
a great experience as a developer
and that will catapult you
into a DevOps career somewhere.
Yeah, absolutely.
Cool. Hey, James, I know you have to rush.
Thank you so much for taking the time
to talk to us.
I really appreciate it.
And if I get any questions for you
from the audience,
I'll shoot you an email.
Yeah, always happy to help. And it was great being here; I appreciate you having me.
Cool.
Thanks a lot.
Hey, everybody.
So we are here at SwampUp with Luke Marsden, who's the CEO of HelixML.
Thanks for coming on the show, Luke.
It's great to be here.
Cool. So why don't you give people a little bit of a background into how you kind of arrived
into starting Helix ML and what's your kind of backstory?
Yeah, absolutely. So I'm a startup guy. I've been doing startups my whole life. This is startup
number three. Back when I was 15, I started a web hosting company. Wow. And I then went and did computer science at Oxford.
And out of the back of that experience,
I was really inspired to try and solve some of the practical problems
we had in the web hosting company.
And I did that by building a distributed web cluster.
And then that evolved into,
we ended up pivoting that business into solving storage for Docker
because when Docker exploded, we were like, we already have all of this tech for dealing with stateful containers, because we were using FreeBSD jails, and so we just applied that technology to Docker.
Well, let's dive into that a little bit, because we have a lot of high school folks listening in. So, 15 years old, starting a company: how does that work? How do you go to somebody and say, you know, I will build this website for you and you should write me a check? Did your parents help you with that? How do you do that as a high school student?
I mean, I had a co-founder who I met online on IRC, and that's like the old-school way that we used to communicate back in the day.
I used to love IRC.
It was great.
I mean, I guess now it would be Discord, right?
Yes, exactly.
That was the Discord of our era.
Yeah, exactly.
And yeah, we just decided together
to put together a web hosting company.
And the amazing thing about that
was that we were just able to put it online,
put the website up there,
and start telling people about it.
And people showed up.
And one really fun thing happened
that helped us a lot with that business was,
I've forgotten what it's called,
but there's like this old Perl-based blogging framework.
And the guy who created that blogging framework
found our web hosting service.
And then he left his company and then he said,
I'm really happy with my hosting service
in front of like all of his audience.
And so we got a nice boost in traffic from that.
That's amazing.
Yeah, it was very lucky.
I don't know if I've ever told this on the show, but I did a lot in high school as well, and one of the things I built was a little isometric game engine.
Oh yeah.
At the time there weren't a lot of game engines. This was, like, 1997 or something, and so there wasn't a lot in Java, really anything.
Yeah.
And so it got picked up in a book and it got popular.
And I wonder if, I wonder if people can just build things and get them noticed or if that
time has passed.
I feel like now you probably need to do a little bit more promotion.
There's not, there's just so much content out there now.
I mean, I think that's true, but I think if you have a passion for solving a problem, then give it a go. And if anything, there are probably more ways of getting the word out these days. I mean, we didn't have YouTube back then, we didn't have Discord communities, and more people are online now than were then.
So yeah. Another thing is, like, the vanguard has moved. In other words, if you want to make a Java game engine, I'm sorry, there are just too many of them. But there's a Hugging Face LLM leaderboard, and, you know, there's a variety of them, so if you end up with some passion and some talent, you might win the Greek LLM leaderboard.
Absolutely.
So shall I continue my story?
So yeah, I mean
we did the
storage for, we pivoted that
business
hybrid cluster into Cluster hq which was solving
um storage for docker and then we got involved um in the very early docker and kubernetes days back
before um people really knew how to make um containers work with like databases and other stateful services.
So at that point, yeah, we raised $15 million just by walking up and down Sand Hill Road
and saying Docker and storage in the same sentence.
Now, what is Sand Hill Road?
Oh, so it's a road in Palo Alto
where a lot of the VCs live or work.
Yeah.
So, yeah, there's a large concentration of venture capitalists.
Very cool.
I mean, was that intimidating?
I mean, that sounds just like an extraordinary amount of money and responsibility.
It was intimidating.
It was interesting.
And I think I learned a lot from that experience.
I think the biggest thing I learned was that even if you have a large Series A,
you need to be really, really thoughtful about not growing the company too quickly before you
really truly have product market fit. And so, yeah, that was one of the lessons learned. Like
we grew the team quite quickly and it was challenging. But then, beyond that first company, I had another go. We started out with this idea of versioning for development environments.
Okay.
So a lot of my career I've spent trying to find commercial applications of ZFS, which is a very clever bit of file system technology that came out of Sun Microsystems and then got ported to Linux.
And so that attempt to commercializing ZFS was, well, if you've got a development environment
and you manage to reproduce an interesting bug, let's say, on your laptop, then shouldn't you be able to
not just do a git commit of the code at the point at which you can find the bug,
but also take a snapshot of the local development databases
that you have running, so that you could maybe attach a runnable snapshot of that thing
to a GitHub issue, and then another developer could just pull it down and reproduce the bug immediately rather than having to
click around in the UI to
get the database into a certain state.
Turns out
that AI and
machine learning had a much bigger data
versioning problem than DevOps
did and software engineering. And so we
ended up pivoting to
building out this end-to-end MLOps platform
but with that same idea of
versioning your workspace. And so what data scientists, like AI and ML people, often do when they're developing ideas is they use Jupyter notebooks. But when you're using a Jupyter notebook, it's very hard to keep track of your work very accurately, and even the order in which you run cells can affect the output and things like that. So what we did was to add this snapshotting of your state before and after you did a run. If you did anything in the Jupyter notebook that would, like, train a model, for example, then we'd snapshot before and snapshot after. And then we'd also automatically build up this provenance graph, so you could say, oh, I created this model from this data, but this data was transformed from this other data using this process, and so you can kind of recursively build up this tree structure of how you got to that point.
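(A rough illustration of that kind of provenance graph; this is just a toy sketch, not DotScience's actual data model.)

```python
# Toy provenance graph: each artifact records the process that produced it and its inputs.
provenance = {
    "model.pkl":    {"process": "train",     "inputs": ["features.csv"]},
    "features.csv": {"process": "transform", "inputs": ["raw_data.csv"]},
    "raw_data.csv": {"process": "ingest",    "inputs": []},
}

def lineage(artifact: str, depth: int = 0) -> None:
    """Recursively print how an artifact was derived, walking back through its inputs."""
    node = provenance.get(artifact)
    print("  " * depth + artifact + (f"  <- {node['process']}" if node else ""))
    if node:
        for parent in node["inputs"]:
            lineage(parent, depth + 1)

lineage("model.pkl")
# model.pkl  <- train
#   features.csv  <- transform
#     raw_data.csv  <- ingest
```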
So yeah, that company was called DotScience.
I love that name. Is it like the configs were in the .science folder in home directories? Is that how the name came about?
It actually got really confusing, because people would always put, like, a period and then "science," but it was actually "dot" spelled out. So that was a lesson in naming.
Yeah, Department of Transportation science. You can see other ways that could get misconstrued. So, a lot of people might not know this, but ZFS and BTRFS, yeah, "BitRot FS," have this amazing property called copy-on-write. And the way it works is, you've probably done this even for school projects, you might say, I have a document... I mean, nowadays probably everything's in the cloud so it's kind of transparent, but you might say, I have a set of artwork that I'm working on, some digital art, and I'm iterating on it, but I don't want to lose my history, and it doesn't really make sense to create a Git repository. I might not know how to do that. The simplest thing would be to just create a folder for each day and have my Blender files or my Photoshop files copied into each folder. And then that starts to become really expensive, because you might have all these other assets, and they're just all getting copied every single day, and you end up using up all your disk space.
Exactly.
ZFS has a genius idea: it's basically reference counting. So when you copy a file, ZFS doesn't actually copy it, it just creates a shared pointer between those two files. Now, as soon as you change even one bit of either of those files, then ZFS has to make a copy. But if you don't do that, you can have 100, 1,000 copies of the file, and it's not going to increase your storage costs linearly like a regular file system. And so I've always been fascinated by that.
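(A toy Python sketch of that reference-counting, copy-on-write idea; this is a conceptual illustration, not how ZFS is actually implemented.)

```python
# Toy copy-on-write store: "copies" share the same underlying block until one of them is written.
class CowStore:
    def __init__(self):
        self.blocks = {}   # block_id -> bytes
        self.files = {}    # filename -> block_id
        self.refs = {}     # block_id -> reference count

    def write_new(self, name, data):
        block_id = len(self.blocks)
        self.blocks[block_id] = data
        self.files[name] = block_id
        self.refs[block_id] = 1

    def copy(self, src, dst):
        # A "copy" is just a new pointer to the same block; no data is duplicated.
        block_id = self.files[src]
        self.files[dst] = block_id
        self.refs[block_id] += 1

    def modify(self, name, data):
        block_id = self.files[name]
        if self.refs[block_id] > 1:
            # Someone else still references the old block: copy on write.
            self.refs[block_id] -= 1
            self.write_new(name, data)
        else:
            self.blocks[block_id] = data

store = CowStore()
store.write_new("art_day1.blend", b"...big binary...")
store.copy("art_day1.blend", "art_day2.blend")        # instant, no extra storage
store.modify("art_day2.blend", b"...new version...")  # only now is a second block allocated
```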
I really wanted to install BitRotFS on my latest computer that I got about a week ago, but it wasn't supported by Grub, or some other issues. And so, to your point, it's still not totally smooth yet. And it's been around for a long time, unfortunately.
Yeah, well, I mean, Ubuntu embraced ZFS on Linux, fortunately, and so you can install Ubuntu with a ZFS root.
Right.
So, yeah, I encourage anyone who's brave enough to give that a go, and you can do all these cool things. But, I mean, my newest company, HelixML, we run all of the infrastructure for that in a data center in my basement.
Really?
Yeah, because we're bootstrapping the business, so we didn't want to incur a ton of cloud costs. I mean, it's got a fiber link. And yeah, we use ZFS on Linux for all of the production storage for that. And then, actually, rsync.net, this is an interesting piece of ZFS trivia: rsync.net is a backup provider, but they have support for ZFS send into rsync.net, and then they provide that as a service. So what that does is it means that you can take a snapshot every day
of your production system
and it's an atomic, reliable snapshot
that doesn't take up any extra
disk space like you were saying. And then
you can just send the difference between that snapshot
and the previous day's snapshot over
to rsync.net and it will automatically
keep up to date.
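(For the curious, that daily snapshot-and-send flow looks roughly like the sketch below; the dataset name and remote target are placeholders, not Luke's actual setup.)

```python
import datetime
import subprocess

DATASET = "tank/production"           # placeholder ZFS dataset name
REMOTE = ["ssh", "backup@rsync.net"]  # placeholder backup destination

def daily_snapshot_and_send(previous=None):
    """previous is yesterday's snapshot name (e.g. 'tank/production@2024-09-23'), or None."""
    snap = f"{DATASET}@{datetime.date.today().isoformat()}"
    # Atomic and nearly free, thanks to copy-on-write.
    subprocess.run(["zfs", "snapshot", snap], check=True)
    if previous:
        # Send only the blocks that changed between yesterday's snapshot and today's.
        send = subprocess.Popen(["zfs", "send", "-i", previous, snap], stdout=subprocess.PIPE)
        subprocess.run(REMOTE + ["zfs", "recv", "backups/production"],
                       stdin=send.stdout, check=True)
        send.wait()
    return snap
```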
Anyway, we've ended up falling down a file system rabbit hole.
No, that's fine. Okay, one last file system question, because it's just burning in my mind, and feel free to just cut me off and take a turn out of this rabbit hole. What's the difference between ZFS and BitRot FS? Are they totally different things, or are they two people trying to do the same thing?
Well, I like how you describe it as Bitch Rot FS.
I think that's what it is, right?
Or no?
I think that's a joke.
Oh, really?
I thought that was the official...
I think it's called Butter FS.
Oh, you are totally calling it...
But I think you fell for it.
You know, I have to admit,
I've said this on the show a lot of times,
I am known to be the most gullible person.
You know, different people have their
superhero strengths and weaknesses.
Like I fall for everything all the time.
So I'm not surprised.
Well, no, I mean, I don't mean it in a negative way.
I just think it's, I can imagine someone on the internet calling it
bitch rot FS as a joke, because it's like bit rot.
It's actually, I looked it up.
It's better FS.
Yes.
That is the official name.
Or BtreeFS, which probably makes the most sense, right?
Yeah.
Well, I think BitrotFS is actually quite apt because it's, yeah, I mean, that project has
been plagued with data loss issues and I've never really trusted it
to actually be reliable.
Whereas ZFS was engineered at Sun Microsystems very nicely.
Interesting.
And I was grateful when I stopped having to run Solaris
to use it.
Because it got ported to FreeBSD
and this is like back in the day
when we were doing this web hosting company.
But yeah, anyway.
Well, that is fascinating.
There were licensing issues, right?
But I guess it's all cleared up.
Yes.
And I think the, yeah,
I'm glad that Mark Shuttleworth at Canonical
kind of took a stand on that.
And he said, like, we're going to go ahead with this, our lawyers have cleared it.
That really cleared the way for a lot of other people to say,
like, okay, if it's good enough for Ubuntu,
then we can use it.
Nice. That is great.
That is Ubuntu's real big contribution
is just getting everybody to have faith
in the whole ecosystem.
So, okay, we talked in previous episodes about DevOps.
We had a dedicated show about it.
We've interviewed some folks about DevOps.
For folks who are listening, if you missed the DevOps episode,
hit the pause button on this episode.
Go back, listen to the DevOps episode
because what we're going to talk about now
is the delta between
DevOps and MLOps. MLOps is much newer. You know, we haven't spent a whole bunch of time on it on
the show. But as folks know, AI is becoming, you know, really, really important. And so MLOps
itself is also becoming, you know, important by proxy. So what is MLOps and how does it differ from DevOps?
So if you think of three disciplines,
software engineering,
just like how you write and develop software,
how you test it, how you version it,
and you think about DevOps,
which is how you do CICD for deployment,
how you operate that,
how you do immutable infrastructure in the cloud and things like that.
And then you layer in this third discipline, which is AI ML, which is like this world of
using data to generate models that pull the patterns out of that data and are able to
make predictions, right?
ML Ops is the intersection of those three disciplines.
So it's the intersection of software, DevOps, and AI ML.
That makes sense.
And so if somebody wants to get started in the field of MLOps,
how do they not get overwhelmed with all three? Because with all three of those, you could spend a lifetime; they're all crafts of their own. I think data science and data engineering is a craft, software engineering is a craft, and, you know, AI and making the loss go down, it's a craft I've spent many, many years on.
So how do people get in that intersection?
Is it the kind of job where it's really more of a mid-senior level
or can folks kind of get into that and what would that look like?
I would say that in order to become an MLOps practitioner,
you only need a little bit of all three.
So for example, if on the software side, you learn a bit of Python and you're comfortable with Git, on the DevOps side, you get comfortable with Docker and containerizing things and
maybe deploying Docker Compose, maybe push out into Kubernetes.
And although I'd say Kubernetes is optional.
And then on the AIML side,
if you just like train a linear regression model
or something in PyTorch or like XGBoost,
I mean, that's enough to get you started.
And then you can start looking at tools like MLflow,
which allow you to keep track of model artifacts
and track runs.
That was something we were big on in my second company,
DotScience, was this idea of run tracking.
And yeah, deploy one of these models into production
using Git, Docker, PyTorch or whatever,
and now you're an MLOps engineer.
And then you can take it from there
and there's lots more sophistication
and how you scale the systems and so on.
But basically, that's what you need.
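(As a concrete starting point along those lines, here's a minimal sketch: train a tiny linear regression in PyTorch and track the run with MLflow. It assumes both packages are installed, and the data and hyperparameters are toy values.)

```python
import torch
import mlflow

# Toy data: y = 2x + 1 with a bit of noise.
x = torch.linspace(0, 1, 100).unsqueeze(1)
y = 2 * x + 1 + 0.05 * torch.randn_like(x)

model = torch.nn.Linear(1, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = torch.nn.MSELoss()

with mlflow.start_run():
    mlflow.log_param("lr", 0.1)
    for epoch in range(200):
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
    mlflow.log_metric("final_mse", loss.item())
    mlflow.pytorch.log_model(model, "model")  # saved as a run artifact you can later deploy
```

Containerizing a script like this with Docker and wiring it into a Git-based workflow is essentially the "little bit of all three" Luke is describing.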
So it might sound intimidating
because it's like three different disciplines.
But yeah, if you just take a little bit of each one, then you can get up and running. And I would shout out to the MLOps community: mlops.community is actually a community that we started at the end of DotScience, actually right when the pandemic started. Our sales pipeline dried up, and we were like, what are we going to do? So, let's start an open community around MLOps.
It's like, I'd always wanted to do that.
We bootstrapped that community from nothing with another tech community called BrizTech
in Bristol, where I live in England.
And then my colleague Dimitrios took the MLOps community and ran with it.
And we're now like over 20,000 people on Slack.
Wow.
We've got a meetup in San Francisco this Thursday, which we're hosting.
And there's meetups all over the planet.
So it's amazing how these things can evolve.
Wow, that is remarkable.
But there's tons of really great material on MLOps, on the MLOps community.
There's a good YouTube.
And yeah, I'd recommend that as a
resource.
Yeah, that is great. I love how you took such a kinesthetic approach to it. I'm right on the same page. I think, you know, I'll read the book when I have something half built and it doesn't work, and it's like, okay, now it's time to read the manual. I know some people are the opposite. My manager used to be Peter Norvig at Google, and he would read the book first. So it's like, oh, we're going to use Python? Okay, I'm going to just start reading the Python manual from page one.
Yeah, my wife is like that.
Yeah. I mean, but I think that, for me, what works is starting with some
kind of problem. And what I've learned over time is actually even better than that, and this is where I'm not there yet, but I'm trying, is actually starting with a customer.
Yes.
You know, I think starting with, you know, a person who has a problem. But before that, I think using yourself and saying, okay, what is a problem in the real world for me? Starting with that, and working backwards to, you know, what AI thing do I need to build, and then what language do I need to build the AI thing in, and go from there.
Well, if I may tell the story of the most recent company, HelixML.
Yeah, definitely, dive into that.
So, yeah, after DotScience I did consulting for a few years,
and worked with clients all over the world, which was great. I really recommend it, actually; being a one-man, or one-person, consulting company can be really amazing. But then what happened: I was watching this kind of open source AI space
kind of towards the middle to end of last year,
like August, September, 2023.
And I saw these two really interesting things happening.
The first one was that Mistral 7B came out.
And now suddenly, suddenly you could have a good quality LLM,
like a chat GPT level-ish LLM
that you could run locally on your own machine.
And the other piece was that it became possible to both run
but also fine-tune those models on consumer hardware.
So you could now have an almost ChatGPT-level model that you could fine-tune on your own private data on, like, a single 3090, the kind of gaming GPU that you might have in your home PC.
Now, let's dive into that a little bit for folks, so,
you know, a lot of people have heard the word fine-tuning, but, like, how do people actually do that? I'll say what I think it is, and you can correct me and fill in the gaps. So when they trained Mistral, they had some PyTorch or TensorFlow or MXNet, whatever it is, they had some code to train Mistral. They give people that code, and so you can basically run that code on the current model and basically continue training on your own data. Is that pretty much how it works?
Exactly. So fine-tuning is just more training. You take the weights of an existing model, and you train it more on training data that is your own private training data. It is a little bit more complicated than that, because there's a technique
that's often used called low rank adaptation, which I don't fully understand the math, but
it's some sort of matrix decomposition where you end up just having to train like a much smaller
set of weights than the whole set of weights. And that's really cool because it makes it tractable to train or
to do more training on this model
but without needing huge
memory requirements. And that's what I mean by
it became possible to fine-tune
Mistral yourself on a
single GPU. And that
depends on that kind of low-rank...
So is it holding the original
model in the CPU memory and
then the GPU memory has the low-rank version?
That's a really good question.
I mean, I think everything fits in GPU memory,
but I think it only needs to do backpropagation
on the smaller matrix.
Oh, that makes sense.
Yeah, so it's able to do that with fewer resources
and it also just takes less time.
Right, that makes sense. Yeah, because for folks who don't know, the way backpropagation works, and we talked about this a little bit in the AI episodes, is you keep this gradient matrix, so it effectively doubles the amount of memory you need, because for every matrix you need this sort of shadow.
Right, yeah.
And you might need to up the resolution too; like, maybe the model can run in 8-bit, but for training it has to be 16-bit, and so now you're talking about a 4x multiplier. And so if you can just do inference on the big model, it can sit in 8-bit, and then do the training on the smaller 32-bit or 16-bit model. That is really cool.
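(For a sense of what LoRA fine-tuning looks like in code, here's a rough sketch using the Hugging Face transformers and peft libraries; the model ID, target modules, and hyperparameters are illustrative choices, not the settings Helix uses.)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "mistralai/Mistral-7B-Instruct-v0.2"  # illustrative model id
tokenizer = AutoTokenizer.from_pretrained(base)
# Optionally quantize the base weights (e.g. 8-bit via bitsandbytes) to save memory.
model = AutoModelForCausalLM.from_pretrained(base, device_map="auto")

# LoRA: train small low-rank adapter matrices instead of all 7B weights.
config = LoraConfig(
    r=8,                  # rank of the adapter matrices
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of the total parameters

# From here you'd run a normal training loop (or the transformers Trainer)
# over your own question/answer pairs, backpropagating only through the adapters.
```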
Yeah, so I noticed these two really interesting things
happening in the world.
And I turned to my friend and co-founder Kai in Bristol
and I said, it's time to have another go.
So was he with you at DotScience?
Yeah, we kind of got the band back together.
So for Helix, yeah, we saw this opportunity, and we went in and then spent two, three months furiously hacking together the stack. I mean, we joked that it took us 10 years to know what to build in 10 days.
Yeah, that's how it goes. But yeah, we put together this stack that
allowed you to deploy these open source models like Mistral 7b and also fine tune them. And to
make that easy for people with like a nice web interface where you just drag and drop in some
PDFs or some documents. And then we did this interesting piece around the fine-tuning where we would take the source documents, chunk them down into little pieces, and then we would use an LLM to generate training data from those source documents.
Because what you want to do when you're fine-tuning is train on data that's similar to the kind of questions that a user would ask of it,
assuming it's an instruct style model,
like a question answering model.
So you can't just train it on the raw text
because then it will just be good at completing the raw text.
But what you need is you need to train it on things
that are like the questions that users are going to ask.
So we actually use another LLM
to transform the source documents into
questions and answers about the source documents. And then we use those question answer pairs to
fine tune the model. Our most popular blog post was how we got fine tuning Mistral 7b to not suck.
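(A toy sketch of that question-generation step; the chunk size, prompt wording, and the generate() helper standing in for an LLM call are all hypothetical.)

```python
def chunk(text: str, size: int = 1000) -> list:
    # Naive fixed-size chunking of the source document.
    return [text[i:i + size] for i in range(0, len(text), size)]

def make_qa_pairs(document: str, generate) -> list:
    """generate(prompt) is a placeholder for a call to whatever LLM you use."""
    pairs = []
    for piece in chunk(document):
        prompt = (
            "Write three question/answer pairs that a user might ask about the "
            "following text, one per line as 'Q: ... A: ...':\n\n" + piece
        )
        for line in generate(prompt).splitlines():
            if line.startswith("Q:") and " A:" in line:
                q, a = line.split(" A:", 1)
                pairs.append({"question": q[2:].strip(), "answer": a.strip()})
    return pairs

# The resulting pairs, not the raw text, become the fine-tuning dataset,
# so the model learns to answer questions rather than just continue the document.
```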
That blog post goes into a bit more detail. So maybe this is a dumb question, but these models are known to be trained
on enormous data sets.
And I think there's one called the pile.
It's like 10 terabytes of text or something.
Some enormous amount.
How does your fine tuning have any statistical significance
when you compare it to just this enormous data set?
Like, how are you able to move the needle on the model?
Yeah, so I think it has to do with the learning rate that you choose when you're doing the fine-tuning. And what has been found kind of empirically is that even just using a very small number of samples in fine-tuning, with, I guess, a relatively high learning rate, makes the model change quite a lot, but not too much. Because if you go too far, then it just goes haywire and starts spouting garbage. So you've got to find this middle ground.
But even with a small number of samples, you can get the model to start generating things that are similar to the fine-tuning dataset
or start generating responses in a different style
or with a different structure really quite quickly.
And then it's about finding that trade-off.
You don't want to over-bake the thing
so that it just kind of memorizes that
and forgets all of its prior learning from the pre-training.
Makes sense.
But yeah, we tuned those parameters and we found something that worked. But then the interesting thing, I guess commercially, from a business perspective, is that we went into the market in December last year with two hypotheses. The first hypothesis was: people will
care about running models on their own infrastructure.
People will care about local LLMs.
And the second hypothesis was, people will care about fine-tuning, and in particular, fine-tuning for knowledge, which was that piece that we developed.
And what we heard back from the market was a resounding yes to running models locally. So we launched on December 21, and on December 24 this company shows up on our Discord and they're like, we're based in Germany, we're really interested in self-hosting these models, we're not ML experts, but we see the opportunity. And by January 1 they'd already integrated Helix into their stack.
Wow, this is incredible.
And then they were going after big enterprise customers
in partnership with us.
And so that was really exciting.
It's like the universe was telling us there is an appetite
for running LLMs locally and building the capabilities to make that easy to do.
Now, in terms of the second or the first, well, the second hypothesis about fine tuning,
we actually got a big meh from the market. Like we put all this effort into fine tuning stuff.
Everyone just said like, oh, we just want to do RAG, like retrieval augmented generation.
Yeah, interesting.
Why do you think that is in hindsight?
I think because fine-tuning for knowledge
is slower than RAG,
and the results are not as good.
So it was kind of an experiment to see,
oh, well, maybe people will care about it.
But yeah, I mean, and so what we did was we extended the stack.
So we added RAG to it.
And the other thing we did was we added API calling,
which is where you can give the model an open API,
which is confusing because it's not open AI,
it's open API spec.
It used to be called Swagger.
A Swagger spec.
It's easier to just describe it as a Swagger spec.
So you give the model a Swagger spec,
and then you give it a query from the user,
and you basically say,
a bit like how function calling works,
you say to the model,
you've got these three APIs you can call,
and please class...
So it comes in three parts, the API calling.
The first part is classify the user's question
and then tell you whether the user's query requires an API call.
Yes or no, and if so, which one?
And then the second step is constructing the API call.
So assuming that you want to make an API call based on the user's question, like, can I rent a crane that can handle three tons in Hamburg next Thursday,
like construct an API call that will query the product catalog with the correct question. And
then the third part is taking the response from the API. So the system will execute the API call
for you. And then taking the response from that API and summarizing it back to the user.
So from the user's perspective, they're just saying, like,
hey, have you got a crane?
And the model quickly says, yes, I've got one.
I've got these three available.
They cost this much.
But what's actually happening underneath is that classification,
API call construction, and summarization steps.
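(Sketching those three steps in Python, with the llm() helper, the catalog endpoint, and the prompts as hypothetical placeholders rather than Helix's actual implementation:)

```python
import json
import urllib.request

def llm(prompt: str) -> str:
    raise NotImplementedError("placeholder for a call to whatever model you run")

def answer(user_query: str, api_spec: str) -> str:
    # Step 1: classify whether an API call is needed, and which operation.
    decision = llm(
        "Given this OpenAPI spec:\n" + api_spec + "\n"
        "Does the question below need an API call, and if so which operation? "
        'Reply as JSON like {"call": true, "operation": "searchProducts"}.\n'
        "Question: " + user_query
    )
    plan = json.loads(decision)
    if not plan["call"]:
        return llm(user_query)

    # Step 2: construct the API call (request body) from the user's question.
    request_json = llm("Build the JSON request body for operation "
                       + plan["operation"] + " that answers: " + user_query)
    req = urllib.request.Request("https://example.invalid/catalog",  # hypothetical endpoint
                                 data=request_json.encode(),
                                 headers={"Content-Type": "application/json"})
    api_response = urllib.request.urlopen(req).read().decode()

    # Step 3: summarize the raw API response back into a natural-language answer.
    return llm("User asked: " + user_query + "\nAPI returned: " + api_response +
               "\nAnswer the user in one or two sentences.")
```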
Interesting.
So that API calling feature has been really commercially successful for us
and tons of people
are interested in that.
And so I'll just shout out quickly to
the fact that we did our 1.0 launch
last week. So yeah,
if you're interested in running models locally,
we now run
on Windows,
Mac, and Linux. You can run it alongside
Ollama. We have a nice application editor,
so you can click buttons to set these things up
and add knowledge with RAG,
add API integrations, and so on.
So if people don't have, let's say, a GPU,
could they still run Helix on, let's say, an EC2 instance?
Yes, and you can even just run it on CPU
because Ollama runs on CPU as well.
Oh, okay. Got it.
However, if you have a Mac,
like one of the more recent M1, M2, M3 Macs,
Ollama also works with the GPUs on those machines.
So you actually get really quite good performance on a Mac.
On a CPU Linux or Windows machine, it will be a bit slower. But if you've
got like a gaming rig at home running Windows, then you can install WSL2, like Windows subsystem
for Linux and Docker, and then the whole thing does work. Yeah, it's pretty cool. That is awesome.
So yeah. Or you can dual boot and have a Linux machine with ZFS on it. Yes. That is another option.
That's really cool.
So how does that work?
It's like a desktop or a library.
I guess I'm trying to figure out.
So in the case of OpenAI, they give you a key
and you're calling into their server with their key
and that's how they do billing and all of that.
In your case, how does that work?
Is it, I guess, also a key? You get the key from helix.ml?
So we use a Docker Desktop-style license, so if you have less than 10 million dollars of annual revenue, then you can use us for free.
Oh, perfect.
You just download it from the website, and it's an install script, so you just run our installer
that creates a Docker Compose file
you run Docker Compose up and then you
look at localhost in the browser
and the whole thing is running there
but that means of course you can then also
deploy that if you're a company
you can deploy that on your own internal infrastructure
and that's where the DevOps piece comes in
we also support Kubernetes
so you can run the whole stack on Kubernetes
and yeah we use Kubernetes We also support Kubernetes, so you can run the whole stack on Kubernetes.
And yeah, we might use Kubernetes with a bunch of GPUs in production for our own service.
Very cool.
Yeah, this is fascinating.
Okay, so maybe we'll wrap up with like, what's one last piece of advice for, let's say someone's listening right now and they are completely infatuated with this.
They want to get into MLOps.
We talked a little bit about things to study and all that,
but in terms of maybe more of a mindset
or kind of lessons you've learned on the soft skills side,
what are some advice that you can give to folks out there
who want to get into this field?
Yeah, I mean, I would recommend
joining the MLOps community
or other communities
to have peers to talk to about it.
And then, yeah, I would say
start playing with the technology.
Like I said earlier,
if you want to get into ML Ops,
like play with Python, Git, Docker, PyTorch,
that kind of thing.
If you want to get into this newer field of LLM Ops,
like how you manage RAG and API integrations
kind of on the other side of the LLM API boundary,
if that makes sense,
then download Ollama and spin up an LLM locally.
If you like, like play with Helix
and set up like an API integration.
And then that'll set you up well
to be able to go into maybe like a job interview or something
and say like, I've got this experience
or to build a site project.
Yeah, totally. Amazing advice. I think we actually covered Docker in the Kubernetes episode; in hindsight we should have maybe had a dedicated Docker episode, but check out the Kubernetes episode if you haven't already. We talk about Minikube, and there are probably better things out there; that episode is a little dated now.
That's okay, Minikube's solid. Is this still the thing, or is there kind as well?
That's right. Yeah, I used kind recently. But yeah, get set up on your machine and check out HelixML, totally free for folks out there. If you're going to install
it on your work computer and you work for like Google or some company
that has a lot of revenue,
you probably need to check with your boss first.
Or just come talk to us on Discord.
Yeah, or just talk.
Yeah, exactly.
If you're a hobbyist,
just install it and get started and try things out.
Yeah, awesome.
Cool.
All right.
Thank you, Luke.
Thank you so much for your time.
It's been awesome.
Thanks for having me on.
Cool.
All right. So as part of this
three-parter, we just had Luke on to talk about what MLOps is,
to kind of get us started on that and explain his story. And now I'm really excited: we have Yuval
Fernbach here, who's the CTO of MLOps at JFrog, and he's going to explain to us more about the whole supply chain of AI software.
So thanks so much for coming on the show, Yuval.
Thank you. Thank you. And really nice to be here.
Cool.
Yeah.
It's a pleasure to be here.
Yeah. And just to recap, we are at the SwampUp conference,
which is, I think, hosted by JFrog.
Yeah, it's hosted by JFrog, and it's actually my first SwampUp, so I'm excited to be here.
That is awesome.
So before we dive into MLOps, why don't you tell us a little bit about your story?
What kind of led you to JFrog?
Sure.
So previously, before joining JFrog, actually only two months ago,
I was part of Qwak.
Qwak is an MLOps platform, and I was one of the co-founders
and the CTO of Qwak, which was acquired by JFrog at the end of June
and is now basically part of JFrog ML.
So before being part of Qwak and founding Qwak,
I was working for Amazon Web Services for five years.
I was part of the AWS machine learning services team,
basically working with AWS customers
on their challenges around machine learning.
So I've seen hundreds of customers,
hundreds of companies trying to solve that,
trying to understand how they can start iterating on machine learning,
building models, experimenting, and eventually impacting the business based on machine learning models
and AI applications. And that's basically one of the main reasons why, together with my other
co-founders, we decided to found Qwak and help those companies actually achieve that, to make
sure that they impact their business and build models, not just for the research or for the development, but actually
making that supply chain work and affecting their production as well.
Cool. So, you know, I think one really unique story I'd love to dive into is you're at AWS,
which is a very large organization, right?
Roughly how many people are there?
How many engineers at AWS?
Maybe tens of thousands?
I've been there for five years.
And in that time, it changed from a few thousands
to probably tens of thousands
by the time I actually founded Qwak.
So it changed quite a lot.
I'm not sure what are the numbers right now,
but by the way, it was amazing to see
how a company that was relatively small,
at least versus Amazon,
actually grew and became such an amazing business.
Yeah, definitely.
Yeah, I feel like the documentation,
the quality of service of AWS,
I haven't used other ones recently,
but I remember, this is years and years ago,
looking at all the different options
and just seeing just a higher level of quality for AWS.
Things just tended to work
and the documentation was really well done.
And there's an extraordinary amount of effort
and user study and things that go into that.
Yeah, I must say that I learned a lot during that time
because I've seen the company that basically founded the cloud
or the first company that created the cloud
that understood that their customers are not IT,
their customers are developers that need to build software.
And they build that as a self-service,
basically product-led development
before anyone talked about it.
And I think that's one of the reasons
why eventually we founded Qwak
is that we saw how that company grew.
There were so many services
and it became really, really difficult
for companies to actually utilize the different services.
And not just utilize different services,
but the challenges have grown. You cannot
just use EC2 and S3 anymore.
You need different high-level
solutions
to make your
product work.
And I believe that it's amazing what
AWS built.
But nowadays
I think that in many cases,
using those building blocks directly
and trying to build on top of that
is actually too much for many of the companies.
And actually, the amount of investment
that you need to do to actually do that,
the amount of engineering work that you need to do
to build, for example, an ML solution,
is just too much for many of the companies.
And it's really difficult to do
that. Yeah, that makes sense. I kind of experienced this firsthand where you can stand up things with
the web UI, you know, and so it's like, I want an EC2 instance, I want a database. But then at some
point, you need to programmatically do that. It's like, oh, I need a beta database, and the beta database
needs to have all the same things, and you just forget what you clicked on a month ago.
So you need this infrastructure as code, and then that's a rabbit hole, and it's hard to hire people
who have that talent, and so it becomes difficult.
Yeah, yeah. It's actually amazing to see
how companies have grown to a place where they understand that it's not single-point solutions.
You need to create platforms.
So, for example, going to the software supply chain,
understanding how a software supply chain looks like,
it's not just a CI, CD.
It's your source code and how you manage your binaries
and then how the security infrastructure looks like on top
and then why software engineering is different than data science
and how it's all connected to the same supply chain.
And eventually, this challenge is not just, you know,
spin-up instances and databases and writing code in Git.
It's way more than that.
It's having a system that can actually work in scale
and allow your company to be efficient
both in, you know, once you have like two, three, five,
10 developers, but the same efficiency
once you have thousands of developers
and you want to actually make sure
that you deliver on time
and not just deliver on time,
but you delivered a trust for the products,
products that you can trust both in terms of the features
but also in terms of the security
and make sure that
they work in production
exactly the way that they should
based on the real product
spec, for example.
That makes sense. You went from
giant company to
Qwak. I'm assuming
maybe a handful of people when it started.
Yeah, sure.
Yeah, and so...
When it started, you know, it was four people,
but of course we grew over time.
Yeah, so was there sort of a shock,
you know, the moment you went to Qwak
and you're like, wait, there's no IT department.
There's just me, you know?
Like, how did you navigate that?
Because I know a lot of people who go from big to small company.
And there is a risk.
Like, there's some people where it just doesn't work out.
And I mean, that's okay.
They go back to Facebook or whatever.
But, you know, what was that experience like for you?
You know, there's always a risk.
And I think that founding a company, being one of the
co-founders or even one of the first employees
is not for everyone and that's great
because I think that the challenges
that you have doing that are different
challenges, different experiences
than working with a giant
company like Amazon.
So I personally
love it. I think it's
an amazing experience.
I love the fact that it was on me.
Like, if I had a problem with my computer,
I needed to fix it.
I needed to talk with, you know,
the person that we bought the computer from.
I needed, so, you know,
it's part of, like, the experience
of founding a company,
being part of a small organization,
a startup.
So I really love it, but, of course, it's not for everyone.
And I think that one of the challenges is that you need to make sure
that you focus on the right things, the right challenges,
because there are many things to do,
but your main goal is always to build a successful company
and make sure that your customers actually get the benefits of your product.
The customers are actually happy with the solution that they get
and want to use you and, of course, recommend you to their friends and colleagues.
Yeah, that makes sense.
So maybe before we jump into MLOps, what was Qwak?
What did Qwak do, and what does Qwak continue to do through the acquisition?
So, as I said, Qwak was founded four years ago,
and we started from day one to focus on
this challenge called MLOps. So basically allowing companies
to start from the development to the
production without the need to manage multiple
solutions, multiple platforms, and
even utilizing or actually handshaking the code between different stakeholders during
that process.
So from day one, our main focus was production, how to make sure that models will work in
production, and of course, going left or going back from production
on how you can make sure that the supply chain actually works
from the development to the production
and smoothly as possible.
So we started with that.
And I think that one of the challenges with MLOps
is that it's not just around the models.
It's around the data as well.
So you need to both have that solution for data that
allows you to, again, have
data for production, have data for inference,
but also data for training and have the way
to manage that data, those features.
The same way as you
do that for models, and of course,
nowadays, the same way that you do it for
Gen AI applications,
LLMs, things that are
a bit different by nature, but eventually are based on the same infrastructure.
Yeah, and if you keep the same held-out questions and answers to evaluate the model, well, it's going to be pretty, pretty good. That might sound obvious, but it's actually really difficult to not accidentally leak the label. It's so difficult to keep
your evaluation set sacrosanct
and keep it from accidentally,
especially when you're doing aggregations
at so many levels,
from accidentally sort of poisoning the well.
I fully agree.
And I think that like having the proper data platform
for AI and machine learning applications
is actually crucial
because it's not just making sure that you manage the
data set right and understand what's the difference between a training data set and
evaluation data set. And I agree that's crucial because that's the only way for you
to understand the metric of a model. It's also
understanding that the data that you build the model on is actually available
during prediction, during inference.
So, and by the way, this is
I think one of the reasons that
many ML projects fail is
because their scientists are doing research
on the data. They get the data
from the data warehouse.
They train the model, but eventually
that data doesn't exist during
prediction. It doesn't exist during inference.
And many projects fail because they created an amazing model,
but they don't have this data when the application needs it,
when the application calls the model.
The application doesn't know, I don't know,
the history about that specific persona during that time
and cannot get that because maybe it wasn't calculated yet.
And one of the things that we started with in Qwak
is to create a feature platform
that allowed to create both an offline feature store
and an online feature store
that are basically based on the same calculation.
The online feature store lets you get a low-latency,
current value of your features,
while the offline feature store lets you get
all that training data, calculated in exactly the same way.
So you know that there is no basically drift
between your offline or training data
and your inference data.
It's exactly the same.
It was calculated the same.
And the data that you are trained on
is actually available during inference.
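To make the "write the feature logic once" idea concrete, here is a hypothetical sketch in plain Python, not the actual Qwak or JFrog ML API: a single transformation function feeds both the offline training set and the online, low-latency lookup, so the two cannot drift apart.

```python
# Hypothetical sketch of "write the feature logic once": the same function
# produces the offline training rows and the online value served at inference.
def compute_features(raw: dict) -> dict:
    # The single place where the feature logic lives.
    orders = raw["recent_orders"]
    return {
        "order_count": len(orders),
        "avg_order_value": sum(orders) / len(orders) if orders else 0.0,
    }

def build_offline_training_set(history: list[dict]) -> list[dict]:
    # Offline path: batch over historical records to build training data.
    return [compute_features(rec) | {"label": rec["label"]} for rec in history]

def online_features(raw: dict) -> dict:
    # Online path: compute the same features for a single live request,
    # so training data and inference data stay consistent by construction.
    return compute_features(raw)
```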
Yeah, I mean, the story I always tell folks was
there was this product called Google+.
It died along with a whole bunch of other Google products,
but we were doing the friend suggest
and a bunch of ML stuff for Google+, a long time ago.
And there was an issue where the day-of-week feature was zero-based at training and one-based
at serving, and so Sunday didn't exist, and Saturday, I guess, also didn't exist. I don't remember exactly
what happened with Saturday. I think it was a one-hot encoding. So Sunday and Saturday both
didn't exist, and it caused all kinds of chaos, and it was extremely difficult to find, because you just
would give bad suggestions on the weekend. And there's no compiler error, there's no
easy way to find it; they're written in two different languages, you know, the training and the serving
system. So yeah, I think there was a product, Feast, which was a very early attempt at this.
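A tiny illustration of that kind of skew, using only Python's standard library (the original Google+ code isn't public, so this is purely a made-up example): training uses a zero-based day index while serving uses a one-based one, and one day silently falls out of the one-hot vector.

```python
# Illustrative only: a zero-based day index at training time versus a
# one-based index at serving time silently shifts the one-hot encoding.
from datetime import date

def one_hot_day(index: int) -> list[int]:
    vec = [0] * 7
    if 0 <= index < 7:          # out-of-range indexes are silently dropped
        vec[index] = 1
    return vec

d = date(2024, 9, 22)           # a Sunday
training_feature = one_hot_day(d.weekday())      # weekday(): Mon=0 .. Sun=6
serving_feature = one_hot_day(d.isoweekday())    # isoweekday(): Mon=1 .. Sun=7

print(training_feature)  # [0, 0, 0, 0, 0, 0, 1]
print(serving_feature)   # [0, 0, 0, 0, 0, 0, 0]  -> Sunday vanishes at serving time
```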
But yeah, how has that evolved?
Is it to the point now where you can write it once
and it works on both?
Again, I believe that the feature storage
is a crucial part of an ML platform.
And Feast, by the way, is an open source that still exists.
But I think that one of the main challenges
with such solutions like Feast,
like the cloud vendor solutions
for feature storage,
that it's not just about the storage.
It's about the data pipeline as well.
You need to have a data pipeline
that eventually can connect both the offline
and the online on a single process.
So you write your code once in the same language,
and that code will be used to create the data for both. And again, it's really connected to what you just said
because in the past, many models were, for example, written in Python, but then
the production implementation was, let's say, in C or Java or
whatever, and you needed to recreate the features in a different language, and
there is no actual way to compare the two and understand that the features
were created the same.
So many companies had the same challenge
that you just talked about of having that,
you know, that difference between the production
and the training.
And again, it's the same challenge with a feature store
that are only the storage level, like Feast and like others,
because you need to have that same data processing layer that will be used to create both features.
And I must say that part of it is that Python now actually
grew enough to be good enough for production, because
in many cases, it's not really Python production. You build a model in Python, but then
eventually it's a C object that's running in production,
although the code is in Python.
So you can actually use the same code
and get pretty good performance.
So that's, of course, part of the technology enhancement
that happened during those years.
Oh, very cool.
So a lot of projects start with Jupyter Notebook,
which is kind of itself kind of this extension
or descendant of Mathematica,
right? Mathematica notebook, which is this beautiful, like, you know,
interactive thing.
And it just becomes really hard then to
productionize that, because it's not .py files, it's all these cells.
And so how do you recommend for folks
to go from that notebook to something that can run at scale?
So it's a great question because I needed to talk about supply chain. And that question eventually
pivoted me back to supply chain. And I love notebooks. I think that notebooks are great
for some things. For example, if I want to visualize data, if I want to have an interactive
environment, notebooks are amazing for that.
By the way, IDs are amazing for other things.
For example, debugging code, IDs are way better for that than notebooks.
There are advantages for each one of them,
but both are not
built for production.
Eventually, when you want to build for production,
you need to have a proper CI.
You need to have a proper supply chain
that moves that code to be an artifact
and from being an artifact, eventually deploy that.
And of course, making sure that you have a lineage
between the deployment, the artifact, the source code,
all that should be connected in a way.
So what I've seen is that notebooks are great, are amazing,
but eventually once companies
understand or graduate to the place
that they want to be to production,
they need to add more structure
to the way they build the models.
And for example, one of the things
that we've done in Quark and now with J4ML
is to have an opinionated way
of how to build models in terms of structure
that allows every new data scientist to look
at a model that they've never looked at before
and understand, okay, this is the training
of the model. This is the code for the inference.
This is the code that fetches the data
from the feature store. Like, immediately
understand the structure of a model,
immediately understand how that model was built
and be able to help
or train
a model that they haven't seen before
just by looking at the code and looking at
that structure. So I
believe that notebook,
it's part of the research,
but in some phase you need to move from
the notebook to have that
code managed, for example,
in a source version control
solution, so let's say in a Git
solution, in a way that can actually be automated.
So you need to have some kind of structure
of what the training job looks like,
what the prediction looks like,
what the dependencies are,
well, the dependencies are which packages
you need to run that model,
because packages, especially in Python,
a new version can come out and break everything
so you need to actually freeze those
dependencies to make sure that
you have that model actually reproducible
and not just
train once on a notebook on someone's computer
and will never work again
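As a rough sketch of the kind of opinionated structure Yuval describes (hypothetical, not the actual Qwak or JFrog ML SDK), every model in a repository might expose the same hooks, so anyone can find the training, inference, and feature-fetching code at a glance:

```python
# Hypothetical example of an opinionated model layout: every model in the
# repository implements the same hooks, so a new data scientist can find the
# training code, the inference code, and the feature-fetching code immediately.
class ChurnModel:
    def fetch_features(self, entity_id: str) -> dict:
        # Where the model pulls its inputs from the feature store.
        raise NotImplementedError

    def train(self, training_data) -> None:
        # Where the training logic lives; called by the automated build.
        raise NotImplementedError

    def predict(self, features: dict) -> float:
        # Where the inference logic lives; called by the serving layer.
        raise NotImplementedError

# Dependencies are pinned alongside the code (for example in requirements.txt)
# so a new package release can't silently break a rebuild:
#   scikit-learn==1.5.1
#   pandas==2.2.2
```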
Yeah, definitely.
And you have to be able to go back as well.
Like, we did these really hacky things,
but there's got to be a principled way. Like, once you have it in source control and it's
in production, you're always going to want to do more data science, because you are going to change
the product. The product changes without you, because new customers adopt and some
churn and everything.
And so being able to run a Jupyter notebook on your production code and then make changes,
that whole integration is extremely complicated
and so valuable.
Yeah, yeah.
Models degrade over time.
That's, I think, just the way a model behaves,
because data changes over time.
So part of having
the structure, so by the way, one of
the things, for example, that we do with
JFrog ML now is that
whenever I build a model, I
automatically basically
copy or freeze the
source code that I used to build that model,
I freeze the dependencies that were used to build
that model, I create the model artifact,
the trained model, but I can
reproduce that model at any point in time.
I know which configuration I used,
I know what was the source code I used, I know
what is the data set I used, and I
think this is a practice that companies
must adopt. Companies must
make sure that
they have the ability to reproduce a model
because they will need to train that model
again. They will need to
fine-tune it. They will need to
understand, even without
model monitoring, even without
really looking at the data and
seeing the drift, it probably
happens. And if it happens,
it means that they will need to
retrain that, and they will need to have the structure
that allow them to retrain that
even if the data scientist that built that
is not available anymore,
is not in the company
or just your team grew
and you have more projects.
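As one hedged illustration of that kind of snapshot, the sketch below (plain Python standard library, not a specific vendor API, and with made-up file and field names) records the git commit, the installed package versions, a hash of the training data, and the config next to the trained artifact, so the build can be reproduced later:

```python
# Sketch: capture enough metadata at build time to reproduce a model later.
# Plain Python standard library; the file layout and field names are made up.
import hashlib
import json
import subprocess
from importlib import metadata

def snapshot_build(config: dict, dataset_path: str, out_path: str = "model_lineage.json") -> None:
    # Which source code produced this model.
    commit = subprocess.run(
        ["git", "rev-parse", "HEAD"], capture_output=True, text=True, check=True
    ).stdout.strip()
    # Which data it was trained on.
    with open(dataset_path, "rb") as f:
        dataset_hash = hashlib.sha256(f.read()).hexdigest()
    # Which dependency versions were installed at build time.
    packages = {dist.metadata["Name"]: dist.version for dist in metadata.distributions()}
    lineage = {
        "git_commit": commit,
        "dataset_sha256": dataset_hash,
        "config": config,
        "dependencies": packages,
    }
    with open(out_path, "w") as f:
        json.dump(lineage, f, indent=2)
```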
Yeah, that makes sense.
So on episode 158,
we actually had Bill Manning from JFrog
come on the show and talk about software supply chains.
And so folks should definitely listen to that episode if you haven't already.
So what's the delta between that and AI supply chains?
How does AI kind of make that problem different?
So, first of all, the process itself looks different.
Like, an ML
project starts from research,
it starts from experiments, you need
to track those experiments. Those are phases
that do not exist with
software. Usually, when you build
software, it happens because of two reasons.
First, you have a new feature, and
second, you have a bug. So, those are the reasons
why you start working on a software project.
But with ML, it's also because your data has changed.
It's also because you have more features
and you want to make that model better.
So there are many reasons why to work on an ML project,
and it always starts with some kind of research experiment.
And those phases look entirely different
in terms of ops, of MLOps.
Same for the deployed model
because the monitoring that you do for software
is only the infrastructure monitoring,
maybe logs analysis,
maybe understanding if there are bugs
and what are the latencies of different processes
and those kinds of things.
But with models, you actually need to monitor
the data as well.
You need to monitor the model and make sure that this model
actually gives the right impact on the business.
Make sure that this model doesn't degrade over time.
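A minimal, hand-rolled sketch of what "monitor the data" can mean in practice (real systems use more robust statistics, and the thresholds here are arbitrary examples): compare a feature's recent production values against its training distribution and flag a shift.

```python
# Minimal drift check: compare the mean of a feature in production against its
# training distribution. The 3-sigma threshold is an arbitrary example.
from statistics import mean, pstdev

def feature_drifted(training_values: list[float], live_values: list[float],
                    z_threshold: float = 3.0) -> bool:
    mu, sigma = mean(training_values), pstdev(training_values)
    if sigma == 0:
        return mean(live_values) != mu
    # Flag drift when the live mean sits far outside the training distribution.
    return abs(mean(live_values) - mu) / sigma > z_threshold

# Example: alert (or trigger retraining) when the check fires.
if feature_drifted([10.0, 12.0, 11.5, 9.8], [25.0, 27.1, 26.4]):
    print("Feature drift detected: consider retraining the model.")
```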
So the process itself looks a bit different,
but eventually what happens in the software supply chain process
is that you have code, then you create artifacts,
and then you deploy those artifacts.
And those parts are the same with AI, with ML.
Maybe some configurations are different.
Maybe, for example, nowadays with AI,
you have also prompts as part of your artifact.
Maybe you have an external model like, I don't know,
ChatGPT or something like that, and there is a new version.
And it's not something that you manage,
but it's still an API that you call,
it's still an application that you need to manage.
So eventually it's the same artifacts behind the scenes,
same idea, but different processes.
And even the security posture,
even the understanding if a model is trusted enough,
is trustworthy, that's pretty much the same
as any other kind of software.
So if you build software and you scan that software,
and you scan the dependencies,
and you scan your runtime environment,
and you want to make sure that the same software
that you build doesn't have vulnerabilities at runtime,
and you scan it from the source code
and up to the runtime
and understand what your security posture looks like,
it should look the same with machine learning
because eventually those are packages, those are artifacts and third-party APIs that need to be
secured, and you need to make sure that you manage that lifecycle in a way that you trust.
Yeah, that makes sense. This was awesome. Thank you so much for your time. This was really great. One
last thing: for students who want to get
into this field, what's some advice you could
give them?
So, first of all,
it's an amazing field, so do it.
It's great advice. But
second is that I think
nowadays with all
the Gen AI models and LMs,
the entrance point
is actually easier than
in the past. Like in the past, you needed
to understand the statistics around the data
and you needed to add this data to
start training models. And there was quite
a lot of understanding, quite a lot of
work you needed to do to actually have a model
ready. And nowadays you can
start understanding that
just by using, for example, ChatGPT.
I think that as an example,
my father only now understands what I do
because he understands ChatGPT.
He saw what it looks like and what the effects of models are.
So I believe that this is a great starting point
for every student that wants to start learning
about machine learning, about AI: play with those tools.
Understand how a specific prompt change
changes the way the model reacts,
and eventually, those are models
that are maybe more complicated
than the models that we've done five years ago
that we use for specific point solutions.
But eventually, those are models
that use the same technologies behind the scenes.
And understanding how those models behave
will give you quite a lot of
knowledge about
what the development looks like
and how you can utilize machine learning
for every kind of task.
Very cool. Thank you so much, Yuval.
I really appreciate your time.
Thank you very much.
All right, everyone.
So we have been talking in this episode
about MLOps. And I'm really lucky that we have Stephen Chin on the phone, who's done a lot of really interesting work with this. And we're going to focus on LLMs and GraphRAG as kind of two case studies of getting ML and AI implemented, and then what that whole process is like,
and then getting all of that into the hands of customers. So welcome, Stephen. Thanks for coming
on the show. No, very glad to be here and excited that I'm speaking at SwampUp actually on the same
topic. So I'm going to be talking a bit about knowledge graphs plus LLMs, and in
particular, how you can apply them to your DevOps pipeline. Cool. So before we jump into all of that,
why don't you give us a quick background? What was your path that kind of led you to where you are
now? And what are you doing right now? Yeah, so maybe we'll start with where I am and go backwards.
So I'm VP of Developer Relations at Neo4j.
Neo4j is a graph database
company, but also does a lot
with
generative AI and
machine learning and
is building out architectures
for GraphRAG that a lot of enterprises
are using. Before
this, I was working at JFrog
for basically the same role, VP of developer relations.
And I did this at Oracle, slightly different role, but basically the same role.
I was running the developer marketing team. And the way I kind of got into developer relations
in general, because I was also a developer advocate for a bunch of years, is I made the
mistake, okay, of writing a book.
Oh, okay.
Now, I'm not discouraging anybody from writing books.
It's a wonderful way to increase your reputation, to teach and explain things which you're passionate about to a larger audience.
And when I wrote the book, I was fortunate enough to have some great co-authors
who I collaborated with on the title. It's a lot of work. You're basically giving away
six months to a year of your life, where you have no weekends, no evenings, especially if you have a
day job that you also have to keep going. And then when you finally finish the book, the publisher
says, oh, that's a wonderful book, we've released it on, you know, Amazon, bookstores, etc., but
can you help us promote it?
Oh, you become the promotion arm for the book.
Like, if you're not submitting to conferences and talking on a topic,
if you're not on social media,
if you're not like out there being a vocal advocate for the technologies you
care about.
And, you know, obviously,
pointing out that there might be a good book that folks can read for more
information,
then you're not doing your job.
Now that,
that first book I wrote,
Oh my God, it must have been
I'm dating myself, like 15 years ago.
So you knew a technology, and a publisher approached
you and said, you know, Stephen, you are really gifted in this technology, why don't you write
a book on it? That's kind of how that went down?
Yeah. And actually, when I was in college, I assumed it was the opposite.
I assumed, like, you wrote a book and you went to publishers and you said, look at this great
book that I wrote, would you want to put this onto your
brand and publish it? But no, publishers, especially tech publishers, work exactly the
opposite. They have a roadmap: these are the
titles we want to publish, these are the topics that matter to us. Occasionally you can influence that,
but not until you're an established author and you have really good relationships with the editorial
group. And they then say, okay, for these titles we want
to author, who would be a good
candidate? Like, who has the expertise, who can authoritatively write a book and help us
market it later on?
So you had the footprint then that they were able to find you. So you had
done some kind of promotion to get to that point?
Yeah. So, I mean, it was for JavaFX technology,
and I got in really early, like beta days.
I was already building applications with it,
had a lot of good connections in the JavaFX community,
and I got invited with somebody else who'd written another book
with the same publisher, and now this was the first real JavaFX book.
Now, fast forward 15 years later, and we got asked to update basically the same title
for Java 21 and 23
with all the latest features and capabilities.
So it's become, for this small niche market, JavaFX,
which I don't actually do professionally anymore,
but I still keep up on it,
and I'm very close with the community.
This has basically become the authoritative guide
for client developers in Java.
Wow. I'm totally dating myself here,
but the last time I professionally wrote Java,
it was this thing called Google Web Toolkit.
Oh, yeah.
And you would create JavaScript with Java.
This was one of these, in hindsight, in my opinion, really bad ideas.
I mean, maybe not.
And I was writing, I think, in Java 6 or something,
and it was compiled to JavaScript.
It's kind of a really strange...
I mean, that was my first introduction to Java.
Yeah, GWT was kind of an interesting approach.
I think we've come a long
way there with JavaScript frameworks
which do all the
heavy lifting. So basically you can
have a very feature-rich
application without a heavyweight
backend. So I think that's
become the modern development framework.
GWT was an attempt to
do that,
have the heavy front ends,
but then have you write it entirely in Java and then deploy all that JavaScript magically
and have the web application just appear.
It's technically difficult to do that perfectly,
and therefore it couldn't keep up
with modern JavaScript frameworks.
Yeah, and then on the other end, it really got
squeezed by Node because the advantage
of GWT was, for example,
you could write validators
in Java and validate the client
side and the server side.
But now you can do that with JavaScript.
And again, for reference,
the Google team
accepted and endorsed the use of both "G-W-T" and "gwit" for the
same acronym, so they would use both.
Ah, yeah, they never agreed upon what it was supposed to be. That makes sense.
Cool. So, okay, fast-forwarding all the way to current time. Before we
dive into LLMs, let's talk about this a little bit. So what is a developer relations advocate, and someone who leads a team of advocates?
What is that job?
Because we're going to dive into a lot of technical content here.
And when people think of public relations, they think of speeches and writing speeches
for candidates, these kind of things.
But developer
relations, you know, you're actually building a lot. So why don't you kind of give people a little
bit of scaffold there? Yeah, okay. So for those folks who don't know the job or role of a developer
advocate: basically, a developer advocate is somebody who is advocating for developers, both in the product,
so if you work for a company, it's like saying,
hey, we have these users
and they're trying to do things with their product,
like let's actually help them out and build features
which are going to be beneficial to them.
But then also educating developers
about new ways of doing things, new techniques,
and kind of upskilling, helping them upskill.
And the way I would describe a developer advocate,
the most simple description is a developer advocate
is a geek with social skills.
So you have to be technical, you have to be able to write code,
be up with the latest technologies and trends,
and constantly learning, like picking up new
technologies. But you also need to be able to present, able to do interviews like this, to
kind of be very fluent. So strong English skills, strong presentation skills, those are all
really important. And you can actually come up to be a developer advocate
from either a highly technical role,
like a lot of developer advocates start their career
as programmers and architects,
and they get tired of just building things.
They want to move up and actually be the change agent
for industry and for folks who are adopting technology.
So it's a great career path.
Like, what do you
do when you're tired of just being the most technical person at
your company?
On the other hand,
you can also become a developer advocate by having great social skills and
great kind of English language presentation skills.
And one of the folks who I just hired,
their name is Naya Macklin,
also a speaker at SwampUp,
got accepted before even joining my company.
And they came up through a background of journalism
and politics, kind of being involved in political campaigns,
being involved in kind of all of that writing,
outreach, campaigning, and wanted to move
to a more technical role, to something which was more technical. So they went
to a boot camp, kind of learned Python, JavaScript, all the basic skills, you know, all
the way up to, you know, building web applications, like deploying technologies. They were developer
advocate at Couchbase,
now a developer advocate at Neo4j,
and very early in their career.
I would say that to be a developer advocate is not something that you have to be old and gray
to be a good developer advocate.
Some of the best developer advocates,
actually most of the best developer advocates,
mirror the audience.
So if you're talking to a technical audience, you want to be
able to just talk to them as peers. If you're going to a university and speaking to them,
you want to be able to talk as a recent graduate. Kind of like being very close to the audience,
I think, makes it more credible and makes you more effective in the role.
Yeah, that makes sense. Very cool. Yeah, so folks out there, this is one of many
really interesting professions that we're going to learn about here in these sessions.
So if you have any questions about this or any other professions, you know, don't hesitate to shoot emails to us, post in the Discord.
Feel free to keep that conversation going. There's a lot going on in the Discord, and feel free to join and be a part of that. Okay, so let's dive into LLMs. So LLM
stands for large language model. I think now they're starting to call them foundation models,
because you have vision and all these other modalities. But, you know, what is a large language model? How would you describe that to
somebody?
Yes, I think that this, in the past, has kind of required a technical explanation
of, like, you know, how you train the models, and then how you can actually
build the models to learn by feeding them extremely large data sets and then having them kind of iteratively complete
the next word vector, or kind of the next idea in a chain of commands. Now, actually, it's much
easier to explain this now, because everyone's using it.
That's true, yeah.
So if you're using ChatGPT, if you're using Copilot, if you're using any of these tools which kind of give you
a language interface to talk to and interact with your code,
with the web, with an enterprise dataset,
then you are using an LLM behind the scenes.
And you can see that it's a very effective tool for a lot of tasks,
which for humans are time-intensive and require a lot of knowledge entry,
which require a lot of knowledge gaining.
So it's great for summarizing information,
great for writing emails.
Excellent.
Don't recommend this at home
for writing research papers.
Not research papers,
like class papers.
Things which the professors say,
oh, research this subject
and then write a two or three page paper.
It's amazing at that.
But of course,
completely against most schools' rules.
That's right.
Don't violate your school policy.
It's really good at portmanteaus.
You can say, give me a portmanteau of these two concepts,
and you will come up with some incredibly brilliant names of companies.
It's really good at that.
Now, I would say what LLMs are poor at, or don't have a strong ability for,
is, in general, they don't reason like we do. So if the source material, if the context,
is rich enough to kind of piece together and give clues on what the answer is, it can both
kind of pull from that body of knowledge, but then it can
also synthesize information, which maybe is not clear from the very large, like they feed these
systems with hundreds of billions of words and like huge unstructured document sets. And so it
does kind of these amazing leaps, which seem like
reasoning, but they're not actually reasoning as humans think and reason. And also, a related
thing which they're poor at is math. So in general, you know, they work as
simple calculators and can answer basic math questions, because, again,
that all exists in the source material, or they can specifically train the models and add in things
for common questions which get asked. So they do a good job of calculating and returning a result
via going to an agent or some other system which is specialized, but in general they're not designed
for math. And if you give them a complex problem, like, I was playing around with one of
these kind of online
LLM games where you're supposed to trick and
hack the LLM.
Oh, that's a thing?
And it was kind of fun
like they set it up as
the LLM thought it was
a wizard,
and like it was protecting
some secrets, and you were supposed
to convince
it to give you the secrets. And basically they had multiple levels of difficulty, where they add
additional prompts which would prevent you from doing attacks which will allow you to circumvent
the LLM. But basically the way to hack it is you did a combination of reasoning
and hard math problems.
So you'd ask it to,
for example, do like a ROT13
algorithm or something complex,
moderately complex,
and
basically you ask it to give you an answer,
apply a math algorithm to it,
and then the system which is checking the answer
to make sure it's not revealing secrets now can't check the answer properly, because you've encrypted it.
Ah, I see.
But what you learn, when you keep giving it harder and harder and harder math
challenges, is it actually falls apart. Like, for example, with rotation ciphers, it'll consistently
get the first few letters right, and it gets worse and worse and worse as it goes
along because it gets lazy and it doesn't
really care about the answer.
I noticed if you tell it it's wrong,
it'll get better. It's almost kind of
like reinforcing
an animal or something. It's like,
oh, I'm so sorry, you're right, and it'll get a little
bit better.
And then the ultimate answer was to
actually give it small snippets of code
to generate or compile.
Because again, the web is full of so much code.
And those systems are also tuned for doing a certain amount of code challenges and problems.
So the hardest reasoning problem you can give an LLM actually is to do coding.
So you combine that with, like, asking for some information, or like trying to hide some information,
and you can actually trick it to do quite amazing things.
Oh, interesting. Okay. So, you know, I think when people
think of LLMs, a lot of people have used ChatGPT, Perplexity, these other things, and that's a pretty
straightforward, from like a product standpoint, use case, where you go to chatgpt.com
and it's literally just a blank screen
with a text box.
You type what you want.
And so it's the simplest product you can make.
And it's because their technology is really impressive
and they just want to focus on that.
What are some other places,
you know, maybe less obvious
where LLMs are being used in the real world?
Yeah, so I was mentioning things which LLMs aren't good at, and an additional one is, since they're
trained on mostly public data, they really know nothing about enterprise systems.
Right, nothing about, ask it about your email, things in your email, for example.
Yeah, yeah. And similarly, if you're
in a company which does, you know, supply chain management of parts and things for
aircraft, you can't ask it about, like, what's the part number, or what parts do you
need for a certain maintenance operation on a plane, because that's just not something that's
generally available to LLMs to be trained on.
So one of the techniques you can do with LLMs is you can do something called retrieval augmented generation,
where you feed in a body of this additional knowledge, additional information from,
could be from a database, from a bunch of documents, from some other source which is not public.
You embed it in a vector database.
And then when you query the LLM,
you first query the vector database, the vector store.
You ask it for information which relates to this. You pass that on to the LLM as context.
And now the LLM,
which is very, very good
at answering abstract questions,
now becomes an expert
in this new data set
and this new knowledge set.
But with kind of the same limitations
where it's not good at reasoning,
it's not good at a whole bunch of things.
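Here is a hedged, minimal sketch of that retrieval-augmented generation flow. The embed and ask_llm functions are stand-ins for whatever embedding model and LLM endpoint you actually use; only the overall shape (embed, retrieve the closest documents, stuff them into the prompt) is the point.

```python
# Minimal RAG shape: embed private documents, retrieve the closest ones for a
# question, and pass them to the LLM as context. `embed` and `ask_llm` are
# placeholders for a real embedding model and a real LLM call.
from math import sqrt

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(question: str, docs: list[str], embed, top_k: int = 3) -> list[str]:
    # Rank documents by similarity to the question and keep the best matches.
    q_vec = embed(question)
    scored = sorted(docs, key=lambda d: cosine(embed(d), q_vec), reverse=True)
    return scored[:top_k]

def answer_with_rag(question: str, docs: list[str], embed, ask_llm) -> str:
    # Build the prompt from the retrieved context and let the LLM answer.
    context = "\n\n".join(retrieve(question, docs, embed))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return ask_llm(prompt)
```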
And this produces another problem,
which is,
So getting back to the aircraft
example, let's say I'm a technician, I need to perform maintenance on the fuselage
of an aircraft, and I want to know, for this particular aircraft model, what part
number do I need to do this repair which I'm trying to execute. Now, if it's not exactly in the source
material, if
maybe the relationship
between the maintenance operations
and the parts isn't clearly spelled out,
the LLM won't say,
oh, I don't have that information.
You should maybe talk to somebody
who's done this before. What it's going to say is
it's going to say, oh,
well, if you're doing this sort of operation,
there's a number of different things which you could use for this.
And I'll give you a list of different parts which are totally plausible.
And I'll tell you any of these would work.
Just go ahead and...
Oh, man.
I think you're explaining a lot about like Boeing doors falling off airplanes.
I hope they're not using LLMs for their maintenance on Boeing planes.
But you can see that
it's for enterprise use.
You can actually get pretty far
with an LLM, but
it has problems with
what we're currently referring to as
hallucinations.
Basically, the model will
extrapolate information, which
may or may not be true.
And depending upon how good the source information is,
how good the encoding is,
you can get to some customer support systems,
they get to maybe 60%, 70% accuracy
or accepted answers
when a customer support representative looks at this
and says, is this the right answer to give a customer?
But the question is, when you get to that accuracy,
let's say you get to 80% or 90% accuracy,
is that good enough for aircraft maintenance,
for supply chain management, for fraud detection?
There's a whole bunch of critical use cases.
Is that good enough? Is that accuracy good enough? And then the second question is, when it's wrong, how do I know it's wrong? How do I
check the system? How do I explain how it got the results? So this is actually the
subject of my talk, which is, there's another way which you can encode enterprise information, which has been in use for a while.
It's called knowledge graphs or property graphs, basically the same thing.
And a lot of expert systems will use knowledge graphs as the system of truth for capturing
data, kind of building expert systems.
But the typical problem with those is to build an expert system
on top of a knowledge graph, you need an interface.
So you need to build an application,
you have to have a bunch of queries and drop-down menus
and things for people to find information.
So wouldn't it be nice if you could ask the LLM,
but then have it give you information from a knowledge graph
instead of just randomly pulling
information out of a vector store? And so this technique for pairing knowledge graphs and
LLMs together is called GraphRAG. It's a really effective way of improving the accuracy of your
results. There was a study recently done, I believe by Gartner, where they
showed a 54% increase in accuracy just by switching to knowledge graphs versus, like, traditional vector
stores.
Oh, okay.
It makes it easier to explain the results, because part of the knowledge graph
gets passed into the LLM's context, and knowledge graphs, unlike vector databases, you
can actually reason about them.
You can see what are the nodes, what are the relationships.
You can actually start to understand why the LLM was giving a correct or incorrect answer, and then go back to the source data and fix it.
So is the LLM translating your English request into a knowledge graph query?
Is that how it's working?
Or is it just more fundamentally integrated with the knowledge graph?
Yeah, so it's doing
a couple things.
So when
you take the source material and you put it
inside of a graph
database that also supports vector search,
it's
doing a standard vector encoding
using word vectors of the
embeddings. It's also
using an LLM to create a
graph. And you can hand-make a knowledge graph, you can also tweak the
resulting knowledge graph, but a really quick way of doing this is to actually use an LLM to
generate the knowledge graph off the source material as well. And once you have a knowledge graph plus the vector database and linkages between
them, now you can
basically feed
the question into the vector store,
get back some
embeddings,
see what nodes they're
associated with and pull the related
nodes, and then feed that
all as context into the LLM.
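A rough sketch of that retrieval flow, with the caveat that the Cypher, the node labels, and the vector-search step below are illustrative assumptions rather than Neo4j's actual GraphRAG tooling: find seed nodes for the question, expand their neighbors in the graph, and hand both to the LLM as context.

```python
# Illustrative GraphRAG retrieval: seed nodes come from a vector search
# (represented here as a plain function), then the graph is expanded around
# them and the resulting facts become LLM context. The property names and
# Cypher are hypothetical, not a specific product schema.
from neo4j import GraphDatabase

def graph_context(driver, seed_ids: list[str], limit: int = 50) -> list[str]:
    query = (
        "MATCH (n) WHERE n.id IN $ids "
        "MATCH (n)-[r]-(m) "
        "RETURN n.name AS source, type(r) AS rel, m.name AS target LIMIT $limit"
    )
    with driver.session() as session:
        records = session.run(query, ids=seed_ids, limit=limit)
        return [f"{rec['source']} -{rec['rel']}-> {rec['target']}" for rec in records]

def answer_with_graph_rag(question, vector_search, driver, ask_llm) -> str:
    seed_ids = vector_search(question)            # stand-in for the embedding lookup
    facts = "\n".join(graph_context(driver, seed_ids))
    return ask_llm(f"Context:\n{facts}\n\nQuestion: {question}")

# driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
```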
And so basically it's a better
vector search, because it's vector search which includes knowledge graphs and real data coming
from the LLM.
Yeah, that is super cool. Okay, so you've had folks deploy GraphRAG in, you know,
different industries. Tell us maybe a good story,
bad story kind of thing.
What's something
maybe hilarious that came out of it?
And what's also something where at scale
you had an aha moment or a eureka
moment?
I think a lot of our customers
are using GraphRAG and
graph databases specifically to solve the problem
of getting higher accuracy on RAG.
Some of our customers are using it for,
in particular, customer service systems.
That's one of the common scenarios.
A second one in general is recommendation engines.
Oh, that makes sense.
So trying to give back better
search results, better recommendations
to end users.
And actually, we had an interesting
use case
where
this is a different
use of LLMs. It still
generally falls under GraphRAG, but it's
more of a research system than it is
a query system
where one of our customers maintains oil fields and kind of that oil infrastructure,
and, you know, making sure that the supply chain is uninterrupted is important for them,
but there's so many data points in terms of
conditions and maintenance issues and weather conditions that it becomes very hard to even
understand root cause analysis on why things are getting delayed or slowed down. So what they did
is they fed huge amounts of data into an LLM, using a graph representation of this,
using a graph store,
and then had the LLM start to reason
and give some potential answers
about where the issues were with outages
or supply chain issues or things in there,
and got some interesting insights.
Now, it's a humongous model, very slow to load up the massive quantities of data and
things which they did.
But from a research standpoint, they got some really valuable insights, which would have
required a humongous amount of human manpower and research to actually go through the data
and build those insights.
So I think there's a variety of different use cases for it.
And I'd say the biggest challenge,
and those are all successful production-ready systems
I talked about, but the biggest challenge
and the elephant in the room,
related to LLMs and, in general,
Gen AI architectures,
is it's all startups and very little is in production.
So when you actually get down to it and you're like,
okay, well, how many people are using LangChain or Ollama
or these models in production?
And they're like, oh, we have a really promising system
and we're getting the accuracy up and we're
almost there. But I think a lot
of folks are almost there and
are kind of looking for
the technology to mature.
And also
for the cost to be reasonable
as well. I think that
it doesn't...
Doing research and doing development
in LLMs, it makes sense, because
the technology is not that expensive to prototype with. When you're doing it at scale, with large volumes of
data, there's a huge amount of resources and processing and GPUs which you need to then execute
the building of knowledge graphs, or in general the building of RAG and search on RAG,
that can be quite expensive.
So I think also enterprise use cases are a great place
where the benefit-cost trade-off makes a lot of sense.
If you're just doing like this for a general consumer system,
you don't want everyone just firing off queries randomly
unless you have a huge amount of capital like OpenAI.
But if you're in a corporate system where it's customers doing queries
and they're solving business problems, then it makes a lot of sense
that I can pay the token fees and everything,
but then it's saving me and my customers time and money
because we're able to get answers faster and easier
than we would if we were going
through a human workforce.
That makes sense. And so just kind of one last question on the theme
of MLOps: if someone wants to go into the MLOps kind of profession,
what is your best advice for them? Let's say they're
just finishing high school, and they have a choice between
getting a four-year degree or going into a boot camp. Should they try and get a degree really
specific on, you know, IT ops, or just a general computer science or even a general math
degree? How do you feel folks should navigate that?
Yeah, no, I think there are
a lot of great options for technology degrees. And despite what folks say,
including Shlomi Ben Haim in the keynote at SwampUp, developer jobs aren't at risk.
I think that AI technology continues to build
and create new opportunities and new technical challenges
that you need really smart people to solve.
And my best advice for folks would be
the real skill set you need to learn in college
is how to solve hard problems.
And so
if you feel challenged, if you're in a
degree program or if you're taking
something where you feel like
you're learning new material,
you're able to
use
your expertise to solve problems,
to reason about things, to make
a difference. That skill set, the ability to pick up a new challenge, reason about
it, LLMs don't reason, humans do, right, and actually come up with some creative solutions,
that's what's going to be the valuable skill set going forward. And maybe we're not going to be sitting there
and coding basic algorithms for doing list sorting
and all that stuff in the future.
I mean, hopefully people aren't doing that
other than, like, Programming 101,
because machines are better at that
and they're better at optimizing those things than we are.
But hopefully we're the ones
who are actually taking the real world problems
figuring out how to solve the hard problems. And actually, MLOps is a great example of this, because
figuring out how to observe, how to secure, and how to make sure that machine learning models
are being taken from the developer all the way through to production is a hard problem, and a
machine isn't going to solve this for us. This is something for which you need people who understand the
problem, understand the space, and can do it. Actually, even my talk here at SwampUp is about
using one of the tools in that tool chain, Artifactory, which you can use as a model repository,
pulling information out of it, feeding it into a knowledge graph
using the Neo4j Knowledge Graph Builder.
And in a couple hours on a weekend or an afternoon,
you can basically have your own LLM
with enterprise information
from your machine learning pipeline
and then start asking questions like,
oh, which dependencies have an MIT license on them?
Or what's the latest version of this library?
You can start asking your own little system
these questions.
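For instance, the kind of query such a system might end up running under the hood could look like the sketch below, with the caveat that the node labels and relationship names are a hypothetical schema made up for illustration, not the actual Knowledge Graph Builder output:

```python
# Hypothetical: once artifact metadata lives in a graph, "which dependencies
# have an MIT license?" becomes a simple Cypher query. The schema (labels and
# relationship names) is made up for illustration.
from neo4j import GraphDatabase

MIT_DEPS_QUERY = """
MATCH (a:Artifact)-[:DEPENDS_ON]->(d:Dependency)-[:HAS_LICENSE]->(l:License {name: 'MIT'})
RETURN a.name AS artifact, d.name AS dependency, d.version AS version
"""

def mit_licensed_dependencies(uri: str, user: str, password: str) -> list[dict]:
    # Connect, run the query, and return each row as a plain dict.
    with GraphDatabase.driver(uri, auth=(user, password)) as driver:
        with driver.session() as session:
            return [record.data() for record in session.run(MIT_DEPS_QUERY)]
```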
Cool.
So I think that it's an exciting time
for folks who are in technology professions
because if you're able to learn
to reason and to create and understand new problems and new challenges, then you'll have pretty much work no matter what field or degree program you go into.
Totally agree. Cool. It's awesome having you here at SwampUp here in Austin, and
I look forward to chatting with you later on. Thank you so much for entertaining the folks.
Yeah, no, thanks for having me on the show, and I'd love to join the discussion on Discord as well.
Cool, that'd be great. We'll look forward to it. Thank you.