Tech Over Tea - Docker Model Runner Is Awesome | Eric Curtin

Episode Date: November 7, 2025

Today we have Eric Curtin from the Docker Model Runner project on the show the Docker projects attempt at bringing the ease of Docker to running AI models on your local hardware or deployed as a servi...ce.==========Support The Channel==========► Patreon: https://www.patreon.com/brodierobertson► Paypal: https://www.paypal.me/BrodieRobertsonVideo► Amazon USA: https://amzn.to/3d5gykF► Other Methods: https://cointr.ee/brodierobertson==========Guest Links==========Docker Site: https://www.docker.com/Announcement: https://www.docker.com/blog/introducing-docker-model-runner/Docs: https://docs.docker.com/ai/model-runner/==========Support The Show==========► Patreon: https://www.patreon.com/brodierobertson► Paypal: https://www.paypal.me/BrodieRobertsonVideo► Amazon USA: https://amzn.to/3d5gykF► Other Methods: https://cointr.ee/brodierobertson=========Video Platforms==========🎥 YouTube: https://www.youtube.com/channel/UCBq5p-xOla8xhnrbhu8AIAg=========Audio Release=========🎵 RSS: https://anchor.fm/s/149fd51c/podcast/rss🎵 Apple Podcast:https://podcasts.apple.com/us/podcast/tech-over-tea/id1501727953🎵 Spotify: https://open.spotify.com/show/3IfFpfzlLo7OPsEnl4gbdM🎵 Google Podcast: https://www.google.com/podcasts?feed=aHR0cHM6Ly9hbmNob3IuZm0vcy8xNDlmZDUxYy9wb2RjYXN0L3Jzcw==🎵 Anchor: https://anchor.fm/tech-over-tea==========Social Media==========🎤 Discord:https://discord.gg/PkMRVn9🐦 Twitter: https://twitter.com/TechOverTeaShow📷 Instagram: https://www.instagram.com/techovertea/🌐 Mastodon:https://mastodon.social/web/accounts/1093345==========Credits==========🎨 Channel Art:All my art has was created by Supercozmanhttps://twitter.com/Supercozmanhttps://www.instagram.com/supercozman_draws/DISCLOSURE: Wherever possible I use referral links, which means if you click one of the links in this video or description and make a purchase we may receive a small commission or other compensation.

Transcript
Discussion (0)
Starting point is 00:00:00 Good morning, good day, and good evening. I'm as always your host, Brodie Robertson. And this is attempt number two at getting this show to work because we had some, we'll just say technical difficulties before. So, we only lost a minute of the episode. We'll just redo the introduction. Who are you? Tell the people what you do. Yeah, no, sounds good.
Starting point is 00:00:24 My name is Eric Curtin, so I work with Docker. I work on kind of Docker's main AI inferencing tool called Docker Model Runner. It's like a container centric way to run AI on any kind of machines using container technologies and OCI registries like Docker Model Runner for transporting AI models and that kind of thing. And also from just from the AI perspective, I work a little bit with, I work a bit with with Lama CPP, which is kind of like the premium AI inferencing engine
Starting point is 00:01:06 particularly on lower end hardware like desktop and edge devices. Well, let's sort of start at the core, like basic level and then work our way up from there. For anyone who's unaware of what Docker is and what
Starting point is 00:01:21 function that is serving, how about we just start there and then build on top of that? Yeah, sounds good. Yeah, so Docker is famous for being a container technology. It's a way of spinning up lightweight sandboxes to run all your pools in some sort of user space environments. That could be Ubuntu, Fedora, CentoS. It could not be a distro at all, literally just an application. But there's also a part, the way we use containers.
Starting point is 00:01:57 The way we use containers is we normally pull this application or user space, whatever you want to call it, from some centralized repo. The most famous one, I guess, is Docker Hub. But there are many others. There's like GitHub Container Registry. There's like Quay.io, and the list goes on and on. And the other place, those are examples of kind of public
Starting point is 00:02:27 registries. There are also many private OCI container registry, so it wouldn't be uncommon for some kind of a business to set up these container registries in their internal data centers, so they don't always have to be pulling everything over the public internet, so they save some cost there, and there's security advantages, right, because you're not reaching out to the public internet all the time. Right, and if you're building something that is only locally important. You wouldn't want to put it, you know, out on the public internet. Yeah, exactly. Yeah. And it's probably faster to pull in the internal network. Right, right, right. But this has become appealing to AI people in recent times. And there's a couple of
Starting point is 00:03:13 examples of people doing this. It's not just Docker model runner. These OCI registries are normally built on very good infrastructure. And they're normally capable of like transporting like, huge amounts of data, like gigabytes, you know, which tends to be the kind of sizes that AI models are. So we kind of, everything is a file, right? So we kind of use this format. And rather than pushing and pulling containers, we actually push and pull models to the exact same infrastructure. And that's very appealing to people, because at the end of the day, when you're pulling a model, you just need to pull a couple of gigabytes. So, I can just reuse their existing infrastructure to do the exact same thing just for AI models.
Starting point is 00:04:04 So when you say model, what is an actual AI model? What are we looking at as this thing? Obviously, it's a very complex topic. The entire episode could just be explaining. It's very possible to simplify the explanation. Like, there are different model formats, but let's just for the second argument, we'll take the one used by Lama CDP and Docker model runner called GGUF. It's very simple. A model is a file, which contains all the AI's knowledge, how it works, its architecture, everything.
Starting point is 00:04:43 So examples of models will be like Lama from meta. Google have one called Gemma. There was a famous one, a famous open source one that got popular. around a year ago called deep seek r1 and it's very simple it's a file you pull and that's the brains and knowledge everything of that model in one file and you basically load that into your inferencing engine and then you have um your kind of chatbot or it could be kind of multimodal AI which is like feeding the A model images or video or sound that's kind of going on also so where does docker model runner
Starting point is 00:05:27 sort of fit into the equation here? What is it actually doing? Yeah, that's a good question. It's doing a number of things. One of the things that kind of used to be quite difficult in the early days of AI was just configuring your hardware set up.
Starting point is 00:05:46 Like, how many layers do I want to send a GPU? Do I want to use Kuda, Rackham, Vulcan? there's a whole list of kind of GPU level abstractions you can pick from and configurations you can do. And that actually gets quite complex and it can be a bit hardware independent or hardware dependent. So one of the things it does, it kind of abstracts all that complexity away from you. So you can run open AI compatible servers, which is basically, the gold standard for AI inferencing. And then, yeah, you can spin up your local LLMs
Starting point is 00:06:32 to do like chat GPT type things or other things. The other thing that's kind of appealing is there's certain registries that have become quite popular for pulling AI models. One example would be Hugging Face. Oh. Uh, did I, did my Discord crash? What happened here? Discord? Are we having tech issues on my side now? Okay, um, give me a moment. Give me a moment.
Starting point is 00:07:10 Yeah, okay, my entire Discord crash. Lovely. Discord decided to crash, but we should be, yeah, we're recording again. Okay, so, you're talking about how there is a popular place to get AI models, that being, a hugging face and that's where I disconnected. Yeah, so hugging face is a very cool report. It's like one of the main ones from pulling for pulling AI models like meta and all the company's interest in AI. That'll be normally
Starting point is 00:07:42 one of the places they will store models. Even though Docker Hub's starting to get popular as well but that's kind of a new player. One of the things about hugging face, as far as I know, and I could be wrong because I don't work for Hugging Face but typically the only way to pull from Hugging Face is over the public internet so one of the things I at least I think is cool about Docker Model Runner is we
Starting point is 00:08:10 you can pull from Hugging Face but it like immediately stores it in kind of a container like format so once you get once you pull it via Docker Model Runner you have freedom then because once it's in that format, you can push and pull it to kind of like anywhere, any of those source via registries. Right. So you sort of, you bring it in and then it's sort of, you're not directly tied into that, that hugging face sort of ecosystem anymore. Yeah, exactly, exactly. And you might always want to pull from hugging face, but that's fine.
Starting point is 00:08:48 And that's fine if you do, that you can do that with Docker Model Runner, but we give you the freedom to push and pull to other. So, Hugging Face is like the main one. Are there other ones that people do tend to get models from? There are, um, there's another popular place, but I'm sure many of aware. I'm sure many people are aware of. Like, there's one called Olama. Olama can be a little bit controversial because one of the things they've done is GGUF is kind of standard created by LAMCPP. And registries like
Starting point is 00:09:33 Hugging Face and Docker Hub just use vanilla Lama CPP to be friendly to the community and so we're all on the same page and all reading from the same hymph sheet and you know combining our engineering effort it's having some sort of standard. I tend to steer clear from Olama in conversations because Olamo also used GDF, but they've changed
Starting point is 00:10:05 the format of GGUF recently. So it's only compatible if you use the Olama engine, which isn't kind of very community friendly, but look, that's the way they decided to do things. So I typically not to use it as a great example. Okay, no, that makes sense. So there's sort of this, I guess, over time, was it a standard that was sort of developed naturally,
Starting point is 00:10:34 or was it a standard that was, like, provided? Like, how did this format that people are using actually come about if you happen to know the origin? I do know the origin, because I know the creator. This standard was one of those. things, there wasn't like really a standardization committee or anything like that. There's a guy, Georgie.
Starting point is 00:10:56 He has a really hard to pronounce second name. But Georgie's a genius. He's the creator of Lama CPP. And his initials are GG. So
Starting point is 00:11:09 so, yeah, GGOF, the two first letters are Gigi. I'm sorry. Sorry, if you happen to see this, I don't know how to pronounce your name either. Yeah, Georgie Gorgonov, let's call it that. It's funny because I know the guy.
Starting point is 00:11:27 I should have really asked him how to pronounce his last name. But yeah, he basically created it for Lama CPP. And Lama CPP is a pretty amazing project. It's a bit technical now. That's, it can be a bit hard to use, even though it's improving. But it has amazing hardware support. and that's basically the file format they came up with and it's quite useful actually and it's quite simple because they just put everything into one file that you download so it just kind of organically
Starting point is 00:12:02 kind of became one of the standards it's not the only one but it's one of them yeah everyone just kind of agreed it kind of makes sense to have a standard that everyone works with rather than just all doing these weird different things than having this compatibility nightmare where no one can really share things around. It makes a lot more sense from an engineering perspective to just agree on this basic thing
Starting point is 00:12:25 and then try to differentiate in other ways so that the core functionality is still there. Yeah, exactly. And there are other competing model formats which are just as popular. Like there's another safe tensor form
Starting point is 00:12:43 which is typically used of VLM. But the reason what GGUF is really popular is for Lama CPP usage. So that's kind of the standard for, let's say, desktop and edge devices right now. Okay. Okay, that makes sense.
Starting point is 00:13:01 So you're mentioning the Lama CPP. What is that project specifically? Yeah, so Lama CPP is kind of like an inferencing engine. So it has an underlying library called GGML. That's kind of a very low level cross to the hardware AI library.
Starting point is 00:13:24 Let's just call it that. ML is for machine learning. And LAM CPP builds on top of that. And it's kind of focused on chatbots. But actually that's changing. That's kind of no longer true because now they're adding multimodal support.
Starting point is 00:13:43 which takes in all sorts of other kind of inputs like images or whatever um so yeah that's kind of the funny thing about lamma cpp like it was initially written to run lamma model from facebook now it runs every single model but you know projects grow and about um and going back to my point it is a really technical tool so often a lot of people use it like a library uh docker model runner would kind of be one of those. Endo Lama would be one of those, actually, at least in the past. And there's a bit of a story there.
Starting point is 00:14:19 But I kind of see it has kind of like the meza for AI. It's kind of like this low-level library that everyone goes to for AI. And it also has all those GPU back-ins integrated like Vulcan Rokam, Kuda.
Starting point is 00:14:35 There's another one called Moza, OpenCL. There's a lot. So that's sort of the the low-level library that everyone builds on and then sort of you bring in the tools that build on top of that will pick and choose which functionality they want to provide within their tool
Starting point is 00:14:51 but I guess kind of in a similar way in the video editing space to something like an FFMPEG where like pretty much every Linux video editor every Linux video player for example is basically just a wrapper around FFMPEG at its call for the most part
Starting point is 00:15:07 yeah no that's pretty accurate Mm-hmm. Mm-hmm. Now, obviously, I'm sure there's things that I added on to that, and that's definitely important to talk about here. So, what, one of the things that you did want to talk about is when Docker Model Runner first came out, there was, well, there was, the state it was in at the time, and I guess a lot of those opinions have kind of held on about what it's like and what is currently present within the project. So I don't know where you want to take this and what you want to focus on first, but I guess we can get into that. Yeah, no, this is one of the other reasons I wanted to speak with you. We're kind of doing sort of, let's call it, a relaunch of Docker Model Runner at the moment
Starting point is 00:15:57 because when Docker Model Runner was first released, they were kind of racing to get something out because they had all the ideas and they kind of just strung it together with paper clips and whatnot, just to get it out there. So when they initially released it, it was beta. It was only available on Docker desktop, which is a proprietary tool. Well, it's kind of half half. It's about 80% open source code and 20% proprietary. But anyway, it was beta and it was very limited.
Starting point is 00:16:38 in hardware support it only worked on like apple devices and in and video hardware it didn't have things like vulcan to support every GPU in the world basically and all thoughts of things like this so that was a couple of months ago actually became fully open source only like a month after that but i i don't i think we could have communicated that better but now we're kind of trying to reboot and relapse the community. So we went GA like a few weeks ago. We've been cleaning up the GitHub repo to make it more contributor friendly. We upstreamed all the patches to LAM a CPP.
Starting point is 00:17:21 We enabled lots more hardware. Tomorrow, Vulcan support is coming out. So that's, yeah, Vulcan's really nice because, yeah, you throw any integrated GPU. or AMD or whatever and it works with relatively little configuration or whatever so I suppose we're trying to
Starting point is 00:17:42 relaunch with a more open centric attitude and you know try and build a community and really make it an open source project that people feel welcome to contribute their ideas too I guess
Starting point is 00:17:56 so so was it just you said it was sort of just like held to go to the paper clips initially It was just they wanted to get something out as quickly as possible. Was it, what was the sort of environment like at the time and why was there this effort to rush it out as quickly as possible? That's a good question.
Starting point is 00:18:17 And I'm actually probably not the best person to answer that because I wasn't part of the team at the time. Fair enough. But what I've gathered from the team is they felt there was a lot of kind of tools operating in this space. So they felt like they had to get, out there quickly and I think the other thing was like just to
Starting point is 00:18:38 gather some initial feedback so they wanted to like do the bare minimum minimum even if it necessarily didn't have all the features and it wouldn't fully ironed out just get feedback from the community on places like Redis or whatever and work
Starting point is 00:18:54 from there I think that was the idea okay okay so what is I guess what is the state of it now if someone wants to go and install it today and try it out, what are they going to be presented with?
Starting point is 00:19:10 So, yeah, there's two ways of installing it. You can install Docker desktop, which, as I said earlier, that's kind of a product, a proprietary tool, but it's free for personal use, so that's one way. The other way of
Starting point is 00:19:25 installing it, there's something called Docker Engine, which is like the command line version. That's fully open source. Sometimes people call it Docker Community Edition.
Starting point is 00:19:38 So that's another popular way of installing it. I actually install it both ways depending on my machine. But yeah,
Starting point is 00:19:47 you're kind of, you're presented with a Docker like CLI syntax that people would be very familiar with. So like to run,
Starting point is 00:19:57 say, Gemma 3 from Google, it's literally Docker Model Runner, AI slash Gemma 3. Docker model push, Docker model pull, Docker model inspect, Docker model lists to list all the AI models you've downloaded into your local storage. So it's that kind of concept.
Starting point is 00:20:18 So if somebody is familiar with Docker already, it's basically just everything they already know. Yeah, exactly. And also because it runs, it runs an open AI compatible server. There are also a lot of UIs that are getting popular. So you can connect it to things like anything LLM, OpenWeb UI.
Starting point is 00:20:43 There's all these UIs that you can connect. And they sometimes give you a lot more functionality in terms of things like RAG. Rag is basically kind of like using AI with documents to supplement its knowledge. So yeah, that's another way of using it. and you just connect them together, and that can be quite useful. So you mentioned the Open AI server there.
Starting point is 00:21:09 I guess that's just the standard because they kind of, you know, they're like the big name early on. They kind of got to set the standard because they were the standard at the time. And there's a lot of tools that were built then sort of targeting what they were doing. So you kind of have to work around the standard that's effectively developed there. Yeah, yeah, this goes back to the, yeah, you're exactly right. This goes back to the point of, we were talking about the standard for the file format for
Starting point is 00:21:41 models. Yeah, chat GPT got popular and people were like, I guess that's the standard now. And that's the way things are kind of going in AI, it's moving so fast and something just gets popular and people are like, let's just go with that. Yeah, you are definitely right. Things are moving very fast. even, like, I don't have like a super close, I'm not like paying super close attention to things that are going on, but just watching from the outside and seeing, like, I saw an example of, I don't know, it might have been like mid journey or something from two years ago. And it was generating a picture of a dog on a skateboard. And I remember how bad the images were back then. You would see skateboards like clipping into things heads. They would, it was like these times. It was like these. tiny, tiny images, and where we are now, it is a night and day difference.
Starting point is 00:22:35 And I don't think anybody, I don't think anybody even paying attention to this space knew how quickly things were going to evolve. Yeah, the ball is rolling really quickly. And you even see people like enhancing AI with the use of AI. So, yeah, it's, it's moving at a, yeah, incredible pace. And I don't see it slowing down anytime soon. It might eventually, but I don't see it right now anyway. Yeah, every time I hear that we're hitting some limit, some things happening.
Starting point is 00:23:09 Like, it's, I don't know. I don't know where we're going to be. I don't, I'm not in the, in the camp or I'm going to make any sort of predictions at this point. I didn't think we're going to be here this quickly. I don't know where we're going to be six months or a year from now. We're at the point, I'm sure you've seen the recent, um, demos with SORA, where people are generating all manner of things, but a lot of, if it's a context where the video would be kind of low quality anyway, like, you know, security footage, or a doorbell webcam, like, I can't tell the difference in a lot of cases. It's gotten, like, it's scarily good. Yeah. No, I completely agree.
Starting point is 00:23:57 Those are kind of the use cases where Docker model runner can be quite useful, actually. We call those edge devices. That's kind of Lama CPP specialty. Because if you want a super accurate knowledge, the knowledge of like 20 PhD student, if you want that crazy level of knowledge, you're going to go straight to a data center, with the crazy, beefy entity at GPU, and you'll probably use something like ChatGPT.
Starting point is 00:24:29 I don't think Docker Model Runner is quite there yet, but for like edge devices, like the doorbell case with the webcam like identifying objects, we can do that with tools like Docker Model Runner now, which is kind of a really cool use case I find.
Starting point is 00:24:44 I quite enjoy it. So what are people actually using tools like this for? Because obviously when you're running a local model, the scale of what you can do with it is considerably less than if you have, you know, however many GPUs open AI happens to have at this point. This is a good question. I was only talking about this to the team, to the team recently. What are people using it for? Sometimes I find that hard to tell. Obviously, we have some telemetry we can gather via Docker Hub. I think last week we hit a record.
Starting point is 00:25:22 there was like 300,000 model pulls last week and a lot of them were Lama Lama 3 from Facebook for meta, sorry, yeah, I still call him Facebook I'm stuck in the past, but that's one of the reasons I'm trying to build a community. We can see lots of usage, but we don't really always know what people are using it for, to be honest because there's a lot of Docker model runner it's it's just a tool right it's like
Starting point is 00:25:57 there's previously there so that's one of the reasons I'm on the call we know it's being heavily used and we're not always quite sure what the usage is but if we can build a community and get people to contribute features that they care about and open issues about and describe their use cases that would be very beneficial so my honest answer is I'm not quite sure what about yourself like what do you use these tools for um i use it for all sorts of things um i i find the i i code review tools quite useful i turn that on for all my repos i create these days because the ai code review tools i find they often pick out things that that a human would just miss so i find that very useful i have used ai to write code sometimes um
Starting point is 00:26:50 but I'll be honest sometimes AI writes code for me and I'm like that's a lot of crap I don't know is it hallucinating or whatnot right but there are other times I was like oh my god that's perfect and I've ended up using 80% of the solution with lit lead edits um I read a lot of blog posts I used to tell print blog posts sometimes um even sometimes email um so yeah even though English is my native language uh sometimes I sometimes I I find it can be more articulatomy. Code view I do want to sort of touch on. I hear obviously a lot of code generation stuff, but what is it doing on the code review side? Is it sort of identifying problem areas? Like, what is it, what is it act? Like, what's an example of some way you've used it?
Starting point is 00:27:45 Yeah, so like, yeah, everything's just text at the end of the day, right? So like if you just take a patch, yeah, it, it knows the number of lines added, number of lines removed. You can't even use things kind of like, I guess we would call this rag to give it knowledge of your whole Git repository. So, yeah, everything is just a prompt that we call them prompts in the AI world. So everything is just a prompt at the end of the day. So the prompt would literally be, like, review this code, you know, critique it, point out any issues with the patch. And, yeah, it does it, it does a thing, yeah. How often do you find that actually helps you with what you're trying to do?
Starting point is 00:28:43 In terms of the code review, answer. Yeah, in terms of the code review side. I think it helps me daily One thing I will say about the code review bots is it often will leave like let's say there's a thousand lines of code It will often leave 10 comments Sometimes they're all pretty much
Starting point is 00:29:06 Us because it misunderstood Stans the code Sometimes 50% I'm like Yeah that's right I should really you take that into account and change the code. And sometimes it's 100%.
Starting point is 00:29:19 And sometimes people laugh at AI code review tools because they're kind of like, oh my God, what's stupid feedback? But for me, that's not the point. Like, it's pointing out areas that you should consider looking at. You can ignore the feedback if it's crappy, but genuinely there's lots of cases that I've seen it point out pretty critical things that I believe would have been missed by a human.
Starting point is 00:29:46 Hmm. Okay. Um, yeah, this isn't, like, I don't tend to use much in the way of this tool. And the only, for me personally, the only AI tooling I frequently use is I will generate, um, transcripts my videos and then modify them by hand when it misunderstands me because Australian accent and it thinks I'm saying words weirdly. Um, but usually it's at the point where like 90 plus percent. of what it generates is totally fine and I just need to clean up a couple of things where it's like oh you said Gnome did you mean genome? No, I didn't
Starting point is 00:30:26 mean that actually but some other times it gets it totally fine and yeah like I can see why people are gravitating towards these tools I understand the concerns that people have with them but I can see
Starting point is 00:30:44 how when you are using them in a way where it is augmenting the work you're trying to do, how it can actually benefit you? Yeah, I always view them kind of as accelerators. Like, they're not really replaced the work, but they can make you get it done a little bit faster. And the problem you spoke about there, I won't go into it because that's a detailed topic in itself, but the hard G and Gnome, which is 100% correct. there are ways to solve that problem as well
Starting point is 00:31:16 you can do something called training which we've literally been doing with Docker Model Runner recently so you can kind of coerce the model and you can basically teach it that this is wrong this is the more correct way and that's often done to fine-tune AI models
Starting point is 00:31:33 yeah training is something I've heard about but like what is the actual process of training a model like how how does that sort of go i'm actually not an expert of that i tend to just use the generic models as it is but a teammate of mine's been working on a quite recently he wrote a blog post on blocker dot com he was using tools from this cool layout startup called uh unthought with with docker tools yeah i'm i'm not an expert on that i'll be honest but um you also need need GPUs for that actually as well um to kind of drill the new knowledge into the AI model
Starting point is 00:32:22 um i didn't really want to go there but uh that's actually one of the cool things about docker desktop uh which is a proprietary tool if you have a machine that does has any GPU whatsoever or they're really weak, you can turn on this feature called Dr. offload. And once you turn it on, it's pretty transparent. But you're actually using remote GPUs in a data center. And I actually think that's really cool if you don't want to like invest in a beefy GPU in your house. Now there's I now obviously that's a paid offering, but I think there's some free usage if you just want to kind of try it out. You get some many credits or something.
Starting point is 00:33:07 So on the note of GPUs, I know obviously different model sizes are going to require different, you know, different sort of levels of GPU to effectively use. But I guess what would you really need to have in your system to effectively use Docker Model Runner and some of the models that people are commonly running? this this is a this this is a good question um honestly it's kind of like how long is a piece of string
Starting point is 00:33:39 because i mean you could run docker model runner on like a raspberry pie if you wanted and run really small models and they wouldn't be the most intelligent um or you could get like an invidia h 100 which i think costs 30 000 u.s or something and run some of the most intelligent models in the world. So it really depends on how much you spend. It's kind of as simple as that, to be honest. But like on a MacBook, you can run a decent amount of models with good quality, yeah. Or on like an 8 gigabyte V RAM into the HP, you can go pretty far.
Starting point is 00:34:26 That's another reason why Docker offload exists, because sometimes it's just not. work like buying a GPU and if you can just have a few minutes of a power for the GPU here and there as you need it, that can be very useful. Mm-hmm. Yeah, GPUs are expensive. GPs are expensive, especially, like, GPs are expensive enough, and then when you want to do a lot of GPU compute, then, you know, then you get into, yeah, big money. But it's not as inaccessible as you think, like, um, you can still do a lot with consumer
Starting point is 00:35:02 GPUs. I have like an A&D GPU here it was a couple of hundred dollars and I have a MacBook Pro and you can you can do most things just fine with that level of hardware. So all of your like code review stuff that's just being done
Starting point is 00:35:18 locally or are you using some online tool for that? Yeah, when I brought that up actually I'm not actually using Docker Model Runner for that typically. I understandable. Yep. They're like I was just describing you could use that from a runner for that
Starting point is 00:35:33 but I personally don't. The reason I don't is there's kind of, we use GitHub for all our repos and there's some really great plugins that are free from MotenSource. There's Gemini Codicist and there's another one called sorcery.ai
Starting point is 00:35:49 so that's very like hands off because you basically turn it on and it does its thing as random contributors open requests. You could use Docker Model Runner for that if you wanted to be more privacy focused and run everything locally.
Starting point is 00:36:11 Yeah. Well, that actually takes us into a good segue there. Why are people wanting to run models locally when you can get so much extra power from, you know, relying on these big GPU farms that are available out there? yeah so there's a couple of reasons for that like one is cost is sometimes it's just cheaper to to run things locally because obviously when you're running from a cloud vendor they're making a profit right so sometimes it gets to a point where it's just cheaper to do it yourself and especially if you're running um like if you're not running the huge models and you're
Starting point is 00:36:50 running medium to low end models um you want you want to run them locally the other one is latency. You brought up the, you brought up a great example there, and I love it because I brought it up in so many talks of, of like the doorbell with the webcam. That's the kind of thing where you really want local AI, because you're dealing with a stream of very beefy data from a video camera, and you don't want to be sending that back to a centralized server waiting for the response, that is the exact kind of use case where you want local AI because you want the latency to be super low. As if you were gaming, right, you want to, you want to late as low as possible. The other one is privacy. You know, sometimes you're using some like AI tool
Starting point is 00:37:40 in the in in the cloud and you're obviously giving it a lot of data. You're giving it a lot of sensitive data about your work or whatever and you know, some providers are very open we will use your data and some say oh we're not using your data at all but can you trust them really we don't know right but if if you're running it locally and you know you've you've everything containerized etc you can you know you can you can you can relieve a lot of those concerns well i know that um you know if you're operating a government context you can't say data out in certain ways or medical stuff like you there's like laws around how that data can actually be handled so if you're operating in a context like that being out of run your own
Starting point is 00:38:34 your own models separate from the big ones makes that an actual viable thing that you can use then yeah exactly yeah um like i know companies um that have turned and i'm not trying I'm not trying to bathe companies, but anyway. So do you know what? I won't name any company in particular, but I've known companies internally that they've turned on enterprise, chat GPT like things. And laterly, a month later, they've turned them off because they're like,
Starting point is 00:39:10 we think we might be leaking info here to a company we don't want to be leaking to. So, yeah, maybe that's paranoia. Maybe it's not, you know. You can debate about that all day. I don't know. no i can definitely like especially if there is big money attached to it like i can i can totally understand why you would be paranoid and you don't want to risk any possible chance that this data could be leaked in any way if there's you know fines attached to it if it's leaked things like that
Starting point is 00:39:41 like it totally understand why you want to be very careful with certain kinds of data you're handling yeah yeah it's it's it's only natural right uh at the end of day with AI models kind of like big massive databases with like huge amounts of data that can be used in some very powerful ways and sometimes you don't want to enhance that further with your own intellectual property or trade secrets or whatever it may be right right no i totally get that that makes a lot of sense okay um where do we go from here I don't even know. So, I guess we can...
Starting point is 00:40:28 Where do we go from here? I actually don't... I don't know. We've kind of just been, like, jumping all over the place throughout the entire episode so far. So, actually, I want to jump back a bit.
Starting point is 00:40:40 You said earlier that you want to do... You guys at Docker, I want to do, like, a proper, like, relaunch of Docker Model Runner. So, has that already happened? Is there, like, some plans to like sort of re-promote it or what's the go there? Yeah, in some ways it's already happened.
Starting point is 00:41:02 Like, it's been fully open source for like five months now. Did my Discord just crash again? What is happening with Discord today? Discord! What are you doing? Oh my lord This tool It's never been this crashy What is going on
Starting point is 00:41:34 You know, I think Discord might have pushed a bad update Maybe, yeah I don't know um okay so yeah we're talking about how the relaunch you went open source yeah just go from there yeah um i can't see you on camera at the moment but no it's all good we'll fix it as we go yeah um so yeah so in ways that that's that has already happened um it's always been fully open source from one month in but we've started trying to make it kind of be easier for people to contribute.
Starting point is 00:42:22 So one of the things we did is the project was split into many different GitHub repositories and some people were eyeballing the code to see if they can answer further, which we love, by the way. And they were like, oh my God, this is only a small portion of the code. The rest of the code must be proprietary. This isn't a real open source project. And the issue was that it was split into so many different projects.
Starting point is 00:42:47 that people didn't realize they had to jump between several several GitHub repositories to get the whole application. So one of the things we did was we've centralized everything to just one GitHub repository to avoid that confusion. We set up a community Slack so people can talk to us about features and whatever. And yeah, so we enabled Vulcan. I think Vulcan is actually important for accessibility because it means you're not, you don't need an Nvidia GPU. Whatever Nvidia GPU you have, you'll be able to use it and get some level of performance. So it's kind of already happened. The only one thing we're missing is we're going to do a blog post just to communicate all this.
Starting point is 00:43:39 It's always been the case. I just think we could have communicated it better because when it was launched, it wasn't fully open source. And then we made it fully open source. This is like months ago, it was made fully open source. But we kind of, we kind of forgot to communicate that in an effective way, I believe. Right. And those sort of first impressions are very important. Like, that's going to be how people see the project for a very long time and trying to change that, even if it has changed in the actual project a long time ago. you know, you've got to get people to reconsider what opinion they already held of a project, and that's really difficult.
Starting point is 00:44:24 Yeah, yeah, yeah. Vulcan is a good point. I think we got bashed on release. I think that's the top comment on that thread. No, yeah. Yeah, you know the thread I'm talking about, yeah. Yeah, yeah. No Nvidia GPU mentioned, no Vulcan, no Rockham.
Starting point is 00:44:40 Like, yeah. Yeah, so a lot of these, a lot of things. these issues have been resolved. It's just communicating that. And asking people getting involved, star, fork, contribute, talk that's on Slack if you're unsure of any of those things because we want to see people use it in interesting ways.
Starting point is 00:45:01 Because, funny enough, we have a lot of users. But, yeah, we brought up this point earlier, but we're not quite sure what people are using it for. And we'd love to build an even better tool, you know. Right. can understand what people are using it for, you can understand what areas to sort of highlight and try to try to enhance so that the users you already have are, you know, happy with where the project's going. Yeah, yeah. And the other thing I love is I love external contributors. I make
Starting point is 00:45:35 contributors that don't work for Docker because, you know, sometimes we might have meetings or whatever. We might be very close-minded about our goals. But I, I love seeing external contributors come into projects I work on because they come in with a completely different perspective and I'm like, oh, that's a really good feature. We never really considered that. Please go for it. That's one of those advantages of open source
Starting point is 00:46:00 isn't really discussed as much. Like people will talk about, oh, you can modify the code, you can fork and all this good stuff. But having a different perspective from the people who are primarily developing the project and sort of providing and also providing just actual real world uses for it because you might produce something
Starting point is 00:46:22 you might have an idea of how it's being used you might use it in a certain way internally but you realize that people outside are doing something you never even realized was something people wanted yeah exactly And the engine I thought, we spoke about earlier, Lama CPP is a typical example of that, actually. Because initially, I think the goal was to run Lama models on MacBook Pros.
Starting point is 00:46:53 And now it runs on almost everything. And it does those vision models that you were speaking about, like the doorbell webcam example. It even does that now. And that was not an initial goal of the project. So it just, over time, developed into the sort of everything project and, yeah. So, I guess, where do you see all of this going? All of this AI tooling going and tools like Docker Model Runner, like what, I know it's kind of hard to predict with how fast everything is moving, but.
Starting point is 00:47:37 Like, do you have any, any thoughts on where all of this is going? I could give you thoughts, but, um, yeah, it's, it's hard to predict as well, you know, this, this guy's the limit because the technology is so powerful and it's moving so perhaps, who knows, but, but one thing I've noticed is the kind of first version. I feel like we, we are kind of entering a new era because at the start, it was like chat GPT. it was just like simple chatbots and now we're doing things like image generation as you talked about that's the technical term for that is stable diffusion uh that's come on so much in the last year you have rag you have people throwing documents at at lLMs you have people um giving it video feeds
Starting point is 00:48:26 um or still images and processing data as input you have um you brought up the voice right recognition example. There's actually a cool open source project for that, Whisper, CPP, which I won't go into it, but that's pretty cool. What's really powerful is when you combine all those things together. We call that multimodal, which is really cool. Hugging Face had this little mini robot, actually. I can't remember their name, but it's really cool.
Starting point is 00:49:02 It's raspberry pie powered, and it's trying to do that actually. take various kinds of inputs, it has a mic, it has, you know, cameras, etc., and you can do cool AI stuff with that. And outputs are changing, right? Before outputs were just text, now they're like videos, images, sound, robots. So, yeah, it's really hard to say. One thing I have seen that's getting more popular is AI at the edge. the stature was all about user centralized server from chat gpd open a i or whatever but now
Starting point is 00:49:43 it's starting to become a little bit more popular is i keep on going back to your doorbell example because that's a really good example yeah it's it's bringing like a i closer to um the actual end user so you can achieve those low latency because when you get that really low latency you opened up another whole world of more kind of like a real-time performance. So I don't know, there's so many different things going on. It's hard to pinpoint one. I didn't even go into agents.
Starting point is 00:50:16 Agents is this whole thing that's exploring. Go right into it. Yeah, that's totally fine. Let's talk about that. Agents. Agents, it's a concept. Open AI released an agent toolkit like yesterday. I can't remember the name.
Starting point is 00:50:32 but we at Docker found out really interesting actually because agent kit agent kit yeah Docker has something called C agent and D agent which is this which is almost the same concept but we just released it a little bit earlier so we found out interesting those projects are also fully open source by the way but um agents is agents is kind of scary I used to work with this guy Dan Walsh and he used to compare to the Terminator movie and no the reason is agents is kind of given AI access to tools
Starting point is 00:51:09 so that might be giving it access to run things in a little sandbox or a container or giving it access to control a certain portion of your desktop so it's kind of like giving it hammers and
Starting point is 00:51:27 screwdrivers and things like that so yeah There's this really cool tool. I think it's called co-pilot agents. And you can tell GitHub to like, oh, just write me code about this. Oh, yes. Yes, I do remember hearing about this.
Starting point is 00:51:45 Yeah, it literally uses like the GitHub CI Bill servers, the Linux servers, and it'll start writing, testing code, checking if it's done things correctly and keep iterating. So yeah, it's rather than just like being chat put. chatbots with a human kind of going over and back, just leaving it run free and giving it access to tools and seeing what it does. Yeah, that's kind of the concept of agents. And Terminator is probably the extreme example, but you get the comparison. Yeah, let's hope less Terminator and more, what's a robot movie that went well?
Starting point is 00:52:29 That's a tough one, actually. Yeah, I keep coming with ideas and like, no way. Then the second half of the movie happens, no, we can't use that one. So I guess it takes the idea of, you know, we've had these like AI assistance for a long time now, but they really have, like, they've just been this sort of conversation partner. This is turning it really into an actual assistant. It is someone who can help you out with a task and actually try to complete it.
Starting point is 00:53:11 Yeah, without constantly giving it instructions, it'll by trial and error trying to figure out things. And I know this is going to sound like I'm trying to plug Docker everywhere, but this is actually why containers can be useful also, because obviously you don't want these agents doing things that they're not supposed to. So often people put agents in containers because once you have it in a container, you can restrict the set of privileges it has. So you know it's not capable of escaping the container or having access to privileges
Starting point is 00:53:48 that it really shouldn't really have access to it. Right. It's another reason why containers and AI kind of gel together. you don't want it to be deleting files that you don't want it to delete or you know things like that no that makes a lot of sense then so it's sort of this
Starting point is 00:54:10 like Docker being involved in this is kind of just a it's a sort of natural extension of what Docker was already doing it's people want to the ability to run these to run these models and then run these
Starting point is 00:54:26 agents in a safe way and Docker was already providing that so it kind of just makes sense to sort of bring this into Docker as well yeah exactly at the end of day these are yeah it's just a new type of tool right that's the way I see it and
Starting point is 00:54:47 but these tools are super powerful so I think that makes encapsulation even more important in containers and boxes and there's also other side reasons like Docker were really good at transporting huge files which you kind of need to do when working with AI also so yeah the technology is kind of fit in multiple ways so you've mentioned these huge files a couple of times how big are the model files that people are generally running locally obviously again is different models of
Starting point is 00:55:19 different sizes but you know you've seen the ones that people are mainly downloading Yeah, it really varies, it really varies on what kind of hardware people have and what kind of bandwidth or whatnot. There's some really small models and small models are useful actually because you get that low latency because they don't have it. That doorbell example is a good example of where you'd want a smaller model. So models can be as small as 100 megabytes, but I mean the super intelligent models can be. as large as 100 gigabytes. That's not even that much. Like, that's less than I would have thought.
Starting point is 00:56:01 Yeah, yeah, yeah. I guess they don't go into the terabytes yet, because when you load these models, you typically have to load them into kind of V-Ram of sort. Oh, right. Yeah, so we don't really have
Starting point is 00:56:16 too many GPUs with a terabyte V-Ram yet. I'm sure that's coming, though, you know? Certainly, yeah. So that's kind of the reason. Yeah, I don't know if you ever seen the graph of Nvidia's revenue. It was like mostly gaming and then it hits about 2020 and then the data center just takes up the entire graph.
Starting point is 00:56:43 Yeah, that's crazy, yeah. And they're going into all sorts of areas now like automotive because AI is getting important there as well. So yeah, it's crazy. they used to be kind of a gaming company and it's like but yeah now they're like
Starting point is 00:56:56 the biggest player in the world it's crazy yeah yeah and as everything else grows like you know if you're the ones selling the shovels in a gold rush
Starting point is 00:57:06 you know you're going to be coming out pretty well pretty well through that yeah exactly yeah that's kind of what we're doing at Docker yeah
Starting point is 00:57:15 we're not selling GPUs we're more kind of yeah selling you shovel so you can use your GPUs so what sort of brought you into working on like working on AI stuff in this space like why are you involved in this that is a good question I'll tell you to show story I the boss I was working a lot in software divine vehicles
Starting point is 00:57:45 kind of like electric vehicles kind of self-driving vehicles and all that thing. And my boss at the time said, you're, you're a somewhat innovative guy. I've seen you come up with cool solutions in the past. You should look into AI. So yeah, I said, why not? I started a site AI project. My initial goal was just to make AI much easier to run on local machines. And then I started working with the Lama CPP community because I found them really welcoming which is yeah shout out to the LAMCPP community one thing I love about LAMCPP community
Starting point is 00:58:29 it's a true community like there's people from like a hundred different companies and individuals working there whereas some other projects they're kind of like oh we're open source but you don't really work for us so we don't care about your contribution sure right yeah so I got involved at LAMS CPP
Starting point is 00:58:45 and started hacking away there but LAMCPP as it said is very technical and I talk container technologies blended well with a lot of that concepts there so yeah that's kind of how it went pretty much
Starting point is 00:59:00 okay okay fair enough that's yeah that was a pretty smooth way to get in there I guess like I don't know I like to always ask people how they get involved and whatever it is they're involved in some people have like crazy 20 year backstories other people it's like yeah it's
Starting point is 00:59:22 like a good idea someone told me i should do it yeah no i'm the kind of engineer i've changed my area of specialty like 10 times so i'm kind of used to just uh i'm gonna try something different and try and get good at this niche and normally that works out for me so how long have you been involved in this Um, well, I've been involved in the open source world for years now, but the AI stuff, probably around a year at this point. Maybe a little longer. Yeah. Not that long. Maybe 14 months. I'm estimated. To be fair, that is a really long time in the AI space with how much things have changed. Yeah. And to be honest, to be honest, like, um, things move so fast is you kind of just got. to keep learning and learning what's relevant at the moment, which is sometimes people find it hilarious when they see a job post for an AI job and it's like, oh, you need 10 years experience in AI, which is kind of silly because like AI 10 years ago is kind of completely different
Starting point is 01:00:32 to what it was now. So yeah, I don't think people should be like intimidated by AI because yeah, it moves so quickly you kind of just you kind of just got to jump in and find what's relevant at that moment in time. In some ways, it's not exactly the same, but it kind of feels like the web development space in the early 2010s, where constantly
Starting point is 01:00:59 people are talking about a new web framework. It seemed like every other week. And if you try to keep up with everything that people are doing, you will have no... Very quickly, you'll just be entirely lost. Like, there's no way to know every single thing that was coming out.
Starting point is 01:01:18 Yeah, yeah. No, AI is exactly like that. There's too many, there's too many tools out there to know everything. So, yeah. But there are some common, like, core things that will stay consistent. A good one is AI inferencing. At the end of the day, you need an engine to run all the various variants of AI. So they're going to continue, right?
Starting point is 01:01:42 but yeah everything around that constantly changes okay um uh I I had all these things written down and we kind of like
Starting point is 01:01:58 as I said earlier we kind of like hit on so many things so quickly I'm trying to work out where do we actually go um um shit uh Yeah, let me think
Starting point is 01:02:13 What could be interesting? Yeah, we went on a lot of tangents. Sometimes that's a good thing, but sometimes you kind of lose your flow as well. Yeah, yeah. I'm sure this is great for the listeners right now. I'm not going to cut any of this segment. Yeah, yeah.
Starting point is 01:02:40 What that's kind of... Somehow we managed to compress the entire AI space down until about an hour. I don't know how we've managed it. How long do you normally go on this? The hour's usually a short one. Usually go for two, but anywhere in between, there's totally...
Starting point is 01:03:02 Whatever ends up happening is totally fine. Yeah. I could go into things like, Rag and all that, but I think we touched on them already. Well, I'm kind of curious more about this. So, Rag you mentioned is, like, using documents to supplement the AI, yes? Yeah, so, yeah, let's go into that, because that's kind of an interesting thing. When a new model comes out, they put a lot of work into curating that model.
Starting point is 01:03:35 Like, they have to throw that GPUs for, like, weeks or whatever in some cases to train the model. But the world is always changing, right? So sometimes by the time you release the model. What is going on? Oh my God. Jesus Christ,
Starting point is 01:03:54 Discord. Okay, I need to deal with this Discord, man. What the hell? Oh, my God. Okay, okay. I'm not even disconnecting.
Starting point is 01:04:16 It's actually just crashing. Discord was a mistake. Why am I doing this? It's never been this bad. Yeah, I've used Discord for this show for a long time, like 300 episodes.
Starting point is 01:04:40 And I've never had it crash this much. I don't know what is going on. Anyway, we were saying RAG. Yeah, going into RAG: sometimes by the time you use the model, the knowledge is a bit stale. Or sometimes, if you're using it in a local setup, you might be using smaller or medium models, so they wouldn't have, like, the knowledge of everything. So RAG is like you can throw a PDF at a model.
Starting point is 01:05:17 Let's say it's a book or some kind of document or, I don't know, a CV or whatever. And it basically, on the spot, kind of inherits all the knowledge from that document you gave it. So, this is a silly example, but you could say, what does Harry Potter say on page 42 of the Chamber of Secrets? And it would know. It's that kind of thing, you know. Which is quite useful, because actually creating a model requires a crazy amount of GPUs. So if you can just, like, throw a few documents at it to give it more knowledge, that's quite useful, yeah.
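Since RAG comes up a few times in this conversation, here is a minimal sketch of the idea in Python. To be clear, this is an editor's illustration, not how Docker Model Runner or any specific product implements RAG: real systems use embedding models and a vector database, while this toy version scores chunks by word overlap purely to show the shape of the pipeline (split the document, retrieve the relevant pieces, prepend them to the prompt). The file name is a placeholder.

```python
# Toy RAG pipeline: retrieve the chunks of a document most relevant to a
# question, then prepend them to the prompt sent to the model.
# Real systems use embeddings and a vector store; the word-overlap scoring
# here is a stand-in purely to illustrate the overall shape.

def chunk(text: str, size: int = 300) -> list[str]:
    """Split a document into fixed-size character chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def score(query: str, passage: str) -> int:
    """Crude relevance score: count query words that appear in the passage."""
    words = set(query.lower().split())
    return sum(1 for w in words if w in passage.lower())

def retrieve(query: str, chunks: list[str], k: int = 3) -> list[str]:
    """Return the k chunks most relevant to the query."""
    return sorted(chunks, key=lambda c: score(query, c), reverse=True)[:k]

# "my_document.txt" is a placeholder, e.g. a PDF already converted to text.
document = open("my_document.txt").read()
question = "What does the document say about deployment?"

context = "\n---\n".join(retrieve(question, chunk(document)))
prompt = (
    "Answer using only the context below.\n\n"
    f"Context:\n{context}\n\n"
    f"Question: {question}"
)
print(prompt)  # this augmented prompt is what actually gets sent to the model
```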
Starting point is 01:05:55 So you mentioned using, as an example, your code base as RAG. So you can sort of, I guess, give it an understanding of how you tend to structure your code? Is that sort of the stuff you'd be using it for? Like what libraries you're using in the code base? I'm just trying to work out how you would use this. Yeah, this is another topic in AI. Obviously, AI can get a bit specialized.
Starting point is 01:06:30 If you use an AI for code, sometimes people like to use very specific models that are suited for coding, which I recommend you do sometimes, because that can really change how the AI models behave. A good example of how people use Docker Model Runner for coding: sometimes they'll use something called Qwen Coder. That's one of the popular ones, and it's specifically tuned to understand code. And those will already have knowledge
Starting point is 01:06:58 of some of the popular libraries in the world. But yeah, people tend to hook that up to IDEs like Visual Studio Code and say, hey, the OpenAI-compatible server's here. And then it'll traverse through all the code in your Git repository and learn about that. External libraries are another one: there are tools that will go as far as finding external libraries on the internet to learn more about those, which is another thing you see happening as well.
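As a hedged sketch of what "hook it up to the OpenAI-compatible server" looks like in practice: a standard OpenAI client only needs a base URL and a model name, and an IDE plugin does essentially the same wiring under the hood. The localhost:12434 endpoint and the ai/qwen2.5-coder tag below are assumptions drawn from Docker's Model Runner documentation (host-side TCP access has to be enabled, and the model pulled first with docker model pull), so verify both against https://docs.docker.com/ai/model-runner/ for your setup.

```python
# Sketch: pointing the standard OpenAI client at a local Docker Model Runner
# endpoint instead of a hosted service. Base URL and model tag are assumptions
# taken from Docker's docs -- check them against your own installation.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:12434/engines/v1",  # assumed DMR host endpoint
    api_key="not-needed",  # local server; no real API key is required
)

response = client.chat.completions.create(
    model="ai/qwen2.5-coder",  # example tag; pull first with `docker model pull`
    messages=[
        {"role": "user", "content": "Write a Python function that reverses a string."},
    ],
)
print(response.choices[0].message.content)
```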
Starting point is 01:07:28 There's another form of RAG, and I can't remember what it's called, but it basically extends AI with, like, the knowledge of the internet. So it'll kind of use search engines, etc., to further enhance its knowledge. Typical examples of this are Gemini from Google and Grok from X, as they're called now.
Starting point is 01:08:03 You'll see not only is your prompt being processed by the LLM, they're also querying around the internet to try and get more up-to-date data, say the latest news story, or the current stock price of a stock on the stock market, or whatever it may be. Right, stuff like that, where even if you had the information in the model, it wouldn't even be useful anyway, right? Like, if you know the stock price from three weeks ago and you want to know the current stock price, it couldn't even achieve that task without supplementing that data.
Starting point is 01:08:38 Yeah, no, exactly. And you can update models by retraining, but retraining a model is a huge endeavor. Right, and for some data points it's just not viable, like the current stock price, right?
Starting point is 01:08:53 You're not going to retrain it every second. Yeah, exactly, it's just impossible. Okay, so basically the models are trained in a certain way and handle data in a certain way, and then you can bring this additional data in,
Starting point is 01:09:13 and the model, with how it's already trained, can parse that data and understand it. I'm probably not using the correct terms here, but you can sort of throw more data at the model and it can fill in the gaps where the model
Starting point is 01:09:35 is not really going to be able to provide you with the best results that you'd want, or, if it's very specific stuff, you can sort of guide it in a certain direction. Yeah, exactly. That's well described. Another interesting thing you see people doing, which is kind of related to the specialized models
Starting point is 01:09:59 that I brought up earlier: we brought up the concept of agents, where you're actually giving AI inferencing engines access to tools. This is where you see specialised models being used as well. For example, one agent might be responsible for writing code and might be using Qwen Coder. And then when it's done its job, it might tell another agent, now it's your job. That might be running another model that's more tuned to its specific task. And then that might give a task to the next agent, which is running a different model,
Starting point is 01:10:42 which is specialized for that task. So that's kind of why agents get exciting. So agents are a relatively new thing? I'm sure people have thought about the idea of having AI go off and do its own thing for a very long time, but it's new to the point where it's actually viable to achieve things? Yeah, the concept of agents has been around for at least a year, but it's only in the last few months that they've started to get really powerful.
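A rough sketch of that hand-off pattern, reusing the same assumed local endpoint as the earlier example: one coding-tuned model produces the work, and a second, more general model takes the next step. The two-stage split and both model tags are illustrative assumptions, not any particular agent framework's design; real agent stacks layer tool calling, retries, and shared state on top of a loop like this.

```python
# Illustrative two-agent hand-off: a code-specialized model writes the code,
# then a different model reviews it. Model tags are example placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:12434/engines/v1", api_key="not-needed")

def ask(model: str, system: str, user: str) -> str:
    """Run one chat turn against the given model and return its reply."""
    reply = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
    )
    return reply.choices[0].message.content

# Stage 1: the coding-tuned agent writes an implementation.
code = ask("ai/qwen2.5-coder", "You write code. Output only code.",
           "Write a Python function that parses an ISO 8601 date string.")

# Stage 2: a general model, acting as a second agent, reviews the first's work.
review = ask("ai/llama3.2", "You are a meticulous code reviewer.",
             f"Review this code for bugs and edge cases:\n\n{code}")
print(review)
```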
Starting point is 01:11:18 I suppose people are spinning up new infrastructure and software frameworks to actually do it properly. So, yeah, it's very topical at the moment. It's just a natural kind of evolution, I guess. And I think what's really interesting with any of these tools that come out is, like we were kind of saying earlier, there is this idea of how it's going to be initially used, and then as you let people play around with them, you realize how people are really going to make use of them. I like to go with the example of, you know,
Starting point is 01:11:57 you look at what people are using Sora for, for example. Initially it's this certain kind of, you know, movie style, hey, look at these interesting clips. And it's very quickly turned into... I've noticed what people are using it for now is, like, funniest home video style stuff, where it's like, hey, my dad fell asleep and a bear climbed on top of him and it's, like, sniffing him, just dumb things like that. And especially when you throw it at social media, you just see what ideas people have with these tools, what they can really do with them, and how quickly it changes how people are using them. I think it's just fascinating to watch what people are doing
Starting point is 01:12:41 when you hand them something like this, something so powerful, and what they're going to even try to do with it. Yeah, no, it's very true. I know you're on X.com, I still call it Twitter. I think Twitter is funny now because the de facto fact-checking tool on Twitter now is like, at Grok, is this true? You see that all the time on posts.
Starting point is 01:13:08 Oh, I saw a post earlier: at Grok, is this AI generated? Yeah, it's kind of funny. No, I get it, though. Especially with some of what I'm seeing out there, it is legitimately hard to tell in some cases if something is. Obviously, you know, you have your obvious ones. You see a bear riding a motorbike, yeah, sure, obviously that's not
Starting point is 01:13:42 real. But, you know, there's some stuff out there now where it's very close to the line, and if you don't know what you're looking for, you're not going to spot it. Yeah, this brings up a good point. You definitely do have to worry about ethics sometimes in AI, it's very important. Like, I spoke about
Starting point is 01:14:08 the llama.cpp community, which I work with quite a bit. There's this account that appeared in the last week, and it's leaving comments on all sorts of pull requests, code review comments,
Starting point is 01:14:27 but it doesn't call itself a bot account. It's a human GitHub account, but it's clearly powered by an AI bot. And we tagged the person, guy, bot, we don't know what it is,
Starting point is 01:14:45 and said, hey, you're kind of polluting our pull requests with all sorts of crazy comments. Some of them sound like they make sense, but 90% don't. And he was like, oh, sorry, I'm doing this for my research project, I'm doing a PhD or whatever. And we kind of said, yeah, well, we might have to revisit how we do this or whatever.
Starting point is 01:15:09 And he kept responding. And now we actually think even that response might have been a bot and not a real human. So, yeah, we're currently talking with the llama.cpp maintainers about how we handle that kind of thing, because an AI bot should at least identify itself as an AI bot, we believe, for, you know, transparency reasons. Yeah, this is a problem that we're seeing sort of all over the internet. And, like, you know,
Starting point is 01:15:48 it's been known for a long time that a lot of social media accounts are powered by, you know, fake actors, whether you want to get into, like, government stuff or AI-powered stuff, whether
Starting point is 01:16:01 it's not an organic account. On YouTube, you're seeing a lot of these videos where it's just some generic... like, I'll see generic F1 footage, and then it's an AI voiceover
Starting point is 01:16:15 generically explaining something that happened in F1. And especially as the voices get better and sound more natural, it's harder to tell if it's actually organic content or if it's something that is just
Starting point is 01:16:31 being generated out. And this is, like, harmless stuff, but you get into the ethics areas where people are actually using it, like I was mentioning earlier, to generate what looks like real CCTV footage, right? That gets into a really dangerous area. Yeah, and you're 100% right. It's definitely something I've noticed in recent
Starting point is 01:16:59 weeks. Yeah, in the past it was very obvious to notice: that's definitely an AI bot. Now it's starting to get unclear, which, yeah, it'll be interesting to see how we handle that. But I will say, from the software companies I've worked with, there's a lot of workshops and governance being formed around
Starting point is 01:17:23 that to make sure we're responsible with our usage of it. Yeah, it'll be interesting to see where it goes. Yeah, no, at the end of the day, right, if you give people a tool, they're going to use it, right? If you give someone a hammer,
Starting point is 01:17:41 someone's going to hit someone in the head with it, right? People are going to misuse a tool no matter what guardrails you try to put into it. Yeah, this is a really tough problem. I do agree that AI content should be labeled.
Starting point is 01:17:58 The problem is when people just don't do it. Yeah, no, exactly. If there's one thing I picked up this week, it's that people should definitely label their AI stuff. I think that's really important from an ethics perspective. Well, funny enough, this isn't really...
Starting point is 01:18:17 at least in the open source world, this isn't exactly a new problem, in my opinion. For example, take any random Linux distribution. You hope people use Linux OSes for good things, but I'm sure they're used for horrible things also. That's the point of making tools: you can't always fully control how people use them, but that doesn't mean
Starting point is 01:18:42 we should shy away from our responsibilities either, so... No, I understand that. I understand why, you know, you sometimes hear these talks of putting the brakes on AI
Starting point is 01:18:56 and dealing with the ethics issues and then continuing forward. And I understand the arguments against that, because if, you know, the US says, we're going to put the brakes on this, we're going to discuss ethics,
Starting point is 01:19:10 you still have other places that are going to be doing stuff. And especially when you're dealing with open source tech, borders don't exist with open source. So these are conversations that need to be had, but it's very difficult to slow things down. Yeah, no, I think the only solution is to do these things
Starting point is 01:19:37 in parallel. Maybe you can try and slow certain things down with regulation, and I think sometimes that's a good idea, but as you said... I feel like I should talk to my lawyer now or something. But no, you're right, in the open source world you can't ever fully control it, so I think the smartest thing to do is try and deal with the ethics issues, etc., in parallel, as best as you possibly can, because it's super important. Yeah.
Starting point is 01:20:14 I don't know. I am apprehensively excited for the future, I think, is the best way to put it. Yeah, me too.
Starting point is 01:20:25 I think so many fields are changing. There's been a lot of discussion about how so many fields are going to be eliminated, but anytime any big new tech comes along,
Starting point is 01:20:37 entire new fields are generated, and I don't think anyone really knows, like, 10 years from now, right, what is this going to really enable that wasn't possible before? What new fields of endeavor
Starting point is 01:20:54 are going to exist because this tech was further researched? Yeah, I agree. And I don't think it's going to be one thing in particular,
Starting point is 01:21:09 I think it's going to sprawl out to a wide array of use cases, if I'm honest. Yeah, because... I brought this up earlier in the podcast. Do you call this a podcast or YouTube? Yeah, podcast works.
Starting point is 01:21:30 Oh my God, Discord. How? Why? Why is it so bad today? Oh my lord! What the hell? This is the technical issues episode of technical issues episodes. Why the hell?
Starting point is 01:21:55 Oh my god, why is Discord like this? Well, while you're setting that up... I have a friend, actually, I think it's really interesting. Speaking of which, I didn't stop the recording, by the way, just in case you're going to say
Starting point is 01:22:27 something you don't want on the recording. No, that's fine. Okay. I have a friend, and this is another example of an interesting use case. She's learning Spanish, but she doesn't have any Spanish-
Starting point is 01:22:43 speaking friends to practice and speak with. So she has actually been, literally with a microphone and speakers, speaking to an AI bot to improve her conversational skills in Spanish, because she doesn't know anyone who speaks
Starting point is 01:23:01 Spanish. Which I find very interesting. The Duolingo application has been adopting a lot of this with their learning modules as well. From what I can tell, it's
Starting point is 01:23:18 very language-dependent whether or not it's going to be good right now, and obviously it depends on the language you're coming from, right? So if your source language is English, I imagine German would probably work better than Hindi or Mandarin, for example,
Starting point is 01:23:38 where the language is structurally very different. But with a language like German, there's a lot of similarities between the languages, so I'm sure the translation between the two would be vastly easier. I'm not a linguist, in case anyone couldn't tell, but that would be my assumption. I actually do really like AI as a translation tool,
Starting point is 01:24:01 because, let's say you use the non-AI translation tools. I'm sure they're getting AI features now, but let's say in the past you used Google Translate. It would give you the literal translation, but sometimes you wouldn't
Starting point is 01:24:17 understand why the sentence is formed that way, or there are words that have a double meaning. So you can literally ask the AI: why did you do it that way? That doesn't make sense to me. And it can explain it. So I think it can be much more powerful for translation from that perspective, you know.
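For what it's worth, the "ask it why" trick is just a follow-up prompt; nothing special is needed beyond a chat model. A tiny sketch against the same assumed local endpoint as earlier, with the Spanish phrase and model tag as arbitrary examples:

```python
# Sketch: using a chat model as a translation tutor rather than a literal
# translator. Endpoint and model tag are the same assumptions as above.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:12434/engines/v1", api_key="not-needed")

reply = client.chat.completions.create(
    model="ai/llama3.2",  # example tag; any general chat model would do
    messages=[{
        "role": "user",
        "content": 'Translate the Spanish phrase "me hace falta" to English, '
                   "then explain why a word-for-word translation is misleading.",
    }],
)
print(reply.choices[0].message.content)
```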
Starting point is 01:24:35 Actually, that's a really good point. I'd never thought of that. Because, you know, translation tools give you the literal translation, and you're not really sure of the nuance or anything around it. Yeah, especially in languages where there's nuance that is lost in the translation, right? Like, you know, languages that have
Starting point is 01:25:04 gendered nouns, for example. That's just not a thing that exists in English, so you kind of lose out on that additional context. Or where the way that words are said is different... like, some languages have a distinction based on
Starting point is 01:25:24 the gender of the speaker. Like which ones, sorry? Yeah, like most of the Latin ones. And in a direct translation to English you just don't have that. But if you have the ability to ask
Starting point is 01:25:42 why it's translated like that, and sort of any additional nuance that is lost... I just never really thought of that as a use for it. I guess that does make a lot of sense, though. Yeah. No, I found it useful because I'm learning new languages,
Starting point is 01:25:57 and with, like, Google Translate, I feel like I never really fully understand why it's translating in a certain way. But when I do it this way, I find it a lot more informative. And I'm like, oh, that makes sense. Oh, that's not quite the way you'd say it in English. Oh, we don't really have a word for that in English, I'm glad I've learned this new kind of concept that we don't really have in English. Because a traditional translation tool
Starting point is 01:26:30 wouldn't really do that for me. Hmm. Okay, I'll definitely have to keep that in mind. This is, again, one of these things where it probably wasn't initially thought that this would be a useful endeavor,
Starting point is 01:27:01 that this is a thing that people would really want. But as it's further developed and as people use it, you sort of uncover new use cases that weren't entirely obvious. Yeah, no, 100%. This is kind of tangential, but Google Meet was kind of bringing in things like this. I think now on Google Meet, if I'm speaking to you and you are speaking,
Starting point is 01:27:30 let's just say, French, for example, and I don't know French, you can turn on real-time subtitles. So there might be English subtitles under your French, so two people could actually kind of speak to each other
Starting point is 01:27:46 even though they don't speak the same language. I did not know they did that. I think Google Meet released it, but in a very limited way, like only one or two languages for it? Yeah, as of three months
Starting point is 01:28:03 ago, there's some articles about it. Why were more people not talking about this? Okay. That's really powerful, because two people who don't speak the same language can speak. Yeah, I know they were doing
Starting point is 01:28:18 something like that with the Pixel phones, which I think makes more sense as an initial use, because how often are you going to be in a meeting with someone whose language you don't speak? But if you're in person and you're doing a live translation, like at a restaurant
Starting point is 01:28:42 where you're not really sure how to order something, for example, that as a first usage makes a lot more sense. But it makes sense they would also repurpose that for the other communication platforms they have as well. Yeah. And another thing I find very useful:
Starting point is 01:29:02 almost all the video tools have it, Zoom has it and Google has it, is AI note-taking. So if you turn it on at the start of the meeting, it just sends everyone an email with a Google Doc
Starting point is 01:29:15 at the end of the meeting: this is what was discussed, these are the action items, these are the people who agreed to take on those action items. And that's actually scarily accurate. And you don't have to touch the keyboard. I wish I had that when I was at university. I wish I could have had that listening to a lecture.
Starting point is 01:29:37 That would have been so nice. Yeah, it's pretty cool. I use the Google Meet version, it's quite good. But I know there's other versions for Zoom, et cetera, that are just as good, you know. I graduated university sort of just before the AI stuff really kicked off.
Starting point is 01:30:02 So I kind of missed out on all of that. But I do know people who are teachers today, and there's a real struggle with how to integrate this tooling, because kids are going to be using it. No matter what you try, they are going to use the tool. So it's a matter of how you're going to integrate it. And, you know, this isn't a tool that's going away.
Starting point is 01:30:36 So it also feels like a disservice if the schools aren't trying to educate people on how to use them. This is another one of those areas where I really don't have an answer on how you would properly fit this in, and it seems like everyone's kind of
Starting point is 01:30:57 trying different things, and nobody really knows exactly what's going to work out in the long run. Yeah, I've talked about this topic with certain people.
Starting point is 01:31:12 There's a couple of schools of thought here. Some people think it's great because it accelerates your learning. RAG is a good example: you can throw in a lot of notes, and rather than searching through all the pages, or doing manual searches of the document to find a certain part of your notes, you can just write a prompt:
Starting point is 01:31:35 how would I do this, where is that, what is this, and you get the response instantly. So people in that school of thought think it's great, it's just accelerating your learning. I suppose the other, kind of more negative school of thought, and I think both are somewhat valid, is that it's kind of killing people's creativity and ability to think on their own, because
Starting point is 01:31:59 some people are just getting answers from AI and using them as is, and they're not really learning anything. As a former enjoyer of SparkNotes whenever I had to do a book review, I can say for a fact that if there was a way I could
Starting point is 01:32:21 do less work on a book review, or other sorts of assignments that I didn't want to be doing, I would have done so, and I'm almost certain people are going to... Look, if you're a 15-year-old, right, do you want to write a book review? No, you're going to find a way to get around it. And actually, funnily enough,
Starting point is 01:32:44 I think you should use it to accelerate your work. But I think there's a little bit of personal responsibility there. Like, if you are using it to accelerate your work, at least make sure you're ingesting information and, you know, actually learning, because at the end of the day, there's going to be an exam at the end of the semester, and if you haven't actually been learning, you're going to fail, right? So I think some of that,
Starting point is 01:33:14 a lot of that, is personal responsibility as well. But I do think, I'm actually a firm believer, you should be using it to accelerate your workload, to an extent. Because it's like, I remember my maths teacher in school, I'm pretty good at arithmetic, but I remember him saying to students who would overuse the calculator in maths class, and, I'm 35, right, so when we were very young there wasn't very widespread usage of phones,
Starting point is 01:33:54 he was like, you're not always going to have a calculator in your pocket when you're older. And I'm like, we kind of do. You know? So, yeah. I do agree with the personal accountability. But at the same time, right, how much personal accountability is a nine-year-old going to have? And at that age, you know,
Starting point is 01:34:20 there's also the argument, if you are going to introduce it in schooling, at what point do you? Do you save it for the latter half of education, like high school time? Do you bring it in early on? What effect is it going to have for someone from the very
Starting point is 01:34:38 start of schooling to have an assistant like this? Is it going to be beneficial? Does it stunt learning? Because of how new this is, no one has any long-term knowledge of what happens if someone goes through their entire schooling
Starting point is 01:34:55 with access to this tooling. No one knows yet. Yeah, no one truly knows. I'm sure there'll be plenty of studies done on it over the next decade or so. These studies are almost certainly already
Starting point is 01:35:13 being started now. Yeah, I'm sure. It's a good point. There are probably plenty of situations where you do have to limit AI usage. A final exam is definitely one. Yeah, sure.
Starting point is 01:35:34 Yeah, but once you're in higher education, right, if you're paying for your university and you're going through it trying to skate through and not learn anything... look, you're just wasting your money at that point. I don't know what you're doing. You know, the grading system in university here,
Starting point is 01:35:59 a passing grade is a P, so there's this sort of mentality: Ps get degrees. You have these people who just skate through on the lowest possible grade they can get to pass. You always have that, right? It's not a new thing, but you're going to see more of that, where there's people who are kind of just there, and I don't know what they're doing there. Whenever you have a bell curve, you always have the lower end of the bell curve, right? There's always going to be people that just don't put in the effort. Yeah. But at the same time,
Starting point is 01:36:42 I think for people who are willing to use this tooling effectively, to augment their work and sort of speed themselves up, we are seeing, you know, really powerful benefits come along with it. I saw an example from the author of curl a while back, where he's taking a lot of issue with people submitting garbage security reports to curl, because they run a bug bounty, and, you know, when there's a bug bounty, people are going to try to get the bug bounty. Yeah. He recently had someone outline, I think it was like 23 or some large number of security errors in curl. And this person had sort of augmented their work using AI.
Starting point is 01:37:29 And these were all legitimate issues that probably would not have been found, at least within any reasonable amount of time, without going through this route. Yeah. I read some of his posts as well. It sounds like you read them in more detail than me. I remember he called some of the
Starting point is 01:37:52 reports AI slop. Yeah, a lot of the reports... like, I get it, right? If you run a bug bounty, people are really going to try to do anything possible to get that money, and I can totally see why that would be
Starting point is 01:38:08 really, really annoying to deal with. Yeah. This goes back to the code review bots. In projects I work on, when we have a code review bot it's not actually a big deal, because if we deem something not important we can just ignore it. I do feel bad for the curl maintainer, Daniel, because if people are, you know, pushing a CVE, he's almost mandated to look at it and treat it seriously, because it's a security issue. So yeah,
Starting point is 01:38:46 that does suck for him. Yeah. And it's not just curl, curl's just a really loud example. Mesa had a problem with this recently, and a bunch of projects have too. You know, I get why people
Starting point is 01:39:02 want to contribute to open source, but you still need to have an engineering background. Like... oh my god, Discord disconnected again. Jesus Christ. Oh my lord. I need to check what this is,
Starting point is 01:39:20 man, this is so annoying. I do apologize for all the technical issues today. Oh, that's fine.
Starting point is 01:39:42 For all we know, my end could be doing it. I don't even know. No, my entire Discord client is actually crashing. So, yeah, I'll just do a reboot after this and it's never going to happen again. This is going to be one of those issues that's just ephemeral, and nothing identifies what the problem is. Yeah. What did you last hear me say? The last thing I remember is we were talking about code.
Starting point is 01:40:19 Oh, I was saying projects like Mesa have issues with this, and a bunch of other projects have issues with this as well. Like, at the end of the day, right, if you're going to be contributing to a project, you do still need an engineering background to really understand the code. And it seems like a lot of people, and I get why, these tools are really powerful,
Starting point is 01:40:42 a lot of people are sort of thinking they can just ignore that step, thinking they can just rely on the tool to do everything for them, where it's not just an assistant, it's a replacement. And there's also a lot of companies who sort of feel this way as well,
Starting point is 01:40:59 and, you know, you're seeing: hey, how much do we need all these people? Can we get rid of them, replace them entirely with these tools? And I really don't know, in the long term, what that's going to look like, if that's going to work.
Starting point is 01:41:13 A lot of people are banking on it working, though. Yeah. We've brought this up lots of times already: that's why I would call AI an accelerator, because for most use cases
Starting point is 01:41:30 of AI today, you do kind of need a human to say, well, this is AI crap, or... and, you know, that's pessimistic. There's a lot of cases where 90% of what the AI does is really good. So, yeah, there's very few cases where it outright replaces humans. It's more an accelerator tool. Yeah.
Starting point is 01:41:55 Is there anything else you wanted to touch on? We've kind of just gone down some random tangents again. No, nothing springs to mind. As I said, my main goal here was actually to spread awareness of Docker Model Runner, because I think it's a really cool tool that makes local AI easier to use, abstracts away the complexity of engines like llama.cpp, and gives you a way to push and pull models around the place so you can deploy your AI models on many different kinds of hardware.
Starting point is 01:42:39 That was the main reason I came on, because I'm really trying to grow that community. So if people are interested in learning about AI and enhancing AI's capabilities, please contribute, star, fork. Yeah, so if people do want to get involved, where can they go? Is there some sort of place they can discuss stuff? Is it just the GitHub? What is there?
Starting point is 01:43:06 The main central link, and we're cleaning up the README as I speak to make it super clear, the main place is github.com/docker/model-runner. Also, another place I would recommend people go: you'll see in that README
Starting point is 01:43:21 there is a link to a Slack, if you just want to casually chat with people like me to learn more and get yourself comfortable. So those are kind of the places to go, really. So what are you hoping to get out of, I guess, building a community?
Starting point is 01:43:42 Like, obviously, you know, having people contribute and understand what the code is, but what are some general goals you have? First of all, I don't think a tool like this truly exists elsewhere, and I think it's quite useful, so I'm just trying to spread awareness of that, because anyone I hear of that uses Docker Model Runner is like, whoa, this stuff is really cool, how come I never knew about it?
Starting point is 01:44:17 I didn't even know I could run models locally and push and pull them around the place. So there's that side of it. Obviously, I work for Docker, so there's increased usage of Docker tooling, which is the obvious thing. And the other thing is, I think what truly makes great open source projects is contributors coming in with a new perspective and enhancing a project for new use cases. And I really want to see that, because I want to see what people find it useful for,
Starting point is 01:44:58 and I want it to grow and become, you know, a great open source project. Fair enough. Is there anything else you would like to direct people to besides that? Anything you want to mention or highlight? No, not really. Just, if people are interested in AI and they want to go down to a lower level again...
Starting point is 01:45:28 You brought up Mesa, and for people who are curious about AI, I kind of see llama.cpp as the Mesa of AI, as I said before. Docker Model Runner itself is all high-level Go, so it's really kind of contributor-friendly. But if people are interested in really getting down to the nuts and bolts, I recommend looking at the llama.cpp community also, because that's kind of where that stuff happens.
Starting point is 01:46:00 And I'm also pretty active there. We don't really have a Slack or anything like that in llama.cpp, but we're pretty active with our conversations via GitHub issues and pull requests and discussions and things like this. Okay, fair enough. Apologies for all the technical issues. I don't know what's going on with Discord today.
Starting point is 01:46:25 Yeah, if you want to come back on and talk about more stuff, hopefully Discord's either working or we'll just, you know, use Google Meet like your original link had anyway. No, I'd love to, and feel free to reach out. I've watched a couple of your YouTube videos already. I watched a couple of the ones with you and Matt Miller from the Fedora community. So yeah, I was really happy to
Starting point is 01:46:58 be on this podcast, actually, because it's my favorite open source one. Oh, that's cool. No, Matt was definitely a lot of fun to talk to. There's a lot of really cool people in the open source space that, you know, you might see blog posts from, but you don't really hear them speak about what they're working on. And I like to bring people on to actually talk about what they're involved in, because a lot of people have a lot of interesting things to say.
Starting point is 01:47:28 Yeah, I get it. Well, thanks for having me. Yeah, no, it's a pleasure. Nothing else you want to mention before I do my outro? No, that's it. Okay, cool. You guys can't see me, but whatever. It is what it is.
Starting point is 01:47:49 My main channel is Brodie Robertson. I do Linux videos there six-ish days a week. Sometimes I stream there as well. I've got a gaming channel, Brodie on Games. I stream twice a week there. Right now we're playing through Yakuza 6 and Hollow Knight: Silksong. If you're watching the video version of this, you can find the audio version on basically every podcast platform.
Starting point is 01:48:10 Search Tech Over Tea and you will find it. If you'd like to find the video version, it is on YouTube, Tech Over Tea. I'll give you the final word. How do you want to sign us off? I never tell people they're doing this, so it's always fun to see what they do. Exactly. I'm not a runner. Perfect.
Starting point is 01:48:30 I don't know.
