Tech Over Tea - Docker Model Runner Is Awesome | Eric Curtin
Episode Date: November 7, 2025. Today we have Eric Curtin from the Docker Model Runner project on the show, the Docker project's attempt at bringing the ease of Docker to running AI models on your local hardware or deployed as a service.
Guest Links: Docker Site: https://www.docker.com/ | Announcement: https://www.docker.com/blog/introducing-docker-model-runner/ | Docs: https://docs.docker.com/ai/model-runner/
Transcript
Good morning, good day, and good evening.
I'm as always your host, Brodie Robertson.
And this is attempt number two at getting this show to work
because we had some, we'll just say technical difficulties before.
So, we only lost a minute of the episode.
We'll just redo the introduction.
Who are you? Tell the people what you do.
Yeah, no, sounds good.
My name is Eric Curtin, so I work with Docker.
I work on kind of Docker's main AI inferencing tool called Docker Model Runner.
It's like a container-centric way to run AI on any kind of machine, using container technologies and OCI registries, like Docker Hub, for transporting AI models and that kind of thing.
And also, just from the AI perspective, I work a bit with llama.cpp, which is kind of like the premier AI inferencing engine, particularly on lower-end hardware like desktop and edge devices.
Well, let's
sort of start at the core, like
basic level and then work our way up
from there. For anyone who's unaware of what
Docker is and what
function that is serving, how about we just start
there and then build on top of that?
Yeah, sounds good.
Yeah, so Docker is famous for being a container technology.
It's a way of spinning up lightweight sandboxes to run all your tools in some sort of user space environment. That could be Ubuntu, Fedora, CentOS.
It could not be a distro at all, literally just an application.
But there's also the way we use containers: we normally pull this application, or user space, whatever you want to call it, from some centralized repo.
The most famous one, I guess, is Docker Hub.
But there are many others.
There's like GitHub Container Registry.
There's like Quay.io, and the list goes on and on.
Those are examples of kind of public registries. There are also many private OCI container registries, so it wouldn't be uncommon for
some kind of a business to set up these container registries in their internal data centers,
so they don't always have to be pulling everything over the public internet, so they save
some cost there, and there's security advantages, right, because you're not reaching out
to the public internet all the time.
Right, and if you're building something that is only locally important, you wouldn't want to put it, you know, out on the public internet.
Yeah, exactly. Yeah. And it's probably faster to pull in the internal network.
Right, right, right. But this has become appealing to AI people in recent times. And there's a couple of
examples of people doing this. It's not just Docker model runner. These OCI registries are normally
built on very good infrastructure. And they're normally capable of like transporting like,
huge amounts of data, like gigabytes, you know, which tends to be the kind of sizes that
AI models are. So we kind of, everything is a file, right? So we kind of use this format. And rather
than pushing and pulling containers, we actually push and pull models to the exact same
infrastructure. And that's very appealing to people, because at the end of the day, when you're
pulling a model, you just need to pull a couple of gigabytes. So they can just reuse their existing infrastructure to do the exact same thing, just for AI models.
So when you say model, what is an actual AI model?
What are we looking at as this thing?
Obviously, it's a very complex topic.
The entire episode could just be explaining it.
It's very possible to simplify the explanation. Like, there are different model formats, but let's just, for the sake of argument, take the one used by llama.cpp and Docker Model Runner, called GGUF.
It's very simple.
A model is a file, which contains all the AI's knowledge, how it works, its architecture, everything.
So examples of models would be like Llama from Meta.
Google have one called Gemma.
There was a famous one, a famous open source one that got popular around a year ago, called DeepSeek R1. And it's very simple: it's a file you pull, and that's the brains and knowledge, everything, of that model in one file. And you basically load that into your inferencing engine, and then you have your kind of chatbot. Or it could be kind of multimodal AI, which is like feeding the AI model images or video or sound; that's kind of going on also.
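To make that concrete, here's a minimal sketch of loading a GGUF file into a local inferencing engine, using the llama-cpp-python bindings for llama.cpp; the file name and prompt below are placeholders, not anything from the episode.

```python
# Minimal sketch: load a GGUF model file and chat with it locally.
# Assumes `pip install llama-cpp-python` and a GGUF file you've already downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="./gemma-3-4b-it-Q4_K_M.gguf",  # placeholder path to any GGUF file
    n_ctx=4096,        # context window size
    n_gpu_layers=-1,   # offload all layers to the GPU if one is available
)

reply = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain what a GGUF file is in one sentence."}]
)
print(reply["choices"][0]["message"]["content"])
```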
So where does Docker Model Runner sort of fit into the equation here?
What is it actually doing?
Yeah, that's a good question.
It's doing a number of things.
One of the things that
kind of used to be quite difficult
in the early days of AI
was just configuring your hardware set up.
Like, how many layers do I want to send to the GPU? Do I want to use CUDA, ROCm, Vulkan? There's a whole list of GPU-level abstractions you can pick from and configurations you can do. And that actually gets quite complex, and it can be a bit hardware dependent. So one of the things it does is it abstracts all that complexity away from you. So you can run OpenAI-compatible servers, which is basically the gold standard for AI inferencing.
And then, yeah, you can spin up your local LLMs
to do like chat GPT type things or other things.
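As a rough illustration of what an OpenAI-compatible server means in practice, here's a sketch using the standard openai Python client pointed at a local endpoint. The base URL and model name are assumptions and will depend on how your local setup exposes the server; check the Docker Model Runner docs for the actual host and port.

```python
# Sketch: talk to a local OpenAI-compatible server with the standard openai client.
# The base_url and model name are assumptions; adjust them for your local setup.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:12434/engines/v1",  # assumed local endpoint
    api_key="not-needed-locally",                  # local servers usually ignore the key
)

response = client.chat.completions.create(
    model="ai/gemma3",  # assumed model name, as pulled locally
    messages=[{"role": "user", "content": "Say hello in one short sentence."}],
)
print(response.choices[0].message.content)
```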
The other thing that's kind of appealing
is there's certain registries
that have become quite popular for pulling AI models.
One example would be Hugging Face.
Oh.
Uh, did I, did my Discord crash? What happened here? Discord?
Are we having tech issues on my side now? Okay, um, give me a moment. Give me a moment.
Yeah, okay, my entire Discord crashed. Lovely. Discord decided to crash, but we should be, yeah, we're recording again. Okay, so, you were talking about how there is a popular place to get AI models, that being Hugging Face, and that's where I disconnected.
Yeah, so Hugging Face is a very cool repo. It's like one of the main ones for pulling AI models. Like Meta and all the companies interested in AI, that'll normally be one of the places they will store models. Even though Docker Hub's starting to get popular as well, but that's kind of a new player.
One of the things about Hugging Face, as far as I know, and I could be wrong because I don't work for Hugging Face, but typically the only way to pull from Hugging Face is over the public internet. So one of the things I at least think is cool about Docker Model Runner is you can pull from Hugging Face, but it immediately stores it in kind of a container-like format. So once you pull it via Docker Model Runner, you have freedom then,
because once it's in that format, you can push and pull it to kind of like anywhere, any of those OCI registries.
Right.
So you sort of, you bring it in and then, you're not directly tied into that Hugging Face sort of ecosystem anymore.
Yeah, exactly, exactly.
And you might always want to pull from Hugging Face, and that's fine. If you do, you can do that with Docker Model Runner, but we give you the freedom to push and pull to others.
So, Hugging Face is like the main one. Are there other ones that people do tend to get
models from?
There are, um, there's another popular place that I'm sure many people are aware of. Like, there's one called Ollama. Ollama can be a little bit controversial
because one of the things they've done is, GGUF is kind of a standard created by llama.cpp. And registries like Hugging Face and Docker Hub just use vanilla llama.cpp, to be friendly to the community, so we're all on the same page and all reading from the same hymn sheet and, you know, combining our engineering effort by having some sort of standard.
I tend to steer clear of Ollama in conversations, because Ollama also used GGUF, but they've changed the format of GGUF recently. So it's only compatible if you use the Ollama engine, which isn't very community friendly. But look, that's the way they decided to do things. So I typically try not to use it as a great example.
Okay, no, that makes sense.
So there's sort of this, I guess, over time,
was it a standard that was sort of developed naturally,
or was it a standard that was, like, provided?
Like, how did this format that people are using actually come about
if you happen to know the origin?
I do know the origin, because I know the creator.
This standard was one of those things, there wasn't really a standardization committee or anything like that. There's a guy, Georgi. He has a really hard to pronounce second name. But Georgi's a genius. He's the creator of llama.cpp. And his initials are GG. So, yeah, GGUF, the first two letters are GG.
I'm
sorry.
Sorry, if you happen to see this, I don't know how to pronounce your name either.
Yeah, Georgi Gerganov, let's call it that.
It's funny because I know the guy.
I should have really asked him how to pronounce his last name.
But yeah, he basically created it for llama.cpp. And llama.cpp is a pretty amazing project. It's a bit technical, though. It can be a bit hard to use, even though it's improving. But it has amazing hardware support. And that's basically the file format they came up with, and it's quite useful actually, and it's quite simple, because they just put everything into one file that you download. So it just kind of organically became one of the standards. It's not the only one, but it's one of them.
Yeah, everyone just
kind of agreed it kind of makes sense to have a standard that everyone works with, rather than all doing these weird different things and having this compatibility nightmare where no one can really share things around.
It makes a lot more sense
from an engineering perspective
to just agree on this basic thing
and then try to differentiate
in other ways
so that the core functionality is still there.
Yeah, exactly.
And there are other competing model formats which are just as popular. Like, there's another one, safetensors, which is typically used with vLLM. But the reason GGUF is really popular is for llama.cpp usage. So that's kind of the standard for, let's say, desktop and edge devices right now.
Okay.
Okay, that makes sense.
So you're mentioning llama.cpp. What is that project specifically?
Yeah, so llama.cpp is kind of like an inferencing engine. So it has an underlying library called GGML. That's kind of a very low-level, close-to-the-hardware AI library. Let's just call it that. ML is for machine learning. And llama.cpp builds on top of that.
And it's kind of focused
on chatbots.
But actually that's changing. That's kind of no longer true, because now they're adding multimodal support, which takes in all sorts of other kinds of inputs, like images or whatever. So yeah, that's kind of the funny thing about llama.cpp. Like, it was initially written to run the Llama model from Facebook; now it runs every single model. But, you know, projects grow and adapt. And going back to my point, it is a really technical tool, so often a lot of people use it like a library. Docker Model Runner would kind of be one of those. And Ollama would be one of those, actually, at least in the past. And there's a bit of a story there. But I kind of see it as kind of like the Mesa for AI. It's kind of like this low-level library that everyone goes to for AI. And it also has all those GPU backends integrated, like Vulkan, ROCm, CUDA. There's another one called MUSA, OpenCL. There's a lot.
There's a lot.
So that's sort of the low-level library that everyone builds on, and then the tools that build on top of that will pick and choose which functionality they want to provide within their tool. But I guess, kind of in a similar way, in the video editing space, to something like FFmpeg, where pretty much every Linux video editor, every Linux video player, for example, is basically just a wrapper around FFmpeg at its core, for the most part.
yeah no
that's pretty accurate
Mm-hmm. Mm-hmm. Now, obviously, I'm sure there's things that I added on to that, and that's definitely important to talk about here.
So, what, one of the things that you did want to talk about is when Docker Model Runner first came out, there was, well, there was, the state it was in at the time, and I guess a lot of those opinions have kind of held on about what it's like and what is currently present within the project.
So I don't know where you want to take this and what you want to focus on first,
but I guess we can get into that.
Yeah, no, this is one of the other reasons I wanted to speak with you.
We're kind of doing sort of, let's call it, a relaunch of Docker Model Runner at the moment
because when Docker Model Runner was first released,
they were kind of racing to get something out because they had all the ideas
and they kind of just strung it together with paper clips and whatnot, just to get it out there.
So when they initially released it, it was beta.
It was only available on Docker Desktop, which is a proprietary tool.
Well, it's kind of half and half. It's about 80% open source code and 20% proprietary. But anyway, it was beta and it was very limited in hardware support. It only worked on, like, Apple devices and Nvidia hardware. It didn't have things like Vulkan to support basically every GPU in the world, and all sorts of things like this. So that was a couple of months ago. It actually became fully open source only like a month after that, but I think we could have communicated that better. But now we're kind of trying to reboot and relaunch with the community.
So we went GA like a few weeks ago.
We've been cleaning up the GitHub repo to make it more contributor friendly.
We upstreamed all the patches to llama.cpp. We enabled lots more hardware. Tomorrow, Vulkan support is coming out. So that's, yeah, Vulkan's really nice because, yeah, you can throw any integrated GPU, or AMD or whatever, at it and it works with relatively little configuration or whatever.
so I suppose we're trying to
relaunch with a more open centric attitude
and
you know
try and build a community
and really make it an open source
project that people
feel welcome to contribute their ideas to, I guess.
So, was it just, you said it was sort of just, like, held together with paper clips initially? It was just they wanted to get something out as quickly as possible.
Was it, what was the sort of environment like at the time
and why was there this effort to rush it out as quickly as possible?
That's a good question.
And I'm actually probably not the best person to answer that
because I wasn't part of the team at the time.
Fair enough.
But what I've gathered from the team is
they felt there was a lot of kind of tools operating in this space.
So they felt like they had to get out there quickly. And I think the other thing was, like, just to gather some initial feedback. So they wanted to do the bare minimum, even if it didn't necessarily have all the features and it wasn't fully ironed out, just get feedback from the community on places like Reddit or whatever and work from there. I think that was the idea.
okay okay
so
what is I guess what is the state of it
now if someone wants to
go and install it today and try it out,
what are they going to be
presented with?
So, yeah, there's
two ways of installing it. You can
install Docker Desktop,
which, as I said earlier,
that's kind of a product, a proprietary tool,
but it's
free for personal use, so that's one way.
The other way of
installing it,
there's something called Docker Engine,
which is like the command line
version. That's
fully open source.
Sometimes people
call it
Docker Community Edition.
So that's
another popular
way of installing it.
I actually
install it both ways
depending on my
machine.
But yeah, you're presented with a Docker-like CLI syntax that people would be very familiar with. So, like, to run, say, Gemma 3 from Google, it's literally docker model run ai/gemma3. docker model push, docker model pull, docker model inspect, docker model list to list all the AI models you've downloaded into your local storage.
So it's that kind of concept.
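For readers who want to see that workflow spelled out, here's a small sketch that drives those commands from Python; the ai/gemma3 model name comes from Eric's example, while the exact subcommand behavior may vary between Docker Model Runner versions.

```python
# Sketch: drive the Docker Model Runner CLI from Python via subprocess.
# Assumes Docker with Model Runner enabled; command names follow Eric's description
# (docker model pull / run / list) and may vary by version.
import subprocess

def docker_model(*args: str) -> str:
    """Run a `docker model ...` subcommand and return its stdout."""
    result = subprocess.run(
        ["docker", "model", *args],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

print(docker_model("pull", "ai/gemma3"))                         # fetch the model from Docker Hub
print(docker_model("list"))                                      # show locally stored models
print(docker_model("run", "ai/gemma3", "Hello, what are you?"))  # one-shot prompt
```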
So if somebody is familiar with Docker already,
it's basically just everything they already know.
Yeah, exactly.
And also, because it runs an OpenAI-compatible server, there are also a lot of UIs that are getting popular. So you can connect it to things like AnythingLLM, Open WebUI. There's all these UIs that you can connect. And they sometimes give you a lot more functionality in terms of things like RAG. RAG is basically kind of like using AI with documents
to supplement its knowledge.
So yeah, that's another way of using it.
and you just connect them together, and that can be quite useful.
So you mentioned the OpenAI server there.
I guess that's just the standard because they kind of, you know,
they're like the big name early on.
They kind of got to set the standard because they were the standard at the time.
And there's a lot of tools that were built then sort of targeting what they were doing.
So you kind of have to work around the standard that's effectively developed
there.
Yeah, yeah, this goes back to the, yeah, you're exactly right.
This goes back to the point of, we were talking about the standard for the file format for
models.
Yeah, chat GPT got popular and people were like, I guess that's the standard now.
And that's the way things are kind of going in AI, it's moving so fast and something just
gets popular and people are like, let's just go with that.
Yeah, you are definitely right.
Things are moving very fast.
Even, like, I'm not paying super close attention to things that are going on, but just watching from the outside and seeing, like, I saw an example of, I don't know, it might have been like Midjourney or something from two years ago, and it was generating a picture of a dog on a skateboard. And I remember how bad the images were back then. You would see skateboards, like, clipping into things, heads, it was like these tiny, tiny images, and where we are now, it is a night and day difference.
And I don't think anybody, I don't think anybody even paying attention to this space
knew how quickly things were going to evolve.
Yeah, the ball is rolling really quickly.
And you even see people like enhancing AI with the use of AI.
So, yeah, it's, it's moving at a, yeah, incredible pace.
And I don't see it slowing down anytime soon.
It might eventually, but I don't see it right now anyway.
Yeah, every time I hear that we're hitting some limit, some things happening.
Like, it's, I don't know.
I don't know where we're going to be.
I don't, I'm not in the, in the camp or I'm going to make any sort of predictions at this point.
I didn't think we're going to be here this quickly.
I don't know where we're going to be six months or a year from now.
We're at the point, I'm sure you've seen the recent, um,
demos with Sora, where people are generating all manner of things. But a lot of, if it's a context where the video would be kind of low quality anyway, like, you know, security footage, or a doorbell webcam, like, I can't tell the difference in a lot of cases. It's gotten, like, it's scarily good.
Yeah. No, I completely agree.
Those are kind of the use cases where Docker model runner can be quite useful, actually.
We call those edge devices.
That's kind of Lama CPP specialty.
Because if you want super accurate knowledge, the knowledge of like 20 PhD students, if you want that crazy level of knowledge, you're going to go straight to a data center with a crazy, beefy Nvidia GPU, and you'll probably use something like ChatGPT.
I don't think
Docker Model Runner is quite there yet,
but for like
edge devices, like the doorbell case with the webcam
like identifying objects,
we can do that with tools like
Docker Model Runner now, which is kind of
a really cool use case I find.
I quite enjoy it.
So what are people actually using tools like this for?
Because obviously when you're running
a local model, the scale of what you can do with it is considerably less than if you have,
you know, however many GPUs open AI happens to have at this point.
This is a good question. I was only talking about this with the team recently. What are people using it for? Sometimes I find that hard to tell. Obviously, we have some telemetry we can gather via Docker Hub. I think last week we hit a record: there were like 300,000 model pulls last week, and a lot of them were Llama, Llama 3 from Facebook, or Meta, sorry, yeah, I still call them Facebook, I'm stuck in the past. But that's one of the reasons I'm trying to build a community. We can see lots of usage, but
we don't really always know what people are using it for, to be honest, because a lot of Docker Model Runner, it's just a tool, right? It's like, there's privacy there. So that's one of the reasons I'm on the call. We know it's being heavily used, and we're not always quite sure what the usage is. But if we can build a community and get people to contribute features that they care about, and open issues about, and describe their use cases, that would be very beneficial. So my honest answer is, I'm not quite sure.
What about
yourself? Like, what do you use these tools for?
Um, I use it for all sorts of things. I find the AI code review tools quite useful; I turn that on for all my repos I create these days, because the AI code review tools, I find they often pick out things that a human would just miss. So I find that very useful. I have used AI to write code sometimes. But I'll be honest, sometimes AI writes code for me and I'm like, that's a load of crap, I don't know, is it hallucinating or whatnot, right? But there are other times I was like, oh my god, that's perfect, and I've ended up using 80% of the solution with little edits. I write a lot of blog posts; I use it to help with blog posts sometimes, even sometimes email. So yeah, even though English is my native language, sometimes I find it can be more articulate than me.
Code review I do want to sort of touch on. I hear obviously a lot about the code generation stuff, but what is it doing on the code review side? Is it sort of identifying problem areas? Like, what is it actually doing? Like, what's an example of some way you've used it?
Yeah, so like, yeah, everything's just text at the end of the day, right? So if you just take a patch, it knows the number of lines added, number of lines removed. You can even use things kind of like, I guess we would call this RAG, to give it knowledge of your whole Git repository. So, yeah, everything is just a prompt, as we call them in the AI world. So everything is just a prompt at the end of the day. So the prompt would literally be, like, review this code, you know, critique it, point out any issues with the patch. And, yeah, it does its thing, yeah.
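Here's a rough sketch of the kind of review prompt Eric describes, sent to a local OpenAI-compatible endpoint; the endpoint, model name, and diff file are placeholder assumptions rather than any particular product's setup.

```python
# Sketch: ask a local model to review a patch, roughly as described above.
# The endpoint, model name, and diff path are placeholder assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:12434/engines/v1", api_key="unused")

with open("change.diff") as f:          # e.g. output of `git diff > change.diff`
    patch = f.read()

review = client.chat.completions.create(
    model="ai/qwen2.5-coder",  # assumed: a coding-tuned model pulled locally
    messages=[
        {"role": "system", "content": "You are a careful code reviewer."},
        {"role": "user", "content": "Review this code, critique it, and point out "
                                    "any issues with the patch:\n\n" + patch},
    ],
)
print(review.choices[0].message.content)
```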
How often do you find that actually helps you with what you're trying to do?
In terms of the code review...?
Yeah, in terms of the code review side.
I think it helps me daily. One thing I will say about the code review bots is it often will leave, like, let's say there's a thousand lines of code, it will often leave 10 comments. Sometimes they're all pretty much useless because it misunderstands the code. Sometimes 50%, I'm like, yeah, that's right, I should really take that into account and change the code.
And sometimes it's 100%.
And sometimes people laugh at AI code review tools because they're kind of like, oh my God, what stupid feedback.
But for me, that's not the point.
Like, it's pointing out areas that you should consider looking at.
You can ignore the feedback if it's crappy,
but genuinely there's lots of cases that I've seen it point out
pretty critical things that I believe would have been missed by a human.
Hmm. Okay. Um, yeah, I don't tend to use much in the way of these tools. For me personally, the only AI tooling I frequently use is I will generate, um, transcripts of my videos and then modify them by hand when it misunderstands me, because Australian accent, and it thinks I'm saying words weirdly.
Um, but usually it's at the point where like 90 plus percent.
of what it generates is totally fine
and I just need to clean up a couple of
things where it's like oh
you said Gnome did you mean
genome? No, I didn't
mean that actually but some
other times it gets it totally fine
and
yeah like
I can see why people
are gravitating towards these tools
I understand the concerns
that people have with them but I can see
how when you are using them in a
way where it is augmenting the work you're trying to do, how it can actually benefit you?
Yeah, I always view them kind of as accelerators. Like, they don't really replace the work, but they can make you get it done a little bit faster. And the problem you spoke about there, I won't go into it because that's a detailed topic in itself, but the hard G in Gnome, which is 100% correct,
there are ways to solve that problem as well
you can do something called training
which we've literally been doing
with Docker Model Runner recently
so you can kind of
coerce the model
and you can basically teach it that
this is wrong this is the more correct way
and that's often done to fine-tune AI models
yeah training is something
I've heard about but like what is the actual
process of training a model
like, how does that sort of go?
I'm actually not an expert at that. I tend to just use the generic models as they are. But a teammate of mine has been working on it quite recently. He wrote a blog post on docker.com. He was using tools from this cool little startup called, uh, Unsloth, with Docker tools. Yeah, I'm not an expert on that, I'll be honest, but you also need GPUs for that, actually, as well, to kind of drill the new knowledge into the AI model. I didn't really want to go there, but that's actually one of the cool things about Docker Desktop, which is a proprietary tool: if you have a machine that doesn't have any GPU whatsoever, or it's really weak, you can turn on this feature called Docker Offload. And once you turn it on, it's pretty transparent. But you're actually using remote GPUs in a data center. And I actually think that's really cool if you don't want to, like, invest in a beefy GPU in your house. Now, obviously that's a paid offering, but I think there's some free usage if you just want to kind of try it out. You get some free credits or something.
So on the note of GPUs,
I know obviously different model sizes
are going to require different, you know,
different sort of levels of GPU to effectively use.
But I guess what would you really need
to have in your system to effectively use Docker Model Runner
and some of the models that people are commonly running?
This is a good question. Honestly, it's kind of like, how long is a piece of string? Because, I mean, you could run Docker Model Runner on, like, a Raspberry Pi if you wanted, and run really small models, and they wouldn't be the most intelligent. Or you could get, like, an Nvidia H100, which I think costs $30,000 US or something, and run some of the most intelligent models in the world. So it really depends on how much you spend. It's kind of as simple as that, to be honest. But, like, on a MacBook, you can run a decent amount of models with good quality, yeah. Or on, like, an 8-gigabyte VRAM Nvidia GPU, you can go pretty far. That's another reason why Docker Offload exists, because sometimes it's just not worth buying a GPU, and if you can just have a few minutes of power from a GPU here and there as you need it, that can be very useful.
Mm-hmm. Yeah, GPUs are expensive.
GPUs are expensive, especially, like, GPUs are expensive enough, and then when you want to do a lot of GPU compute, then, you know, then you get into, yeah, big money.
But it's not as inaccessible as you think, like, um, you can
still do a lot with consumer
GPUs. I have like an AMD GPU here, it was a couple of hundred dollars, and I have a
MacBook Pro and
you can you can do most things just fine
with that level of hardware.
So all of your like code review
stuff that's just being done
locally or are you using some
online tool for that?
Yeah, when I brought that up actually
I'm not actually using Docker Model Runner for that, typically.
Understandable. Yep.
Yeah, like I was just describing, you could use Docker Model Runner for that, but I personally don't. The reason I don't is we use GitHub for all our repos, and there's some really great plugins that are free for open source. There's Gemini Code Assist, and there's another one called Sourcery.ai. So that's very, like, hands off, because you basically turn it on and it does its thing as random contributors open pull requests.
You could use Docker Model Runner for that if you wanted to be more privacy focused and run
everything locally.
Yeah.
Well, that actually takes us into a good segue there.
Why are people wanting to run models locally when you can get so much extra power from,
you know, relying on these big GPU farms that are available out there?
yeah so there's a couple of reasons for that like one is cost is sometimes it's just cheaper
to to run things locally because obviously when you're running from a cloud vendor they're
making a profit right so sometimes it gets to a point where it's just cheaper to do it yourself
and especially if you're running um like if you're not running the huge models and you're
running medium to low end models um you want you want to run them locally the other
one is latency. You brought up the, you brought up a great example there, and I love it because
I brought it up in so many talks of, of like the doorbell with the webcam. That's the kind of thing
where you really want local AI, because you're dealing with a stream of very beefy data from a video
camera, and you don't want to be sending that back to a centralized server waiting for the
response, that is the exact kind of use case where you want local AI because you want the
latency to be super low. As if you were gaming, right, you want the latency as low as possible. The other one is privacy. You know, sometimes you're using some AI tool in the cloud, and you're obviously giving it a lot of data. You're giving it a lot of sensitive data about your work or whatever, and, you know, some providers are very open, we will use your data, and some say, oh, we're not using your data at all, but can you trust them? Really, we don't know, right? But if you're running it locally, and you know you've got everything containerized, etc., you can relieve a lot of those concerns.
Well, I know that, um, you know, if you're operating in a government context, you can't send data out in certain ways, or medical stuff, like, there's laws around how that data can actually be handled. So if you're operating in a context like that, being able to run your own models, separate from the big ones, makes that an actual viable thing that you can use then.
Yeah, exactly, yeah. Um, like, I know companies, um, that have turned... and I'm not trying to bash companies, but anyway.
So do you know what?
I won't name any company in particular,
but I've known companies internally that have turned on enterprise ChatGPT-like things. And then, a month later, they've turned them off, because they're like, we think we might be leaking info here to a company we don't want to be leaking to.
So, yeah, maybe that's paranoia.
Maybe it's not, you know.
You can debate about that all day.
I don't know.
No, I can definitely, like, especially if there is big money attached to it, like, I can totally understand why you would be paranoid, and you don't want to risk any possible chance that this data could be leaked in any way. If there's, you know, fines attached to it if it's leaked, things like that, like, I totally understand why you want to be very careful with certain kinds of data you're handling.
Yeah, yeah. It's only natural, right? At the end of the day, AI models are kind of like big, massive databases with, like, huge amounts of data that can be used in some very powerful ways. And sometimes you don't want to enhance that further with your own intellectual property or trade secrets or whatever it may be.
that makes a lot of sense okay um where do we go from here
I don't even know.
So, I guess we can...
Where do we go from here?
I actually don't...
I don't know.
We've kind of just been, like,
jumping all over the place
throughout the entire episode so far.
So, actually,
I want to jump back a bit.
You said earlier that you want to do... You guys at Docker want to do, like, a proper, like, relaunch of Docker Model Runner.
So, has that already happened?
Is there, like,
some plans to like sort of re-promote it or what's the go there?
Yeah, in some ways it's already happened.
Like, it's been fully open source for like five months now.
Did my Discord just crash again?
What is happening with Discord today?
Discord! What are you doing?
Oh my lord
This tool
It's never been this crashy
What is going on
You know, I think Discord might have pushed a bad update
Maybe, yeah
I don't know
Okay, so yeah, we were talking about the relaunch, how you went open source. Yeah, just go from there.
Yeah, um, I can't see you on camera at the moment. But no, it's all good, we'll fix it as we go.
Yeah, so in ways that has already happened. It's been fully open source from one month in, but we've started trying to make it kind of
be easier for people to contribute.
So one of the things we did is, the project was split into many different GitHub repositories, and some people were eyeballing the code to see if they could contribute further, which we love, by the way. And they were like, oh my God, this is only a small portion of the code. The rest of the code must be proprietary. This isn't a real open source project. And the issue was that it was split into so many different projects that people didn't realize they had to jump between several GitHub repositories to get the whole application. So one of the things we did was we centralized everything into just one GitHub repository to avoid that confusion.
We set up a community Slack so people can talk to us about features and
whatever. And yeah, so we enabled Vulkan. I think Vulkan is actually important for accessibility, because it means you don't need an Nvidia GPU. Whatever GPU you have, you'll be able to use it and get some level of performance. So it's kind of already happened. The only thing we're missing is we're going to do a blog post just to communicate all this. It's always been the case, I just think we could have communicated it better, because when it was launched, it wasn't fully open source. And then we made it fully open source. This is, like, months ago it was made fully open source. But we kind of forgot to communicate that
in an effective way, I believe. Right. And those sort of first impressions are very important.
Like, that's going to be how people see the project for a very long time and trying to
change that, even if it has changed in the actual project a long time ago.
you know, you've got to get people to reconsider what opinion they already held of a project,
and that's really difficult.
Yeah, yeah, yeah.
Vulkan is a good point.
I think we got bashed on release.
I think that's the top comment on that thread.
No, yeah.
Yeah, you know the thread I'm talking about, yeah.
Yeah, yeah.
No Nvidia GPU mentioned, no Vulkan, no ROCm.
Like, yeah.
Yeah, so a lot of these, a lot of these issues have been resolved. It's just communicating that. And asking people to get involved: star, fork, contribute, talk to us on Slack if you're unsure of any of those things, because we want to see people use it in interesting ways.
Because, funny enough, we have a lot of users.
But, yeah, we brought up this point earlier,
but we're not quite sure what people are using it for.
And we'd love to build an even better tool, you know.
Right. If you can understand what people are using it for, you can understand what areas to sort of highlight and try to enhance, so that the users you already have are, you know, happy with where the project's going.
Yeah, yeah. And the other thing I love is I love external contributors. I mean contributors that don't work for Docker, because, you know, sometimes we might have meetings or whatever, we might be very close-minded about our goals. But
I love seeing external contributors come into projects I work on
because they come in with a completely different perspective
and I'm like, oh, that's a really good feature.
We never really considered that.
Please go for it.
That's one of those advantages of open source that isn't really discussed as much.
Like people will talk about, oh, you can modify the code,
you can fork and all this good stuff.
But having a different perspective from the people
who are primarily developing the project
and sort of providing
and also providing just actual real world uses for it
because you might produce something
you might have an idea of how it's being used
you might use it in a certain way internally
but you realize that people outside
are doing something you never even realized
was something people wanted
yeah exactly
And the engine we spoke about earlier, llama.cpp, is a typical example of that, actually.
Because initially, I think the goal was to run Lama models on MacBook Pros.
And now it runs on almost everything.
And it does those vision models that you were speaking about, like the doorbell webcam example.
It even does that now.
And that was not an initial goal of the project.
So it just, over time, developed into the sort of everything project and, yeah.
So, I guess, where do you see all of this going?
All of this AI tooling going and tools like Docker Model Runner, like what, I know it's
kind of hard to predict with how fast everything is moving, but.
Like, do you have any, any thoughts on where all of this is going?
I could give you thoughts, but, um, yeah, it's hard to predict as well, you know. The sky's the limit, because the technology is so powerful and it's moving so fast, who knows. But one thing I've noticed is, I feel like we are kind of entering a new era, because at the start it was, like, ChatGPT, it was just, like, simple chatbots, and now we're doing things like image generation, as you talked about. The technical term for that is stable diffusion; that's come on so much in the last year. You have RAG, you have people throwing documents at LLMs, you have people giving it video feeds or still images and processing that data as input. You have, you brought up the voice recognition example. There's actually a cool open source project for that, whisper.cpp, which I won't go into, but that's pretty cool.
What's really powerful is when you combine all those things together.
We call that multimodal, which is really cool.
Hugging Face had this little mini robot, actually. I can't remember its name, but it's really cool. It's Raspberry Pi powered, and it's trying to do that, actually: take various kinds of inputs, it has a mic, it has, you know, cameras, etc., and you can do cool
AI stuff with that.
And outputs are changing, right?
Before outputs were just text, now they're like videos, images, sound, robots.
So, yeah, it's really hard to say.
One thing I have seen that's getting more popular is AI at the edge. At the start it was all about using a centralized server from ChatGPT, OpenAI, or whatever, but now what's starting to become a little bit more popular, and I keep on going back to your doorbell example because that's a really good example, yeah, it's bringing AI closer to the actual end user so you can achieve that low latency. Because when you get that really low latency, you open up another whole world of more kind of real-time performance.
So I don't know,
there's so many different things going on.
It's hard to pinpoint one.
I didn't even go into agents.
Agents is this whole thing that's exploding.
Go right into it.
Yeah, that's totally fine.
Let's talk about that.
Agents.
Agents, it's a concept. OpenAI released an agent toolkit, like, yesterday. I can't remember the name... but we at Docker found it really interesting, actually, because... AgentKit. AgentKit, yeah. Docker has something called cagent, and dagent, which is almost the same concept, but we just released it a little bit earlier, so we found it interesting. Those projects are also fully open source, by the way. But, um, agents is kind of scary. I used to work with this guy Dan Walsh, and he used to compare it to the Terminator movie. And, no, the reason is, agents is kind of giving AI access to tools. So that might be giving it access to run things in a little sandbox or a container, or giving it access to control a certain portion of your desktop. So it's kind of like giving it hammers and screwdrivers and things like that. So, yeah.
There's this really cool tool, I think it's called Copilot agents, and you can tell GitHub to, like, oh, just write me code about this.
Oh, yes. Yes, I do remember hearing about this.
Yeah, it literally uses, like, the GitHub CI build servers, the Linux servers, and it'll start writing code, testing code, checking if it's done things correctly, and keep iterating. So yeah, rather than just being chatbots with a human kind of going over and back, it's leaving it run free and giving it access to tools and seeing what it does.
Yeah, that's kind of the concept of agents.
And Terminator is probably the extreme example, but you get the comparison.
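To make the "giving it tools" idea concrete, here's a heavily simplified sketch of an agent loop against a local OpenAI-compatible server: the model is asked to reply with either a tool request or a final answer, and the only tool on offer is a directory listing. The endpoint, model name, and the TOOL/FINAL convention are illustrative assumptions, not how any specific agent framework works.

```python
# Sketch: a tiny agent loop. The model may request one whitelisted tool (list_files)
# or give a final answer; we run the tool and feed the result back.
# Endpoint, model name, and the TOOL/FINAL convention are assumptions for illustration.
import os
from openai import OpenAI

client = OpenAI(base_url="http://localhost:12434/engines/v1", api_key="unused")

SYSTEM = (
    "You can reply with either:\n"
    "TOOL: list_files <path>   (to list a directory)\n"
    "FINAL: <answer>           (when you are done)"
)

def list_files(path: str) -> str:
    return "\n".join(sorted(os.listdir(path)))

messages = [{"role": "system", "content": SYSTEM},
            {"role": "user", "content": "What files are in the current directory?"}]

for _ in range(5):  # cap the number of iterations
    reply = client.chat.completions.create(model="ai/gemma3", messages=messages)
    text = reply.choices[0].message.content.strip()
    messages.append({"role": "assistant", "content": text})
    if text.startswith("FINAL:"):
        print(text[len("FINAL:"):].strip())
        break
    if text.startswith("TOOL: list_files"):
        path = text.split("list_files", 1)[1].strip() or "."
        messages.append({"role": "user", "content": "Tool output:\n" + list_files(path)})
```

Running agents like this inside a container, as Eric suggests, is mostly about limiting what a tool call like list_files is allowed to touch.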
Yeah, let's hope less Terminator and more, what's a robot movie that went well?
That's a tough one, actually.
Yeah, I keep coming up with ideas, and like, no, wait, then the second half of the movie happens, no, we can't use that one.
So I guess it takes the idea of, you know, we've had these AI assistants for a long time now, but they really have, like, they've just been this sort of conversation partner.
It is someone who can help you out with a task
and actually try to complete it.
Yeah, without constantly giving it instructions, it'll, by trial and error, try to figure things out.
And I know this is going to sound like I'm trying to plug Docker everywhere,
but this is actually why containers can be useful also,
because obviously you don't want these agents doing things that they're not supposed to.
So often people put agents in containers because once you have it in a container,
you can restrict the set of privileges it has.
So you know it's not capable of escaping the container or having access to privileges that it really shouldn't have access to.
Right.
It's another reason why containers and AI kind of gel together.
you don't want it to be
deleting files that you don't want it to delete
or you know things like that
no that makes a lot of sense then
so it's sort of this
like Docker being involved
in this is kind of just a
it's a sort of natural
extension of what
Docker was already doing
it's people want the ability to run these models and then run these agents in a safe way, and
Docker was already providing that
so it kind of just makes sense to sort of
bring this into Docker as well
yeah exactly
at the end of the day, yeah, it's just a new type of tool, right? That's the way I see it. But these tools are super powerful, so I think that makes encapsulation even more important, in containers and boxes. And there's also other side reasons, like Docker is really good at transporting huge files, which you kind of need to do when working with AI also. So, yeah, the technologies kind of fit in multiple ways.
So you've mentioned these huge files a couple of times. How big are the model files that people are generally running locally? Obviously, again, different models are different sizes, but, you know, you've seen the ones that people are mainly downloading.
Yeah, it really varies, it really varies on what kind of hardware people have and what kind of bandwidth or whatnot.
There's some really small models, and small models are useful, actually, because you get that low latency. That doorbell example is a good example of where you'd want a smaller model. So models can be as small as 100 megabytes, but I mean the super intelligent models can be
as large as 100 gigabytes.
That's not even that much.
Like, that's less than I would have thought.
Yeah, yeah, yeah.
I guess they don't go into the terabytes yet, because when you load these models, you typically have to load them into kind of VRAM of sorts.
Oh, right.
Yeah, so we don't really have too many GPUs with a terabyte of VRAM yet.
I'm sure that's coming, though, you know?
Certainly, yeah.
So that's kind of the reason.
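As a rough back-of-the-envelope for why VRAM is the limiting factor, here's the arithmetic, assuming a typical 4-bit-ish quantization; real file sizes vary with architecture, quantization scheme, and context length.

```python
# Back-of-the-envelope: approximate GGUF file size / VRAM needed for a quantized model.
# Assumes ~4.5 bits per parameter (a common Q4-style quantization); real numbers vary
# by model architecture, quantization scheme, and context length.
def approx_size_gb(params_billion: float, bits_per_param: float = 4.5) -> float:
    bytes_total = params_billion * 1e9 * bits_per_param / 8
    return bytes_total / 1e9

for size in (1, 8, 70, 670):  # tiny, laptop-class, large, frontier-scale parameter counts
    print(f"{size}B params ≈ {approx_size_gb(size):.1f} GB")
# roughly: 1B ≈ 0.6 GB, 8B ≈ 4.5 GB, 70B ≈ 39 GB, 670B ≈ 377 GB
```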
Yeah, I don't know if you ever seen the graph of Nvidia's revenue.
It was like mostly gaming and then it hits about 2020
and then the data center just takes up the entire graph.
Yeah, that's crazy, yeah.
And they're going into all sorts of areas now like automotive
because AI is getting important there as well.
So yeah, it's crazy.
they used to be kind of a gaming company
and it's like
but yeah
now they're like
the biggest player in the world
it's crazy
yeah yeah
and as everything else grows
like you know
if you're the ones
selling the shovels
in a gold rush
you know
you're going to be coming out
pretty well
pretty well through that
yeah exactly
yeah
that's kind of what we're doing
at Docker yeah
we're not selling
GPUs
we're more kind of, yeah, selling you shovels so you can use your GPUs.
so what sort of brought you into working on like working on AI stuff in this space like
why are you involved in this that is a good question
I'll tell you a short story. I was working a lot in software-defined vehicles, kind of like electric vehicles, kind of self-driving vehicles, and all that kind of thing. And my boss at the time said, you're a somewhat innovative guy, I've seen you come up with cool solutions in the past, you should look into AI. So, yeah, I said, why not? I started a side AI project. My initial goal was just to make AI much easier to run on local machines. And then I started
working with the llama.cpp community, because I found them really welcoming, which is, yeah, shout out to the llama.cpp community. One thing I love about the llama.cpp community, it's a true community. Like, there's people from, like, a hundred different companies and individuals working there, whereas some other projects, they're kind of like, oh, we're open source, but you don't really work for us, so we don't care about your contribution.
Sure, right.
Yeah, so I got involved in llama.cpp and started hacking away there. But llama.cpp, as I said, is very technical, and I thought container technologies blended well with a lot of the concepts there. So, yeah, that's kind of how it went, pretty much.
okay okay fair enough
that's yeah
that was a
pretty smooth way to get in there
I guess like I don't know
I like to always ask people how they get involved
and whatever it is they're involved in
Some people have, like, crazy 20-year backstories; other people, it's like, yeah, it seemed like a good idea, someone told me I should do it.
Yeah, no, I'm the kind of engineer, I've changed my area of specialty like 10 times, so I'm kind of used to just, I'm gonna try something different and try and get good at this niche, and normally that works out for me.
So how long
have you been involved in this
Um, well, I've been involved in the open source world for years now, but the AI stuff, probably around a year at this point. Maybe a little longer. Yeah. Not that long. Maybe 14 months, I'm estimating.
To be fair, that is a really long time in the AI space with how much things have changed.
Yeah. And to be honest, to be honest, like, um, things move so fast that you kind of just got to keep learning and learning what's relevant at the moment, which is sometimes people find it
hilarious when they see a job post for an AI job and it's like, oh, you need 10 years experience
in AI, which is kind of silly, because AI 10 years ago is kind of completely different to what it is now. So yeah, I don't think people should be, like, intimidated by AI, because
yeah, it moves so quickly you kind of just you kind of just got to jump in and find what's
relevant at that moment in time.
In some ways, it's
not exactly the same, but it kind of feels
like
the web development space in the early
2010s, where constantly
people are talking about a new
web framework. It seemed like
every other week.
And if you try to
keep up with everything
that people are doing, you will have no...
Very quickly, you'll just be entirely lost.
Like, there's no way to know every single thing that was coming out.
Yeah, yeah.
No, AI is exactly like that.
There's too many, there's too many tools out there to know everything.
So, yeah.
But there are some common, like, core things that will stay consistent.
A good one is AI inferencing.
At the end of the day, you need an engine to run all the various variants of AI.
So they're going to continue, right?
but yeah
everything around that constantly changes
okay
um
uh
I
I had all these things written down
and we kind of like
as I said earlier we kind of like
hit on so many things so quickly
I'm trying to work out where do we actually go
um
um
shit
uh
Yeah, let me think
What could be interesting?
Yeah, we went on a lot of tangents.
Sometimes that's a good thing,
but sometimes you kind of lose your flow as well.
Yeah, yeah.
I'm sure this is great for the listeners right now.
I'm not going to cut any of this segment.
Yeah, yeah.
What that's kind of...
Somehow we managed to compress the entire AI space down
until about an hour.
I don't know how we've managed it.
How long do you normally go on this?
The hour's usually a short one.
Usually go for two, but anywhere in between,
there's totally...
Whatever ends up happening is totally fine.
Yeah.
I could go into things like RAG and all that, but I think we touched on them already.
Well, I'm kind of curious more about this. So, RAG, you mentioned, is, like, using documents to supplement the AI, yes?
Yeah, so, yeah, let's go into that, because that's kind of an interesting thing.
When a new model comes out, they put a lot of work into curating that model.
Like, they have to throw GPUs at it for, like, weeks or whatever in some cases to train the model.
But the world is always changing, right?
So sometimes by the time you release
the model.
What is going
on? Oh my God.
Jesus Christ,
Discord.
Okay, I need to deal
with this Discord, man.
What the hell?
Oh,
my God.
Okay, okay.
I'm not even disconnecting.
It's actually just crashing.
Discord was a mistake.
Why am I doing this?
It's never been this bad.
Yeah, I've, yeah.
I use Discord from time to time, and it's always worked super reliably.
I've used it for like 300 episodes.
And I've never had it crashed this much.
I don't know what is going on.
Um, anyway, we were saying RAG. That's, that's going into, yeah.
Yeah, going into RAG. Sometimes by the time you use the model, the knowledge is a bit stale.
Or sometimes, if you're using it in a local setup, you might be using smaller or medium
models. So they don't, they wouldn't have like the knowledge of everything.
So RAG is like you can throw a PDF at a model.
Let's say it's a book or some kind of document or I don't know, a CV or whatever.
And it basically on the spot kind of inherits all the knowledge from that document you gave it.
So you could say like, I don't know.
This is a silly example, but you could say, what does Harry Potter say on page 42 of Chamber of Secrets?
And it would know.
It's that kind of thing, you know.
Which is quite useful because actually creating a model requires a crazy amount of GPUs.
So if you can just, like, throw out a few documents to give it more knowledge, that's quite useful, yeah.
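Here's a deliberately naive sketch of that idea: chunk a document, pick the chunks that overlap most with the question, and paste them into the prompt sent to a local OpenAI-compatible server. Real RAG setups use embeddings and a vector store; the endpoint, model name, and file name here are assumptions.

```python
# Naive RAG sketch: retrieve the most relevant chunks of a document by word overlap
# and stuff them into the prompt. Real systems use embeddings + a vector database.
# The endpoint and model name are placeholder assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:12434/engines/v1", api_key="unused")

def chunk(text: str, size: int = 500) -> list[str]:
    return [text[i:i + size] for i in range(0, len(text), size)]

def top_chunks(question: str, chunks: list[str], k: int = 3) -> list[str]:
    q_words = set(question.lower().split())
    scored = sorted(chunks, key=lambda c: len(q_words & set(c.lower().split())), reverse=True)
    return scored[:k]

document = open("manual.txt").read()          # any local document you want to ask about
question = "How do I reset the device to factory settings?"
context = "\n---\n".join(top_chunks(question, chunk(document)))

answer = client.chat.completions.create(
    model="ai/gemma3",
    messages=[{"role": "user",
               "content": f"Answer using only this context:\n{context}\n\nQuestion: {question}"}],
)
print(answer.choices[0].message.content)
```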
So you mentioned using, as an example, your code base as RAG.
So you can sort of, I guess, give it an understanding of...
how you tend to structure your code?
Is that sort of the stuff you'd be using it for?
Like what libraries you're using in the code base?
I'm just trying to work out how you would use this.
Yeah, this is another topic of AI.
Obviously, AI is a bit specialized, like, can get a bit specialized.
If you use an AI for code, sometimes people like to use very specific models that are suited for coding.
which I recommend you do sometimes
because that can really change how the AI models behave.
A good example of how people use Docker Model Runner for coding: sometimes they'll use something called Qwen Coder. That's one of the popular ones. And that's specifically tuned to understand code. And it will already have knowledge of some of the popular libraries in the world. But yeah, people tend to hook that up to IDEs like Visual Studio Code and say, hey, the OpenAI-compatible server is here.
And then it'll traverse through all the code
in your Git repository and learn about that.
External libraries are another one.
There are tools that will go as far as find external libraries
on the internet to learn more about those, which
is another thing you see happening as well.
There's another form of RAG, and I can't remember what it's called, but it basically extends AI with, like, the knowledge of the internet. So it'll kind of use search engines, etc.
Right.
To further enhance its knowledge. Typical examples of this are, like, Gemini from Google and Grok from X, as they're called now. You'll see not only is it being processed by the LLM, they're also querying around the internet to try and get more up-to-date data, like, say, the latest news story or whatever, the current stock price of a stock on the stock
market or whatever it may be. Right, stuff like that where even if you have the information in
the model, it wouldn't even be useful anyway, right? Like if you know the stock price from three weeks ago
and you want to know the current stock price, it's just like, it couldn't even achieve that task
even if you wanted to
without supplementing that data
Yeah, no, exactly. And you can update models by retraining a model, but, like, I mean, retraining the model is a huge endeavor.
Right, and for some data points
It's just not viable
Like current stock price, right?
You're not going to retrain it every second
Yeah, exactly
It's just, yeah, it's impossible
So
Okay, so basically
the models are trained in a certain way
and handle data in a certain way
and then you can bring this additional data in
and the model, with how it's already trained, can parse that data and understand it. I'm probably not using the correct terms here, but it can,
yeah so you can sort of
throw the data
more data at the model and it can
fill in the gaps where the model
is not really going to be able to provide you
with the best results that you'd want to find
or if it's very specific stuff,
you can sort of guide it in a certain direction.
Yeah, exactly.
That's well described.
Another interesting thing you see people doing,
which is kind of related to the specialized models
that I brought up earlier.
We brought up the concept of agents,
where we're actually giving AI inferencing engines access to tools.
This is where you see specialised models being used as well.
For example, one agent might be responsible for writing code and might be using Qwen Coder.
And then when it's done its job, it might tell another agent, now it's your job.
So that might be running another model that's more tuned to run its specific task.
And then that might give a task to the next agent, which is running a different model,
which is specialized for that task.
So that's kind of why agents kind of get exciting.
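As a rough illustration of that hand-off, each "agent" below is nothing more than a call to a different model behind the same OpenAI-compatible endpoint. The endpoint, model tags, and prompts are made-up placeholders, and real agent frameworks add tool calling and looping on top of this basic pattern.

from openai import OpenAI

client = OpenAI(base_url="http://localhost:12434/engines/v1", api_key="not-needed")

def ask(model, system, user):
    # One "agent" = one call to a particular model with its own instructions.
    resp = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
    )
    return resp.choices[0].message.content

task = "Write a Go function that reverses a string."

# Agent 1: a code-tuned model writes the first draft.
draft = ask("ai/qwen2.5-coder", "You write code and nothing else.", task)

# Agent 2: a different model, suited to its own job, reviews the draft.
review = ask("ai/llama3.2", "You review code for bugs and style issues.", draft)

# Agent 3: the coder model revises its draft using the reviewer's notes.
final = ask("ai/qwen2.5-coder", "Revise the code using the review notes.",
            draft + "\n\nReview notes:\n" + review)
print(final)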
So agents are a relatively new thing?
I'm sure people have thought about the idea of having AI go off and do its own thing for a very long time.
but a new thing to the point where it's actually viable to achieve things.
Yeah, I think as people are remote...
Agents have been around, they've been around...
The concept of agents has been around for at least a year,
but it's only kind of like the last few months they started to get really powerful.
I suppose people are spinning up new infrastructure and software frameworks to actually do it properly.
So, yeah, it's very topical
at the moment. It's just a natural kind of evolution, I guess.
And I think what's really interesting with any of these tools that come out is,
like we're kind of saying earlier, there is this idea of how it's going to be initially used,
and then as you let people play around with them, you realize how people are really going to
sort of make use of them. I like to go with the example of, you know,
You look at what people are using Sora for, for example, and, you know, initially it's this, like, it's like a certain kind of, like, you know, movie style, hey, look at these interesting clips.
And it's very quickly turned, I've noticed, into people using it now for, like, funniest home video style stuff, where it's like, hey, my dad fell asleep and a bear climbed on top and it's, like, sniffing him, like, just dumb things like that.
and especially when you throw it at social media
and you just see what people are going to
what ideas people have with these tools
and what they can really do with them
and how quickly it changes how people are using them.
I think it's just fascinating to watch what people are doing
when you hand them something like this,
something so powerful like this
and what they're going to even try to do with it.
Yeah, no, it's very true.
I know you're on X.com, I still call it Twitter.
I think Twitter is funny now because the gospel fact-checking tool on Twitter now is like,
@Grok, is this true?
You see that all the time on posts.
Oh, I saw a post earlier:
@Grok, is this AI generated?
Yeah, it's kind of funny.
Yeah, yeah.
No, I get it, though. Like, especially with some of, some of what I'm seeing out there, it is
legitimately hard to tell in some cases if something is. Like, obviously, you know, you have
your obvious ones. Yeah, you see a bear riding a motorbike. Yeah, sure, obviously that's not
real. But, you know, there's this, some stuff out there now where it's very close to the
line. And if you don't know what you're looking for, you're not going to spot it.
yeah
this brings up a good point
you definitely do have to worry about
kind of ethics
sometimes in AI it's very important
like I spoke about
the llama.cpp community,
which I work with quite a bit
there's this account
that appeared in the last week
and it's
leaving comments on all sorts
of pull requests,
code review comments,
but it's not a,
it doesn't call itself a bot
account.
It's a human GitHub account,
but it's clearly powered by an AI bot.
And we tag the
person, guy,
bot, we don't know what it is,
and say, hey, you're kind of
polluting our pull requests
with all sorts of crazy comments.
Some of them
sound like they make sense, but 90% don't.
And he was like, oh, sorry, I'm, I'm doing this for my research project.
I'm doing a PhD or whatever.
And we kind of said, yeah, well, we might have to revisit how we do this or whatever.
And he kept responding.
And now we actually think even that response might have been a bot and not a real human.
So, yeah, there is, there is...
There is a lot. Like, we're currently talking with the llama.cpp maintainers about how we handle that kind of thing, because, yeah, an AI bot should at least identify itself as an AI bot, we believe, because of, you know, transparency reasons.
yeah this is
like this is a problem that we're
seeing sort of all over the internet
and I like you know
it's been known for a long time
that a lot of social media accounts
are powered by
you know fake actors
whether it be you know
you want to get into like government stuff
or AI powered stuff
where it's just not an organic account.
on YouTube you're seeing a lot of these
you're seeing a lot of these videos
where it's just
some generic
like I you know I'll see
like generic F1 footage
and then it's an AI voiceover
generically explaining
something that happens in
F1 and there's a lot of
especially as the voices get better and sound
more natural it's
it's harder to tell
if it's actually organic content
or if it's something that is just
being
just generated out and this is
this is like harmless stuff
but you know you get into
the ethics areas where people are
actually using it, like I was mentioning earlier, generating what looks like real CCTV footage,
right? Like that, that gets into a really dangerous area there.
Yeah, yeah. And yeah, you're 100% right. It's definitely something I've noticed in recent
weeks. Yeah, like in the past, it was very obvious to notice. That's definitely an AI bot.
Now it's starting to get unclear, which, yeah, it'll be
interesting to see how we handle that.
But I will say
from the software companies I've worked with,
there's a lot of workshops
and
governance being formed around
that to make sure
we're responsible with our usage
of it.
Yeah, it'll be interesting to see where it goes.
Yeah, no, I
at the end of the day, right, people,
if you give people a tool, they're going to
use it, right? If you give someone a hammer,
someone's going to hit someone in the head with it, right?
Like, people are going to misuse a tool
no matter what you, what guard rails
you try to put into it.
Like, yeah.
Yeah, this is a really tough problem.
I do agree that
AI content should be labeled.
The problem is when people just don't do it.
Yeah, yeah.
No, exactly.
If there's one thing I picked up this week,
it's like, yeah, people should definitely AI-label their stuff.
I think that's really important
from an ethics perspective.
Well, funny enough, this isn't really,
at least in the open source world,
this isn't exactly a new problem, in my opinion.
For example, let's just take any random Linux distribution.
You hope people use Linux OSes for good things,
but I'm sure they're used for horrible things also.
That's the thing when you make tools.
You can't always fully control how people use it,
but that doesn't mean
we should shy away
from our responsibilities either
so it's...
No, I understand that.
I understand
why there's like, you know,
you sometimes hear these talks of
putting the brakes on AI
and dealing with the ethics issues
and then continuing forward.
And I, like,
I understand the arguments against that
because, you know, if,
if, you know, the US says
we're going to, we're going to put the brakes on this,
we're going to discuss ethics,
you still have other places that are going to be doing stuff
and especially when you're dealing with open source tech
like borders don't exist with open source
so these are conversations that need to be had
but you can't
it's very difficult to slow things down
Yeah, no, I think
the only solution is to do these things
in parallel.
maybe you can try and slow certain things down with regulation and I think sometimes that's a good idea
but as you said yeah that's I feel like I should talk to my lawyer now or something but no you're
right in the open source world you can't ever like fully control it so I think the smartest thing
to do is yeah try and deal with the ethics issues etc in parallel as
as best as you possibly can
because it's super important.
Yeah.
I don't know.
I am
apprehensively excited
for the future,
I think,
is the best way to put it.
Yeah,
I am, too.
I think so many fields
are changing,
so many,
there's been a lot of discussion
about how so many fields
are going to be eliminated.
I don't know what,
like, anytime any big new tech comes along,
entire new fields are generated
and I don't think anyone
really knows,
like, you know, 10 years from now, right,
like what is this
going to really enable
that wasn't possible before
what new fields of endeavor
are going to exist
because this tech was
further
you know
further researched
yeah I agree
and I don't
think it's going to be one thing in particular
I think it's going to sprawl out to a wide
array of use cases, if I'm
honest.
Yeah, because
even I brought this up earlier
in the podcast.
Do you call this a podcast or YouTube?
Yeah, yeah. Podcast works.
Oh my God, Discord.
How?
Why?
Why is it so bad today?
Oh my lord!
What the hell?
This is the technical of technical issues episode.
Why the hell?
Oh my god.
why
why is Discord like this?
While you're setting that up, I have a friend, actually, I think it's really interesting,
what I was talking about... me speaking...
I didn't stop the recording, by the way,
so just in case you're going to say
something you don't want on the recording.
no that's fine
Okay, okay. I have a friend.
This is, like, this is
like another example of interesting
use cases. She's learning
Spanish, but like
she doesn't have any Spanish
speaking friends to
practice and speak with.
So she has
actually been literally with a
microphone and speakers
been speaking to an AI
bot to improve her
conversational skills in Spanish because she doesn't know
Spanish. Which I find
very interesting.
The
Duolingo application has been
adopting a lot of this
with their
learning modules as well
it's from what I can tell
very language dependent on whether
or not it's going to be
currently good
and obviously the language that you're coming from,
right, so if your source language
is English I imagine German
would probably work better
than Hindi or Mandarin for example
where the language is very structurally different
but a language like German is
there's a lot of similarities between the languages
so I'm sure that the translation between the two
would be vastly easier
I'm not a I'm not a linguist
in case anyone couldn't tell but that would be my assumption
I actually do really like AI as a translation tool
because let's say you use like the
non-AI
translation. I'm sure they're getting AI
features now, but let's say in the past you
used Google Translate,
it would give you the
literal translation,
but sometimes you wouldn't
understand why the sentence
is formed that way or words
that have a double meaning,
so you can literally
ask the AI
why'd you do it that way? That doesn't make sense to me
and it can explain it
So I think it can be much more powerful for translation from that perspective, you know.
Actually, that's a really good point.
I'd never thought of that.
Yeah.
Because, you know, like, translation tools give you the literal translation
and you're kind of not really sure about the nuance or anything around it.
Yeah, especially in languages where there's nuance that is lost in the translation, right?
Like, you know, languages that have
gendered nouns, for example,
right like that's just not a thing that exists in
English so
you kind of lose out on
on that additional context or
where the way that words are said
is different for males
and things like some languages have like
a distinction between like
the gender of the speaker so
like what like which sorry
I said yeah
like most of the Latin ones
and
if you're doing, like, a direct
translation into English, you just don't have that,
but if you have the ability to ask
why it's translated like that
and sort of any additional nuance
that is lost I just never
really thought of that as a as a use for
I guess that does make a lot of sense though
yeah
hmm
No, I found it useful because I'm learning new languages,
and, like, Google Translate...
Yeah, I feel like I never really fully understand why it's translating in a certain way.
But when I do it, I, yeah, I find it a lot more kind of informative.
And I'm like, oh, that makes sense.
And oh, that's not quite the way you'd say it in English.
Oh, that's, we don't really have a word for that in English.
I'm glad I've learned this new kind of thing that we don't really have the concept for in English.
because a traditional translation tool
wouldn't really do that for me.
Hmm. Hmm. Okay. I will
I'll definitely have to keep that in mind.
That's, yeah, I'll, yeah, I'll definitely keep that in mind.
This is, again, this is, this is one of these things
where it probably wasn't initially thought of
that this would be useful for this.
Like, this would be a useful endeavor.
This is a thing that people would really want.
But as it's further developed and as people use it,
you sort of uncover new use cases that weren't entirely obvious.
Yeah, no, 100%.
This is kind of tangential, but Google Meet was kind of bringing in things like this.
I think now on Google Meet,
if I'm speaking to you
and you are speaking
let's just say French
for example
and I don't know French
you can like
turn on real time subtitles
so there might be
English subtitles under your French
so two people could actually kind of speak to each other
even though they don't speak the same language
I did not know they did that
I think Google
Meet released it, but in a very
limited way, like only one or two languages
for it?
As of three
yeah, as of three months
ago, there's like some articles about it.
Why were more people not
talking about this?
Okay.
That's really powerful
because two people who don't speak the same language
can speak.
Yeah, I know they were doing
they were doing something like that with
the Pixel phones,
which I think like that
that makes more sense as an initial use
because how often are you going to be in a meeting
with someone you don't speak the language of
but if you're in person and you're like doing a live translation
you're like at a restaurant
you're not really sure how to order something for example
like that as a first usage makes a lot more sense
but it makes sense
you would also repurpose that
for their other
communication platforms they have as well
Yeah
And another thing I find very useful,
almost all the video tools,
or Zoom has it and Google has it,
is they have, like,
AI note-taking.
So if you turn it on
at the start of the meeting,
it just, like, sends everyone an email
with a Google Doc
at the end of the meeting:
this is what was discussed,
these are the action items,
these are the people who agreed to take on these action items.
And that's actually scarily accurate.
And you don't have to touch the keyboard.
I wish I had that when I was at university.
I wish I could have had that listening to a lecture.
That would have been so nice.
Yeah, it's pretty cool.
I use the Google Meet version.
It's quite good.
But I know there's other versions for Zoom, et cetera,
also that are pretty much
just as good, you know.
I was, um, I graduated university sort of like just before the AI stuff really kicked off.
So I kind of like missed out on all of that.
But I do know people who are, who are teachers today, who are kind of like,
there's a real struggle with how to, how to integrate this tooling, because kids are going to be using it.
No matter what you try, they are going to use the tool.
So it's a matter of how to...
How you're going to integrate it.
And, like, you know, it's...
This isn't a tool that's going away.
So it's also...
Like, it feels like a disservice if the schools aren't trying to educate people on how to use them.
This is another one of those areas where I really don't have an answer on how you would...
how you
would properly
fit this in
and it seems like
everyone's kind of
trying different things
and nobody really knows
exactly
what's going to work out
in the long run
yeah
I've talked about this topic
with certain people
there's a couple of schools
of thought here
some people think it's great
because it accelerates your learning
RAG is a good example:
you can throw in
a lot of notes, and rather than searching through all the pages or doing manual searches
of the document to find a certain part of your notes, you can, you know, just write a prompt,
how would I do, where is, what is, and you get the response instantly. So people in that school
of thought think it's great, it's just accelerating your learning. I suppose the other,
a kind of more negative school
of thought, and I think both are somewhat
valid,
is it's kind of killing people's creativity
and ability to think on their own,
because
some people are just getting answers
from AI and just using that as is,
and they're not really learning anything.
as a former
enjoyer of SparkNotes,
whenever I had to do a book review
I can say for a fact that
if there is a way that I can
do less work on a book review
or other sort of assignments
that I don't want to be doing, I would have done so
and I'm almost certain people are going to...
Look, if you're a 15-year-old, right,
do you want to write a book review?
No, you're going to find a way to get around it.
And I actually, funnily enough,
I think you should use it to accelerate your work.
But I think there's a little bit of personal responsibility there.
Like, if you are using it to accelerate your work,
at least make sure you're ingesting information and, you know,
you're actually learning because at the end of the day,
there's going to be an exam at the end of the semester.
And if you haven't actually been learning, you're going to fail, right?
So I think some of that,
a lot of that, is personal responsibility as well. You just have to be... But I do think, I'm
actually a firm believer, you should be using it to accelerate
your workload to an extent, because, um, it's like, I remember my maths teacher in school saying,
I'm pretty good at arithmetic, but I remember him saying to students in the past
who would overuse the calculator in maths class.
And he said, I'm 35, right?
So there wasn't really, when we were very young,
there wasn't widespread, very widespread usage of phones.
So he was like, you're not going to have a calculator in your pocket always when you're older.
And I'm like, we kind of do.
You know, so, yeah.
I do agree with, like, the personal accountability.
But at the same time, right, like how much personal accountability is, like, a nine-year-old
going to have, right? And
at that age,
you know,
there's also the argument, like,
if you are going to introduce it in schooling,
at what point do you? Do you save it
for sort of the latter half
of the education? You're like high school time?
Do you bring it in early on?
Do you, like, what effect
is it going to have for someone from the very
start of schooling to have an assistant like this?
Is it going to be beneficial?
Does it, like,
stunt learning
because of how new this is
no one has any long-term knowledge of what's
like what happens if someone goes
throughout their entire schooling
and has access to this tooling
like what like no one
no one knows yet
yeah yeah no one truly knows
I'm sure there'll be plenty of studies
done with it
yeah over the next decade or so
these studies are almost certainly already
being started now.
Yeah, I'm sure.
Yeah, it's a good point.
There probably are some situations,
plenty of situations
where you do have to limit AI usage.
A final exam is definitely one.
Yeah, yeah, sure.
Yeah, but once you move into, you know,
once you're in higher education, right,
like if you're paying for your university,
if you're going through that
and you're trying to skate through and not learn anything.
Look, I don't know what, like, you're just wasting your money at that point.
I don't know, I don't know what you're doing.
I know, you know, the, like, the grading system in university here,
like a passing grade is a P, so there's this sort of mentality P's get degrees.
You have these people who just skate through on the lowest possible grade they can get to
pass, and you're like, you always have that, right? Like it's not a new thing, but, you know, you're
going to see more of that, where there's people who are kind of just there,
because I don't know why they're, I don't know what they're doing there. You're always
going to have the, you know, whenever you have a bell curve, you always have the lower end of
the bell curve, right? Like there's always going to be people that just don't put in the effort.
yeah yeah but at the same time
I think people who are willing to use this tooling effectively and augment their work and sort of speed themselves up.
I think for those people, we are seeing, you know, really powerful benefits come along with it.
I saw an example from the author of curl a while back where he's sort of, he's taking a lot of issue with people submitting garbage security reports to curl, because they run a bug bounty and, you know, when there's a
bug bounty, people are going to try to get the bug bounty.
Yeah.
He recently had someone outline, I think it was like 23 or something, some large number
of security errors in curl.
And this person had sort of augmented their work using AI.
And these were all legitimate issues and that probably would not have been found,
at least within any reasonable amount of time, without going through this route.
Yeah, yeah.
Yeah, I read some of his
posts as well. I didn't read...
it sounds like you read them
in more detail than me.
I remember he called some of the
reports AI slop, like it was...
Yeah, a lot of the reports...
He's not... Like, I
get it, right? Like if you run a bug
bounty, people are
really going to try to do
anything possible to get that money
and I can totally see why that would be
really, really
annoying to deal with
Yeah, yeah, yeah, this goes back to the code review bots. Well, in projects I work on,
when we have a code review bot, it's not actually a big deal, because if we deem it not important,
we can just ignore it. I do feel bad for the curl maintainer, Daniel, because if people are,
you know, pushing a CVE, yeah, he's almost mandated to look at it and treat it
seriously, because it's a security issue.
so yeah
that does suck for him
yeah yeah
it's not just
curl, curl's just a really,
a really loud example, but Mesa's
had a problem with this recently, and
a bunch of projects, you know.
I get why people
want to contribute to open source but you still
need to have an engineering background
like oh my god
Discord disconnected again
Jesus Christ
Oh my lord
I
I need to check what this is
man
this is so annoying
I
do apologize
for all the technical issues today
like
oh that's fine
For all we know, my end could be doing it.
I don't even know.
No, my, my, my Discord's actually just, my entire Discord client is crashing.
So, yeah, I'll just do a reboot after this, and it's never going to happen again.
This is going to be one of those, one of those issues where it's just ephemeral, and I don't, no, nothing identifies what the problem is.
Yeah.
What did you last hear me say?
The last thing I remember is we were talking about
code.
Oh, I was saying projects like Mesa have issues with this
and a bunch of other projects have issue with this as well.
Like, at the end of the day, right,
if you're going to be contributing to a project,
you do still need an engineering background
to really understand the code.
and it seems like a lot of people
and I get why these tools are really powerful
a lot of people are sort of
thinking they can just ignore that step
and thinking they can just rely on the tool
to do everything for them
where it's not just an assistant
it's a replacement
and there's also a lot of companies
who sort of feel this way as well
and you know you're seeing
hey how much
do we need all these people
can we get rid of them
replace them entirely with these tools
and I really don't
know in the long term what that's
going to look like if that's going to work
a lot of people are banking on it working though
yeah
Yeah, that's why...
we brought this up lots of times
already. That's why I would call
AI an accelerator, because
for most use cases
of AI today you do
kind of need a human to
say, well, this is
AI crap, or... you know, that's, that's pessimistic.
There's a lot of cases where 90% of what the AI does is really good, so, yeah, no,
there's very few cases where it outright replaces humans.
It's more an accelerator tool, yeah.
Yeah.
Um, is there anything else you wanted to touch on, or is that, like, we've kind of, like,
just gone, again, down some random tangents?
No, nothing springs to mind.
As I said, my main goal here was actually to spread awareness of Docker model runner
because I think it's a really cool tool that makes local AI easier to use
and abstracts away the complexity of engines like llama.cpp
and gives you a way to push and pull models around the place
so you can deploy your AI models on many different kinds of hardware.
And that was my take.
That was the main reason I came on because I'm really trying to grow that community.
So if people are interested in, like, learning about AI or enhancing AI's capabilities, please contribute, star it, fork it.
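If you want a feel for that push-and-pull workflow before opening the repo, it looks roughly like this. The commands are driven from Python here only to keep all the examples in one language, the model tag is a placeholder, and it's worth double-checking the current docs since the CLI is still evolving.

import subprocess

MODEL = "ai/smollm2"  # placeholder tag, not something named in the episode

# Pull the model from an OCI registry, much like pulling an image.
subprocess.run(["docker", "model", "pull", MODEL], check=True)

# See what's available locally.
subprocess.run(["docker", "model", "list"], check=True)

# Ask the model a one-off question (omit the prompt for an interactive chat).
subprocess.run(["docker", "model", "run", MODEL, "Say hello in one sentence."], check=True)

# There is also a push subcommand for publishing your own packaged models,
# which is the "push and pull them around the place" part described above.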
Yeah, so if people do want to get involved, where can they go to?
Is there some sort of place they can discuss stuff?
Is there just the GitHub?
What is there?
The main central link,
and we're cleaning up the README
as I speak now
to make it super clear,
the main place is github.com
slash docker slash model-runner.
also another
place I would recommend people go
you'll see in that README
there is a link
to a Slack,
if you just want to casually chat
with people like me
to learn more
and get yourself comfortable
So they're kind of the places to go, really.
So what are you sort of, what are you sort of hoping to get out of, I guess, building a community?
Like, obviously, you know, having all people contribute and understand what the code is, but, like, what are some general goals you have?
Um, first of all, I don't think there's kind of a tool...
First of all, I don't think a tool like this really exists,
and I think it's quite useful.
So I'm just trying to spread awareness of that,
because anyone I hear that uses Docker model runner,
they're like, whoa, this stuff is really cool.
How come I never knew about it?
I didn't even know I could run models locally
and push and pull them around the place.
So there's that side of it.
Obviously, I work for Docker,
so it's increased usage of Docker tooling,
which is the obvious thing.
And the other thing is, yeah, I think what truly makes, you know, great open source projects is contributors coming in with a new perspective and enhancing a project for new use cases.
And I really want to see that because, yeah, I want to see what people find it useful for.
and I want it to grow and become, you know, a great open source project.
Fair enough.
Yeah.
Is there anything you would like to direct people to besides that?
Anything you want to mention?
Anything you want to highlight?
No, not really.
Just if people are interested in AI and they want to go down a lower level again,
You brought up Mesa; just for people who are curious about AI,
I kind of see llama.cpp as the Mesa of AI, as I kind of said before.
So if you're interested in Docker Model Runner itself,
it's written in all high-level Golang,
so it's really kind of contributor-friendly.
But if people are interested in really getting down to the nuts and bolts,
I recommend looking at the llama.cpp community also, because that's kind of where
that stuff happens.
And I'm also pretty active there.
So, yeah, we don't really have a Slack
or anything like that in the llama.cpp community,
but we're pretty active with our conversations
via GitHub issues and pull requests and discussions
and things like this.
Okay, fair enough.
Apologies for all the technical issues.
I don't know what's going on with Discord today.
Um, yeah, uh,
if you want to come back on
and talk about more stuff, hopefully, uh, Discord's either working or we just,
you know, use Google Meet like your original link had anyway.
No, I'd love to, and yeah, feel free to reach out. I've watched a couple of your
YouTube videos, uh, already. I watched a couple of you and Matt Miller
from the Fedora community, and, yeah, I was really happy to
be on this podcast, actually, because it's my favorite open source one.
Oh, that's cool.
No, Matt was definitely a lot of fun to talk to.
Yeah, I, I, there's a lot of really cool people in the open source space that I feel like
people don't, you know, you might see blog posts from, but you don't really hear them
speak about what they're working on.
And I don't know, I like to, I like to bring people on to actually talk about what they're
involved in because, you know, a lot of people have a lot of interesting things to say.
Yeah, I get it.
Well, thanks for having me.
Yeah, no, it's a pleasure.
Nothing else you want to mention before I do my outro?
No, that's it.
Okay, cool.
You guys can't see me, but whatever.
It is what it is.
My main channel is Brodie Robertson.
I do Linux videos there six-ish days a week.
Sometimes I stream there as well.
I've got a gaming channel, Brodie on Games.
I stream twice a week there.
Right now we're playing through Yakuza 6 and Hollow Knight: Silksong.
If you're watching the video version of this,
you can find the audio version on basically every podcast platform.
Search Tech Over Tea, and you will find it.
If you'd like to find the video version, it is on YouTube, Tech Over Tea.
I'll give you the final word.
How do you want to sign us off?
I never tell people they're doing this, so it's always fun to see what they do.
Exactly.
I'm not a runner.
Perfect.
I don't know.
