Programming Throwdown - 177: Vector Databases
Episode Date: November 4, 2024

Intro topic: Buying a Car

News/Links:
- Cognitive Load is what Matters
  https://github.com/zakirullin/cognitive-load
- Diffusion models are Real-Time Game Engines
  https://gamengen.github.io/
- Your Comp...any Needs Junior Devs
  https://softwaredoug.com/blog/2024/09/07/your-team-needs-juniors
- Seamless Streaming / Fish Speech / LLaMA Omni
  Seamless: https://huggingface.co/facebook/seamless-streaming
  Fish: https://github.com/fishaudio/fish-speech
  LLaMA Omni: https://github.com/ictnlp/LLaMA-Omni

Book of the Show
- Patrick: Thought Emporium Youtube
  https://youtu.be/8X1_HEJk2Hw?si=T8EaHul-QMahyUvQ
- Jason: Novel Minds
  https://www.novelminds.ai/

Patreon Plug
https://www.patreon.com/programmingthrowdown?ty=h

Tool of the Show
- Patrick: Escape Simulator
  https://pinestudio.com/games/escape-simulator/
- Jason: Cursor IDE
  https://www.cursor.com/

Topic: Vector Databases (~54 min)
- How computers represent data traditionally
  - ASCII values
  - RGB values
- How traditional compression works
  - Huffman encoding (tree structure)
  - Lossy example: Fourier Transform & store coefficients
- How embeddings are computed
  - Pairwise (contrastive) methods
  - Forward models (self-supervised)
- Similarity metrics
- Approximate Nearest Neighbors (ANN)
- Sub-Linear ANN
  - Clustering
  - Space Partitioning (e.g. K-D Trees)
- What a vector database does
  - Perform nearest-neighbors with many different similarity metrics
  - Store the vectors and the data structures to support sub-linear ANN
  - Handle updates, deletes, rebalancing/reclustering, backups/restores
- Examples
  - pgvector: a vector-database plugin for postgres
  - Weaviate, Pinecone, Milvus
Transcript
Programming Throwdown, episode 177: Vector Databases. Take it away, Patrick.
Welcome, everyone, to another episode. We have a great topic today.
I'm excited to learn.
It's legitimately something that I hear all about, but I don't know too much about.
So we're going to have Teacher Jason joining us here in a minute.
That's right.
We had Teacher Patrick for compilers and interpreters.
And then we have now Teacher Jason for vector DBs.
I feel like we should have called ourselves professors.
We missed opportunity there.
Oh, that's right. Is there anything higher?
Uh, distinguished, uh, emeritus professor. Emeritus, emeriti.
Okay, what's the highest-level professor?
Uh, you know, can you...
Sole emperor of the knowledge universe.
Oh, but what if it thinks it's the smartest? It's going to tell you a lie, because it doesn't want you to be superior. Okay, anyway. No, sorry.
Oh, you know, I have this plan now. When I get these calls from people telling me that my taxes are overdue or I should buy car insurance or whatever, what I've started doing is asking it questions about particle physics. And when it gives me really good answers, I'm like, aha, this is ChatGPT. This is not even a real person.
A particle physicist could work in a call center.
Okay.
Yeah.
I mean, I had to go looking for a car, and they legitimately want your phone number or whatever. So I just have one of those, I guess it's a voice-over-IP number, but it doesn't actually ring through. And one of the places I went has called me twice a day, every day, for going on two weeks now, and I've picked up zero of the calls. They really want to sell me a car, but it turns me off. Why are you pestering me this much? I feel like, just text me. I will...
So, you know, my dad worked for a few different car dealerships over the span of about 30 years, so I learned the trick to buying a car, and I'll share it with everybody. If you're a Programming Throwdown listener, it's free; you don't have to be a patron, although we really appreciate it. If you want to buy a car, here's what you do. You show up and test drive the car in person, so they know you're a real person and not another dealer trying to buy a car and then resell it. You're there as an individual buyer, and you prove it to them. Then you leave, and you don't come back until you have everything finalized. Everything is over email or phone: you negotiate the price, you compare with other dealers. You should expect this to take about a month, but if you do that, you will save anywhere from 10 to 30 percent of the price.
Wow. The last time I bought my car, I feel like it was pretty different than now, in that the inventories are much worse now. But yeah, in general I did something similar to what you were saying the last time I bought a car, and it was super nice. I just showed up and all the paperwork was done. It still took close to an hour, which is ridiculous, but pretty much it was all ready to go. So yeah, I agree. I'm going to attempt to do this one again. I haven't gotten to the negotiating stage yet. But after showing up, I guess they're trying to do the CAPTCHA with me, so they're trying to call me back and make sure the phone number is real.
Yeah, I haven't bought a car since 2020, so I've missed this shortage. Is the shortage still a thing, or is it over?
I don't think it's a thing in the same way, but the inventory on desirable stuff is still not great, at least at the places I was at. I think it varies dealer to dealer, or brand to brand I guess, how much they've recovered. So yeah, I think it's still on, and the pricing is tighter. But I will still die on the hill that we are stupid for having gotten ourselves into this. I guess this is US-centric; people tell me in other countries it's not like this. But I have no idea why the only place remaining in life where you have to negotiate is a car dealership.
I'm going to bring in my manager.
I'm going to bring in the finance.
We're going to change everything on you.
It's a total scammy thing the whole way through.
Oh, yeah.
That's another thing.
So the thing about finance: you ever play these mobile games where there's like 24 different currencies? I play FIFA on mobile, and there's literally five to ten different currencies at any given time, and some of them get phased out and replaced with other currencies and all that. But there's one currency that you can pay real money to get, and that's ultimately the only currency that really matters, because that's the one that's tied to USD.
Right.
And so if you start going down the finance route early, you end up in this boat where they're like, oh, your monthly payments are the same. But really they added a year of payments; each month is just the same. There are too many variables, and it's pretty easy to get tricked. So that's another thing my dad told me: always negotiate the absolute dollar amount, and then come in at the very end and do financing, after you're ready to pay the amount.
So this has turned into a little bit of advice from Jason and a lot of whining from Patrick. But yeah, buying a car. I will say, from what I understand, Tesla does it a little differently. I look forward to some future, for my children or whatever, where you literally just figure it out online, maybe you show up to test drive once, and then you just buy. And it doesn't matter if you buy from the one on the west side of town or the east side of town; it's all the same, like every other thing we go and do.
Yeah, exactly. Can't come soon enough.
All right, and I'm sorry to all the car dealership people that I'm probably very frustrating to, or putting out of business. But, you know...
Yeah, if you're a car dealer, message us. We'll take your hate and read it on the air for everyone to hear.
Oh, no, no. All right, we're jumping into news.
All right. So my first article talking about programming things here,
cognitive load is what matters. So this is actually a GitHub article, which is a kind of interesting way to do it.
So the person can update it over time.
And this is something that I picked,
I will say a little selfishly
because I say this all the time at work
and there'll be a link in the show notes.
But just talking about in development,
that one of the most, in software development,
one of the most overlooked things
is how much cognitive load you're putting on the engineers and this can come in a couple forms and
they kind of call them out like sometimes that could be the business surrounding the code um
i work at a place that does like lots of software engineering so it's it's a pretty like respected
part of the business but a lot of businesses you may only have one, two, three,
four software engineers building something. And so the bulk of the business isn't really the
development or the software. And so there's all sorts of like, hey, you really need to work with
the accounting team and this team and that team and who talks to whom and this is using this
database and this person is using this other one. All of that has to be loaded into your head.
The other thing I always bring up that people forget is you go to college and you learn
all this stuff about compilers and programming and you show up on the job and they give you
like an unconfigured Linux box.
And maybe you've only ever used Windows or you've only ever used Mac OS.
And now it's like, hey, here's this Linux box.
Congratulations.
Like configure it, install your packages.
Oh, you need to move your PC to a different location and plug in all the cables. But wait a minute, is it really a requirement listed for the job that you know bash or zsh or any of this shell scripting? So there's the amount of cognitive load that goes into programming. And then I think the bulk of the article is about when you're selecting paradigms and abstractions: where to divide stuff, how much to copy and paste. There's a balance to be reached. If you make stuff too small and modular, or too monolithic, both introduce a lot of cognitive overload. So there's a balance there. But actively thinking about how we build this in a way that makes it easy to reason about, so that things that are similar are kept together, and not adding abstractions for abstraction's sake, is an important thing.
The one that always gets me: we program in C++, and everybody goes, aha, look, I wrote the ternary operator. That's where you write a conditional expression like in an if statement, say a greater than b, then a question mark, then the first thing is what happens if it's true, then a colon, and then the second thing. Okay, just write an if/else statement. I've been programming for a long time. Yes, I can parse it, but I don't parse it as fast, and I always have to double-check: wait a minute, true is the first one, false is the second one. And secondly, when we bring on a new person (maybe you say, hey, we're all experienced people, and this comes up in a later news article), maybe they aren't used to that. This matters for some languages more than others, but using the subset of the language that is most comprehensible and most widely used gives you benefits, in that it's very easy to bring new people on.
It's very easy to teach.
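To make the ternary-versus-if/else point concrete, here is a minimal sketch of the two forms side by side. The function names are made up for illustration and are not from the episode or the linked article, and the `constexpr` if/else version assumes C++14 or later.

```cpp
// Two equivalent ways to pick the larger of two ints.
// Function names are hypothetical, chosen only for this illustration.

// Conditional (ternary) operator: condition ? value_if_true : value_if_false
constexpr int larger_ternary(int a, int b) {
    return a > b ? a : b;
}

// The same logic written as an explicit if/else.
constexpr int larger_if_else(int a, int b) {
    if (a > b) {
        return a;
    } else {
        return b;
    }
}

// Compile-time check that both forms agree on a sample input.
static_assert(larger_ternary(3, 5) == larger_if_else(3, 5),
              "both forms should return the same value");
```

Both compile to the same behavior; the difference is only in how quickly a reader who is new to the code base can parse them.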
And we were working with a code base where they put in a ton of macros, and all of the variable names were single-character names, all this stuff. They may be efficient in programming, but you can't work with that code base. All of this is about thinking about cognitive load and really programming in a way that reduces it.
Yeah, I've seen so much bad behavior with macros and metaprogramming in C++. It just makes it so difficult. It's like, oh, I'm using the double-hashtag thing (##) to concatenate tokens in this call to a preprocessor macro to create a class, and I'm doing that call like 30 times, so none of these classes actually exist anywhere. It messes up the IDE and everything. At some point it's like, look, if you were to copy-paste this class 30 times... well, first of all, there should be a better way of doing it, but let's say you just couldn't figure it out. Just copy-paste the class 30 times. At least that way someone can see the point and the pattern and all that. If you hide it behind a preprocessor macro, it just becomes a tar pit that people get stuck in.
And I think there are other languages where the same thing happens, or even in the way of doing object modeling. I've seen stuff in Java with dependency injection and reflection at runtime, and it makes it really hard to answer: hey, I have a call site here, what code is being executed? How many hoops do you have to jump through, if you don't know off the top of your head, to find the actual line of code invoked?
And how confident are you
that you found the actual line
that is being invoked?
And if that is not
a trivial amount of time,
you should be really,
really thoughtful about
trying to reduce that
as much as you can.
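The token-pasting macro pattern mentioned a moment ago can be sketched roughly like this. The macro and class names (`DEFINE_HANDLER`, `OpenHandler`, and so on) are hypothetical, invented for this example; the point is only that the generated class names never appear literally in the source, so a text search or an IDE "go to definition" may not find them.

```cpp
// Token pasting: Name##Handler glues the macro argument onto "Handler".
// DEFINE_HANDLER and the handler names are hypothetical examples.
#define DEFINE_HANDLER(Name, Code)                      \
    struct Name##Handler {                              \
        static constexpr int code() { return Code; }    \
    };

DEFINE_HANDLER(Open, 1)   // defines struct OpenHandler
DEFINE_HANDLER(Close, 2)  // defines struct CloseHandler

// The written-out equivalent: more typing, but the name "SaveHandler"
// is visible in the source and easy to search for.
struct SaveHandler {
    static constexpr int code() { return 3; }
};

// The macro-generated types behave like ordinary classes.
static_assert(OpenHandler::code() == 1, "macro-generated class works");
static_assert(SaveHandler::code() == 3, "hand-written class works");
```

Searching the source for `CloseHandler` finds only its uses, not its definition, which is one small instance of the "what code is actually being executed here?" problem.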
Yep.
Yep.
Yeah, totally.
Yeah, dependency injection in Java. I forgot what the tool was called, maybe it was part of the Spring Framework, but you would pass in a config, and it would create classes from this config and feed them into your main function. So it's like, oh, this object is already filled in. What the heck? How do I change it? Oh, it's actually in an XML file. It just became really hard to use.
Moving on to "Diffusion models are real-time game engines." I don't necessarily agree with the premise of this, but the thing about this that blew my mind is the video. Basically, what people did is they took these image generators, text-to-image, that whole vision-transformer kind of technology, and they used it to say: if I have this image, a screenshot of a video game (in this case they used Doom, because it's open source and people run it on everything), so given this screenshot of a Doom game,
or maybe even a handful of screenshots
of the last second of this Doom game,
and given a command like shoot a gun or go forward or whatever,
then generate the next screenshot of the game.
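As a rough sketch of the loop just described (recent frames plus a player command in, next frame out), the structure might look like the following. Everything here is an assumption for illustration: `Frame`, `Action`, `predict_next_frame`, and `step` are hypothetical names, and the "model" is a deterministic stub standing in for a real diffusion model, which this code does not implement.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <utility>
#include <vector>

// Flattened pixel buffer; the layout is hypothetical.
using Frame = std::vector<std::uint8_t>;

enum class Action { Forward, TurnLeft, TurnRight, Fire };

// Stand-in for the learned model. A real system would run a diffusion
// model here; this stub just derives a deterministic frame from the most
// recent frame and the action so that the loop can run.
Frame predict_next_frame(const std::deque<Frame>& recent, Action action) {
    assert(!recent.empty());
    Frame next = recent.back();
    for (auto& px : next) {
        px = static_cast<std::uint8_t>(px + 1 + static_cast<int>(action));
    }
    return next;
}

// Advance the "game" by one step: predict from the recent frames plus the
// player's action, then slide the context window forward.
void step(std::deque<Frame>& recent, Action action, std::size_t max_context) {
    Frame next = predict_next_frame(recent, action);
    recent.push_back(std::move(next));
    while (recent.size() > max_context) {
        recent.pop_front();  // older frames fall out of the model's view
    }
}
```

Note that in this sketch the only state carried between steps is the window of recent frames. There is no separate map, inventory, or physics state, so anything that scrolls out of the window is simply gone as far as the predictor is concerned.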
And so they're basically trying to encode the entire Doom game, the art assets, the engine, everything, into a diffusion model. And then they proceed to play this game, and it just gets kind of weird. Points of discontinuity are really difficult for these models. So, for example, when you fire the weapon, sometimes it doesn't fire: you get the fire animation, but the world isn't affected, and other times it is. Sometimes you walk and a door is just flickering in and out of existence. The whole thing is wild. We're really not doing it justice here over audio, but go to the show notes and watch the video. I've seen a lot of weird things with Doom, like they passed it through DeepDream and you get this really trippy view of it. But this is really taking it to the next level. The laws of the game are not first-order anymore. Sometimes the world works the way you expect, sometimes it doesn't, and it's just kind of insane to watch.
There's a couple of games that do something this reminds me of. I've actually seen this one before, and like you're saying, it's kind of weird, because it looks like you're playing the game 90 percent of the time, but then occasionally something different happens. You're looking one way, you turn around and face the other way, and when you turn back, the world is completely different. You're not in the same place, or the world has shifted, because it's generative; it's doing the whole thing on the fly. So you almost convince yourself it's right. And there's a game, I think it's Superliminal, I don't know if you've played it, that has stuff like this. You can go through a door, and when you come back through the door, you're not in the same place. It's sort of, I don't know, call it non-Euclidean or whatever. It plays with perspective, so if something changes size in your perspective, it changes size in real life as well, in the game as well. There are all these things you can do where it subverts your expectation of a game modeling reality. And it doesn't mean it's not fun to play; it can be very fun. In these cases it probably wouldn't be very fun, but it is just this weird twist to your brain.
Yeah, totally. I'm pretty sure you can't get much beyond the first or second level in this. I mean, this is an academic exercise. But yeah, I just think there's something kind of wild about it. It's definitely something I've never seen before, where someone tries to imitate an entire game engine with a neural network and then sees what happens.
People with too much time on their hands? No, I'm just kidding.
Well, you know, people with too much money on their hands, right?
Because of the AI hype cycle.
Oh, there we go.
Yeah, I mean, maybe that should be episode 178.
So we'll get some sent our way.
We'll try some of this stuff.
Yeah, that's right.
Yeah, we're currently raising. We have a new model.
It's called CLAWED, C-L-A-W-E-D. It absolutely does not just call Claude under the hood. Please give us 50 million dollars.
Oh, yeah, I've seen some of these accusations going around about people just calling other models and saying... yeah, okay. Anyway, we're moving on. This is the AI gossip, I guess.
Yeah, that's right.
My next article is kind of related to the previous one:
"Your Company Needs Junior Developers." And again, I guess I'm a bit biased; this is something I push for all the time at work. It's a strongly held personal belief. The idea here is that at a lot of places, I think, you can fall into the trap of, hey, we want to hire a senior person. The inside joke that goes around is that you read resumes and it's like, hey, you need 10 years of experience with a tool that came out two years ago. Everybody needs to hire the utmost expert in the field. And in practice that's not really true most of the time. That's not to say it's never a good idea to hire senior people, but I feel by default most folks have this belief that, hey, we're high velocity, and the way to
keep this up is just keep adding people at the same level of the team or above the level of the
team to bring them up. And in some cases, that may be true. But I have found that often,
there's a number of first and second order effects that come from hiring junior developers
you know and i'm not even talking about like budget you know money reasons i'll leave all
that aside hiring junior people and being able to train them like helps them sort of uh they they're
more uh plastic towards you know how stuff is being done and learning if, you know, they're inquisitive.
And it's not to say that they won't express their own opinions or bring changes. They often will
and sort of say, hey, I learned it this way or why not this? And they ask a lot of questions
and through those questions, I think can come learning. But they don't come in with as many
preconceived notions of this is the only way to do things, or every time I've done it is this way,
or hey, we first need to reconform everything you've done to my way of doing it. And then
we'll start making progress. But also putting the, you know, more senior developers into the
mode of teaching and by teaching, they're learning, they're having to synthesize concepts and simplify
stuff, like we were talking about before cognitive burden if you're the only one coding maybe it doesn't matter so much because the context is always loaded in your head but the
minute you have to share that context with someone else who's also making changes right now you start
to like help each other and improve and um also i think making sure that the expectation is known
to senior engineers that there will be junior engineers and that they
will bear some of the responsibility for helping to onboard teach and you know show them not that
they have to become managers but that everyone on the team has a responsibility to help you know
those who are trying to learn or asking questions and i think that again these first and second
order that you have someone who is contributing who can grow and can like you
know sort of become the the team member that you need also gives like a sense of a narrative arc
to the team right like that we were here and now we're there and like look at the growth
and so for just a host of reasons i think it's a really important thing to be hiring junior
engineers, to bring them on and have them grow and learn. And then also, sometimes projects go through phases, and the phase of your project may be: hey, it's not insane growth right now, and we're not able to hire a ton of people. Because of that, some senior engineers may need to or want to move on to other companies or other projects, because there isn't a lot left for them to grow into, but there's still a lot to be done. This is when junior engineers can come in, and for them that work is super suitable, whereas for a senior engineer it may be difficult to justify the continued, sort of, you know...
Patrick and I are really invested in growing engineers. I mean, the whole point of this show when we started it so many years ago was that, and the future of the field.
I think that engineers have been hit with some really big whammies.
The first thing they've been hit by is just exploding college tuition costs, right?
Then you get hit with the pandemic and all of your mentors being, you know, not being around.
And now there's all the layoffs, and companies saying... I was in a meeting, an Austin engineering leadership meeting, and this person was saying, hey, we just need to get rid of 80% of our company like Twitter did, and just keep the 20% who are super productive.
And the thing is, that will work in the short term for your company.
But there's a tragedy of the commons there. It's like when you buy roses from H-E-B, or from your grocery store, and they look great, but they don't have roots. And so you know that these roses will not live very long.
They're already dead.
Yeah, that's right. The cat is dead in the box. Without roots, it's just a temporary facade. And so I do worry a lot about junior devs just not having a safe place to land.
What do you think is going to happen, Patrick?
Do you think that if people keep focusing on senior devs,
do you think that people are going to get dissatisfied?
There's going to be a generational gap or uh what
i mean leaving aside uh fundamental pivotal shifts from you know ai or whatever um i think it's a
slightly different topic i think this is where uh and and this will get somewhat i guess a little
maybe controversial but i think this is one of the
benefits of having capitalist markets is like i think companies that will hire junior devs and
invest in training them will have a market advantage and so you know two companies participating in the
same so that company that you mentioned just as an example like i want to lay off all these junior
people they're not really pulling their own weight right now. We overhired our fault, whatever.
Those people are going to go somewhere. And if not, they might self-organize and kind of learn
their own thing and build up a competitor or do something better because they kind of see what
wasn't working at the previous company. And so this is where I'm a believer that, you know,
having the competition, companies that do this will end up winning out over companies that don't. Companies
are willing to teach and hire. Again, like I said, I think they have better practices. And that's not
to say in some extreme example that, like you said, somebody could do this and in the short term,
it works amazingly. But then, and hopefully it doesn't happen, what if you have a team of three super high-functioning people, working at 110% capacity (that doesn't make sense, I know, but they're working at 100% capacity, super optimal), and you think you're a genius. And then one of them, I mean, I work with people who are getting, not that old, but up in age, has a heart attack or a family emergency, and they literally stop working for six months. There's no redundancy, there's no backup. Is your company going to close up shop? I guess the equivalent in finance is leverage. People use all this leverage, and it works great until it doesn't. I guess it's equivalent to what you're saying about the roses already being cut. You seem like a real genius.
You're making all this money on leverage.
But as soon as the tide turns even a little bit, you crash really fast, because everything is magnified.
Right, right.
And you might say, well, I will let other companies hire and skill up junior engineers, and then I will just hire those people when they become senior engineers. But I think the problem with that is you need to transfer knowledge within your company. It's not as fungible as people would hope. So I don't think that strategy will work either.
Yeah, I mean, you still have to teach them; most companies still have quite specific things. So even if you say, I will let other people train them and then we hire them, you don't have a culture of training people, because you obviously don't do it. You hire them, and then you're going to have difficulty getting them to adapt, and so you're going to go through a lot more pain and suffering.
Yep. Yeah, exactly.
All right, on to mine. Mine are a couple of libraries that let you do real-time text-to-speech, which is now becoming basically an open source commodity, which is kind of amazing. And I'll add a third one that just came out this morning. You can actually have a conversation with an open source AI in real time, and that just blows my mind. You can talk to it, it'll talk back, you can interrupt it and say, hey, I want to talk about something else. All these things that we saw in that OpenAI demo maybe six months or a year ago are now becoming a commodity. And here's a
few examples of libraries that do this um but yeah, I think the progress is just absolutely astounding.
I mean, it's actually really hard to keep up with the technology.
It's just moving so fast.
But this is a really big milestone, and a few different libraries have kind of reached
it in more or less the same time.
With the speed of some of this and like you were joking
about earlier but i kind of find myself thinking about it sometimes companies there's a lot of
untapped potential and sort of developing companies around some of these things and just
you know replacing uh stuff within bounds but i think defining those bounds is really important
like you know making sure that there are some i don't call them guardrails or whatever around how some of this stuff is done
but also it's got to be somewhat difficult to you know say you're going to build and do an
integration with one of these you know models that came out and then to just like three or
four months in potentially a completely different way of doing or a different architecture just moves
the goal posts and then you got to decide like do we ship on this old thing or do we like completely redo
and move to the new thing and then you end up in this constant cycle of like uh you know oh man i
just got completely you know blown away by whatever the next release from whomever is
Yeah, right. A good example of this is the text-to-image stuff, which wasn't really that commercially practical because the images were just not that good. They were fun to look at, but you'd obviously know they were computer generated. And within the span of 12, maybe 18 months, it went from that to... okay, I take that back, experts can tell, but you and I can't tell. If you see a profile photo of a person, we can't tell if that's a real person or not, and that is just wild. If your product was based on one thing... it just opens up a ton of product opportunities, to your point.
Yeah, there was all that stuff like, oh, they're not good at text, earrings won't be symmetrical, too many fingers. And you're right, they were kind of funny to look at and they looked plausible. And now I see pictures posted on some sites, like, oh, look at this AI trash, and if I really stare at it, it looks off, but there's a lot of commercial photography that's airbrushed or whatever that also can look off or unnatural. So like you said, maybe some experts are saying these aren't good, but they're actually very convincing a lot of the time. It's really only the fact that the situation is so improbable, like a cell phone snapshot of an elephant parachuting into my backyard. Okay, that probably didn't actually happen, so it must be fake. But it's not actually something in the scene that tells you it's not realistic.
Right, right. And the thing, too, is, I mean,
if you're in the mindset of i wonder if this is real or not because someone told me it's fake
well then you're gonna have a very discerning eye right but if you're just on the internet browsing
you're not you're not you know you're not in that type two mode of thinking,
you're just consuming content. And, and I think you will just, I think many people will just look
at fake images and just think they're real. Uh, like all these stock images, you go to like some
random startup that just paid like someone on Fiverr to build a website. Right. And there's
like stock images of people just
having fun around a computer i bet if if you replace those with fake images most people wouldn't
know yeah and there are some ethics problems i will say as well looking at everything with
the discerning eye causes its own problem because there's plenty of actual images taken that look
implausible or seem unrealistic as well and they're totally
legit for a variety of things so you could enter conspiratorial whatever like yeah i saw a video
of this bad thing happening but it's it's just fake like it doesn't look real there's no way
that really happened oh interesting yeah i mean that's a that's a different direction i hadn't
thought about is if there's a real event, like we're at 9/12 today.
Yesterday was 9/11.
So like, yeah, I mean, if I'm reading Twitter and I see like some big terrorist attack, my first reaction might be like, that's a fake picture or something.
You know, it's kind of wild.
All right.
So moving on to book of the show. Patrick, what is your book of the show?
Book of the show is not a book. Cheater. Warning, tool of the show is not a tool either. So
yeah, we got to rename these sections, man. All right. I thought I must have talked about this
before, but I Googled through the show notes and I didn't see it.
So if I have, I apologize.
I have never heard of this, so I think you're good.
So this is one of those, I know it's like fairly popular now,
but I run across fairly popular,
like million subscriber YouTube channels all the time
that I've never heard about before.
It's just a million isn't what it used to be or something.
I sound like an old man now.
All right, anyways, so Thought Emporium is a guy who does i guess like just call like hobbyist science research my hero like i wish i wish i could do this stuff like i wish i could
make money just poking around like random stuff in my lab and like i want to say close to probably
half of the stuff he does like i have at one time or another done played with or sort of so it's very fascinating there's a bunch of uh sort of one-off projects um and
long-term projects that he's working on including topics like genetic engineering um he was recently
trying to grow neurons to play doom so how do you like grow neurons in a petri dish hook them up to electrodes hook that
up to a computer and make them play doom and not your garage that's not exactly right because this
garage is kind of fancy but not in a multi-million dollar biomedical research lab either the one i
linked is just him and his friends like sometimes come up with memes i think they released one
yesterday
it was like you know everybody always jokes what did you take this photo with a potato
and so he actually built cameras out of potatoes um not like a potato hooked up to a camera like
the whole camera is a potato um and so like the potato is a pinhole uh and you use the flesh to sort of get a negative
image okay anyways the one i linked to in the show notes you can check it out is a single use
emergency thermite hot dog cooker so what would you need it to do if you needed to cook exactly
one hot dog like in an emergency situation in a single use container and it's just like him figuring out
like in a moderately serious engineering like how would i what is the best way to do this like how
would i use heat to cook the hotdog should i cook it like by itself should i cook it in a liquid
you know is this actually going to work how safe is this you know could i make the could you make
a product
out of it and so if you've never checked them out uh definitely do there was a lot of biohacking
initially like you know i'm going to inject the genetically modifying materials into myself and
do crazy stuff um nowadays a little less but another one that my kids really liked was, is mummy yummy? So let's go through the full process of making like a mummified chicken from the store, grocery
store, like legit.
We're going to go into reading hieroglyphics and making our own hieroglyphic statements.
Do the full mummification process with authentic old recipes and then taste various parts of the mummification
fluids and things and determine if mummy is yummy is that healthy i mean or is that sanitary so
nowadays it would be formaldehyde and you definitely can't eat formaldehyde but it turns out in ancient egypt
they didn't have formaldehyde so you actually could use most of the stuff so i won't spoil but if you think it is intriguing to find out if mummy is yummy you're in the right
place okay yeah this is awesome i think i think uh this is one of these things where you know
40 hours of my life later i'm like how did i get on this youtube channel uh mine's the opposite so
i watch this and then i find myself on ebay like how could
i replicate what this guy is doing like i have so many questions yeah the youtube to amazon pipeline
is strong um my book of the show is a project that i started with a friend of mine um called
novel minds and the idea was there's all these public domain books.
A lot of them are required reading for kids.
Even the ones that aren't, a lot of them are really good,
like War of the Worlds, like classics, right?
So I thought, could I use AI to turn them into sort of,
I think the word is graphic novels.
I mean, not, not literally a
comic book, but just a picture book. So there's just a lot of pictures, you know, the cover is
totally redone. And so we did this. So, um, by the time folks listen to this show, there'll be 30
books on this site. Uh, I think at the time of the recording, there's maybe seven or so. But you can basically grab them all for free.
If you're on Apple, it will just open in Apple Books naturally.
If you're on Android, you'll have to get an EPUB reader, but there's plenty of them for free.
And you can read all these classic books with tons of pictures.
We tried to do roughly a picture every like 400 words or so.
And there's a bunch of technology. I actually gave a talk on how this works that's recorded.
And I posted the talk on LinkedIn if folks want to watch that for more technical depth.
But if you have kids who are in school who have to read these
books, you could give your kids or your friends' kids copies of these versions,
all the text is exactly the same. We haven't changed any of the words. We just added a ton
of pictures. I did not know about this until right now. So I'm actually scrolling through one
at present. This is really cool.
The style is very consistent.
The main character doesn't seem to be,
I'm looking at around the world in 80 days,
but the main character doesn't seem to be super consistent between them.
But there used to be books when I was younger,
they were like,
there's like every other page,
there was like a hand-drawn illustration.
And then they were the like synopsis.
Like they were kind of,
I don't know,
10% of the full book or whatever. they were like concise oh yeah yeah so okay a couple of things so one uh the
version that patrick you're looking at right now is an older version oh okay we did fix the
consistency um in the newer version uh we basically um used something like a vector database.
Foreshadowing.
But we kind of encoded a representation of each of the characters.
And then when a character is referenced, we added that representation back in.
And so now the version that will be out when we publish the show actually is pretty
consistent um but yeah you're right the the first the very first version um was like thematically
completely inconsistent like one picture would be black and white the next picture would look
like a cartoon next picture would be like a hyper real, you know, and so getting consistency.
The second phase was where we got a pretty consistent art style, but the era was all over the place. So like, you know, it'd be Around the World in 80 Days, which was set around the 1890s, I think.
But there would be like a high speed rail picture or a guy with like a rocket launcher you know
and so what we did there was uh we also injected the era we figured out the era by scanning the
book and then we inject that into every single image prompt um so yeah it's been a really really fun exercise um my goal is ultimately to get these
into the kindle store although it's pretty difficult because there's a lot of scammers
who try to push either fake ai books or public domain books onto the store and so we're running
over a couple of barriers just trying to prove
to amazon that we're legit um but uh yeah that's it's a fun little book project and i've had an
awesome time actually reading these books some of them are a little difficult to read because
they're rather old uh but uh a lot of them are awesome this is cool i'm trying to determine how
many of these i've read before so can i have
another request can you like run the generator to give me summaries so it's like all the pictures
but only one percent of the words so that i could uh claim to have read all of these books much more
quickly than actually read all of these that's an interesting idea you know i wonder you know the reason why uh the reason why i didn't
change the words is i felt like if it is required reading through someone no yeah there's a test or
something bad idea but yeah but but you're right i mean for people who aren't reading it for a grade
or anything um we could definitely like summarize every paragraph or do something like that there was a so what i want okay so this is really dumb okay we can move on but uh growing up there
was this show wishbone where it was a little dog do you did you watch this show no i never heard
of it okay oh my gosh all right this is good all right i'll make it short so just this dog who
goes to same same idea like various famous you know plays or books and the the dog is like the you know he goes
into the story and then he you know it's like a dog is the main character or whatever in the book
but they're just acting out the book in a 30-minute show and so you get the arc of you know
the book the main idea the gist in 30 minutes for kids just like teach them about different worlds
and different you know a moral story or whatever but i okay that won't make sense if you know but that's what i want is like the the wishbone like
the hey i could hold a conversation with someone and unless we were doing like a book club they
would believe that i had read the book and i will have benefited from the you know analogy or the
you know whatever the you know book is trying to get without all of the dialogue that
happens between you know characters that like you said if it's required reading it may be very
important to read i'm not trying to denigrate old books but i also know that i won't commit that
time yeah it's a great idea um maybe by the time you publish a show i'll be able to do that i mean
one simple thing would be you know take each chunk
which is around i think it's 4 000 characters take each chunk uh that's being used to make an image
and summarize that chunk down to like 200 characters
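A minimal sketch of that chunk-then-summarize idea, assuming the book is already a plain string. The function names are hypothetical, and `truncate_summarizer` is only a placeholder for where a language model call would go; none of this is the actual Novel Minds pipeline.

```python
def split_into_chunks(text, max_chars=4000):
    """Split text into pieces of at most max_chars, breaking on a space when possible."""
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        if end < len(text):
            # Back up to the last space so a word is not cut in half.
            space = text.rfind(" ", start, end)
            if space > start:
                end = space + 1
        chunks.append(text[start:end])
        start = end
    return chunks


def summarize_book(text, summarize, max_chars=4000, summary_chars=200):
    """Apply a caller-supplied summarizer to every chunk of the book."""
    return [summarize(chunk, summary_chars) for chunk in split_into_chunks(text, max_chars)]


def truncate_summarizer(chunk, limit):
    # Placeholder only: a real version would ask a language model for a summary.
    return chunk[:limit].strip()


book = "word " * 2000  # stand-in for a 10,000-character book
summaries = summarize_book(book, truncate_summarizer)
print(len(summaries), "chunks summarized")
```

Passing the summarizer in as a function keeps the chunking logic testable on its own, whatever model ends up producing the summaries.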
all right i'm excited this is cool i don't know how you're going to become rich and
and uh become the financier of the uh the biomedical research lab but uh
we'll work on that i'm talking to was it tony stark
all right time for tool of the show yeah if you want to help us oh sorry go next
tony stark you can subscribe to us on patreon uh patreon.com slash programming throwdown okay
that was actually legit sorry i was skipping over but that is really cool i'm excited i have to move
on because i feel bad about myself for not contributing to society speaking of which my
tool of the show not being a tool and a way to uh not waste time to uh improve your uh your brain
is escape simulator i think i mentioned before
playing some escape games i recently actually picked this one up in a humble bundle i i didn't
you know kind of think too much of it but it's actually like it was pretty cool and just so much
content in this game um so escape simulator it's kind of what it says on the box it's an escape
room simulator and they themselves have
a lot of content for levels that you can play but they also have a big fan base that is making
games of various complexities for you to play and i just i really enjoyed doing it i've been
i've been playing i've not come anywhere close to beating it because i refuse to use hints um and my daughter
sometimes sits and we play together and uh you know i personally not my kids are pretty smart
but like way to be impressed with your kids is have them shout out an answer to something and
you can't figure out why that is the answer you try it and it works and then have them try to
explain to you like how the puzzle had that answer. They can't like, they just intuit it,
but it happens way too much.
And then you kind of feel bad
and you're like trying to look at the hints
to like understand why.
Anyways, so Escape Simulator,
definitely check it out.
Works on macOS, Linux,
at least in Steam Deck it worked and Windows.
Yeah, this looks super fun.
I'm definitely gonna grab grab this is it like uh
um like a split screen or is it internet co-op uh internet internet thing so yeah that's the
only thing i couldn't figure out i just have one steam account i do have two computers
but once you're playing on one steam library won't let you play in another one
so we just sort of sit together and like collaborate on one a one
player game got it yeah that makes sense very funny could you also just buy it under two steam
accounts and that would probably work fine you know there's a new thing not to digress too much
but steam has like a new feature now called family mode where um you can actually say like this computer is a family
member and then what will happen is you can both play different games at the same time if you try
to play the same game at the same time steam says hey uh you're gonna have to pay uh you know to buy
a second copy but then when you do that then you can play together so like it kind of handles all of it this is good tips okay all right i'll be back in a few minutes
all right my tool of the show is the cursor ide and you can go to cursor.com to grab this so
this is basically uh github co-pilot and all of these things just on steroids at the next level. So when you start up Cursor, it asks if you want to import all of your VS Code extensions.
So it's literally a fork of VS Code.
I imported all my extensions.
They all imported just fine.
The thing about Cursor that really blew my mind was it has like autocomplete that spans the entire file.
So, for example, I had this app that was a WSGI app.
It was basically a kind of a REST API app and i was converting it to flask
and so one pattern that i had to go from is the old app you would return a json object
and it would have a status key and then the http status code so like 200 if it was good 400 if it was not found etc etc um and then and then
your your payload and so most of the time in these little handlers i was returning an object where
you know json object for like status 200 payload is data data was some object i created right
in flask you don't return anything from your handler, the handler gets passed this response object, and you do response dot status 200. And then response dot data payload, right. And I changed it, there was there was a file that had a bunch of these handlers in it. So I changed it in one as i was changing it it let me tab complete to do that like it like like i was i was putting i put like
response dot status like about halfway through the word status and it figured it out it's like
what you want to do is status 200 data payload and then delete the return thing and i hit tab that was cool then it like zipped down
like a third of the way through the file and it's like you want to do it here too don't you hit tab
and like i just hit tab tab tab tab tab and it just did all of it and that blew my freaking
mind like i've used you know copilot and amazon q and stuff and they they they do like they do have
some of those magical moments like within you know pretty close to where your cursor is but i've never
seen it like this where it's like oh you also want to do this thing at this file over there
uh it just like it just like kept chaining these things to do and uh um and it got it all perfect like you know there was weirdness around the brackets
and all of that you know is is kind of complicated because i was adding this handler and so it's
actually adding a bracket but then deleting the return bracket and it like figured it all out
flawlessly um uh it was really impressive um again i think we've talked about this in the past if you're at a
company make sure you get the right approval we were just talking about junior engineers we don't
want you to get fired if you're a junior engineer and you get fired that will make us sad get the
right approval but this is an amazing tool that's very cool is it in the works for it to be able to do it across files in your
project as well like if you wanted to refactor all of your files to you know use flask or whatever
like is there a way to like invoke it to do that or not yet i don't think so i'm trying to remember
uh it definitely jumps around in the same file.
I don't think it has a way to skip other files yet,
but it might.
I've just started using it a few days ago.
Okay.
And what is the zip down?
So like it zips down and then like,
if you don't want to do it,
it just returns you back to where you are. I feel like I'll get a whiplash.
Oh, interesting.
That's a good point. You know uh yeah i guess i'm not totally sure so the
times where it did this i took their suggestion every time i think maybe if you don't
take the suggestion maybe you could hit the up arrow on the key and get teleported back i sense a new developer metric which is how often do you accept the ai suggestion
oh definitely yeah i'm sure if they're not measuring that they should be and yeah i guess
you're right they should pass that measurement along to the company no i'm saying you're like
as an engineer you get a score which is like the more you accept the ai thing the like
worse off you're not adding anything they could just replace you with the ai so you need to like
then there's this whole gamesmanship right because then we'll figure it out and we'll just write what
the ai said but change one character so it doesn't count and then our metric for you know novelness goes up yeah well you know it's
interesting i mean i think that um yeah i mean you know the next level up from this would be
for me to say you know convert this kind of like raw lambda handler into flask and it would have
just figured out because the thing that i had to provide
was the knowledge that when you return things flask doesn't care like it's it's it's not going
to use that return value and so like just like running this code on flask means that you would
just error 500 everything right and so like i had to figure that out and then once i started typing the solution then it was done um but yeah i mean at some point maybe
it will that's that's almost like a segue uh or another thing is you know junior engineers now
also have to compete with ai uh or or maybe maybe not maybe it makes them on a more level playing field. That's still kind of an unknown.
I don't, yeah, I don't have a,
I think this is a watch this space.
No, no, that's not how they say that.
This is like a watch the future.
I don't know what will happen.
I think where I'm sitting now,
but this has turned out to be wrong before,
but like you said,
it's sort of the intent still needs to be provided and and sort of guided but people who lean in on
it probably see a big improvement before everyone else and then everyone else sort of adapts or not
and it we call it the same thing but it's not really the same thing so you know it's like
programming a website is technically programming and programming an operating system is also
technically programming uh they're not really the same thing like some parts are similar but lots of
parts are different i think we'll just end up with more divergence so there's some things you're
doing like you're saying hey i'm looking for another web framework that may be faster and it suggests to you to go to
you know whatever um but then there's also times where you might just be straight up like
this is something that's not been done before and it needs to be written for the first time
yeah yeah that makes sense.
It's going to be a wild time. I think these tools will save people a lot of busy work, which is good.
But it also means that you will have to memorize the code base faster
because you're not going to be sitting there staring at it while you do busy work.
All right, on to the topic of the show vector databases so patrick have you ever used a vector database no all right but you have used a database yes um do you uh have you ever put uh like anything other than text in a database
you know like have you ever used a database for images or yes what did you use for that do you
just stuff an image into my sql or what protobuf oh you put protobufs in the database oh yeah
really so but that's just like garbage, right?
Like, you know, it's unintelligible.
Wow, bro.
You're coming at me.
No, no, no, no.
I don't mean it's garbage quality.
I mean, when you look, when you run a query, it's unintelligible.
Yeah, you can't index it sort of properly.
Yeah, you're right.
Yeah, because like when you query, I think if you certain databases they let you have like a
column be a jpeg and the database kind of knows it's a jpeg and so when you query it it will
actually show you pictures oh that's cool can you query oh no no it's not i was gonna say could you
query stuff about it like i'm looking for jpegs where the subject is red uh you could with a vector database oh
okay here we go i'm ready i'm here i'm leaning in all right um so yeah my mind is still is still
blown by the so okay hang on i gotta dive in a little bit so if you put a protobuf in a database
right yeah then i guess most languages have protobuf support
but you would have to like you know you know uh interpret that protobuf before you do anything
with that data so like you know your your sql query goes straight to something that like converts
the protobuf to like yeah i mean you can use just more like a key value store where the value
rather than a relational database right so it becomes more like hey i have these keys and then just have the lobs
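A rough sketch of the "keys plus opaque blobs" pattern described here. It uses Python's built-in `sqlite3` as the store and JSON bytes as a stand-in for a serialized protobuf (an assumption made so the example needs no extra packages); the table name, keys, and fields are made up for illustration.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value BLOB)")

records = {
    "img-1": {"subject": "car", "color": "red"},
    "img-2": {"subject": "dog", "color": "brown"},
}
for key, record in records.items():
    # Serialize to bytes; the database only ever sees an opaque blob.
    conn.execute("INSERT INTO kv VALUES (?, ?)", (key, json.dumps(record).encode("utf-8")))

# Lookup by key is cheap because the key column is indexed.
(blob,) = conn.execute("SELECT value FROM kv WHERE key = ?", ("img-1",)).fetchone()
print(json.loads(blob))

# Filtering on a field inside the blob means decoding every row in application code.
red_keys = [
    key
    for key, value in conn.execute("SELECT key, value FROM kv")
    if json.loads(value)["color"] == "red"
]
print(red_keys)  # ['img-1']
```

The second query is the pain point from the conversation: the store can hand back the bytes, but it cannot index or search what is inside them.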
got it okay cool uh okay so let's build a scaffold up to vector databases so
um okay people like starting from the beginning, how do computers represent things? So, for example, you open up Notepad, you start typing some letters in Notepad, you hit save, and it saves as a text file.
And so, not dealing with Unicode or anything like that, just putting that aside.
A text file is basically a bunch of bytes where, you know, each byte represents one letter.
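As a quick illustration of "one byte per letter", here is a small Python sketch that encodes a short string as ASCII and prints the numbers that would end up in the file. The sample string is arbitrary.

```python
text = "Hi a="
data = text.encode("ascii")      # the bytes a plain-text editor would save
print(list(data))                # [72, 105, 32, 97, 61]
print(ord("a"), chr(97))         # 97 a
print(len(data) == len(text))    # True: one byte per character in ASCII
```

The space and the equals sign get numbers just like the letters do, which is why the byte count matches the character count.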
So, the letter, there's an ascii table
you can look up and so i don't remember off top of my head i think isn't like 97 a lowercase a
or something it's been a long time but uh but but you know each letter and each symbol you know
equal sign apostrophe they all map to numbers and in at least the ascii format there's less than 256
of these uh or uh different things and so that's one byte in a computer and so your document is
you know a byte for each of these letters including the spaces and all of that um so that's how you know traditionally how you could
represent text in a computer so a computer obviously doesn't have a concept really of the
letter a you know in hardware um that's the way it works um and for images the same kind of thing
you know there isn't like any lithography uh is that the right word? But, you know, there's no like dark room in your computer.
Like there's no physical images in your computer.
The way it works is, you know, each picture is broken up into pixels, which is like a really tiny square of an image, hopefully so tiny you can't even tell it's a square and then for each pixel the color
could be represented as you know the amount of our red green and blue representation in that pixel
and so you can imagine um you know if you have a picture a bunch of these triples your red green
blue triples and you have one for every pixel um and you know
the width and the height of the image so you know like what that whole rectangle looks like and
that's how pictures are represented in the computer um does that make sense anything you want to add
to that patrick yeah i mean i think you're right i think what you're trying to point out is the computer
doesn't have a native understanding of most of these concepts so it just has operation bytes
and like data bytes and the data bytes need a an encoding they need a scheme and the humans are the
one that imbue meaning to those yeah that makes sense so um okay so then we'll get into like let's say compression so for example
you know you might have a document that has the same word a bunch of times it's like chapter one
chapter two chapter three and so every time you have the word chapter that's going to be uh what
seven bytes right to store all of those
characters um but if it's always like chapter chapter chapter section section section there's
like a lot of repetition there and so you could actually have a smaller file by taking advantage
of all of this repetition and so one example that most people are taught in undergrad
and I've long since forgotten is Huffman encoding.
I don't, do you remember Huffman coding?
Yes, yeah.
It's something to do with like trees, right?
Like you build a tree where-
And the prefixes.
Yeah, maybe you explain it,
because I've totally-
No, no, no, no, that was good.
Yeah, yeah.
So you like look in your file
and you have to look at the whole file at once and basically decide like how am i going to assign bits to the most common
prefixes of numbers and each branch of the tree represents sort of like going down that way and
so it can represent a bucket of the prefixes that get concatenated together and ultimately
that way the most common pieces use the fewest bits and then the
longer deeper parts of the tree encode the less common pieces yep that makes sense
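For reference, a compact sketch of Huffman coding over single characters, which is the textbook form of the algorithm. It builds the tree bottom-up with a priority queue, so frequent characters end up near the root with short codes. Function names are illustrative.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Return a dict mapping each character in text to its bit string."""
    counts = Counter(text)
    if not counts:
        return {}
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (frequency, tie_breaker, {symbol: code_so_far}).
    heap = [(freq, i, {sym: ""}) for i, (sym, freq) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)   # two least frequent subtrees
        f2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (f1 + f2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(text, codes):
    return "".join(codes[ch] for ch in text)


def decode(bits, codes):
    reverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:  # codes are prefix-free, so the first match is a symbol
            out.append(reverse[current])
            current = ""
    return "".join(out)


text = "chapter one chapter two chapter three"
codes = huffman_codes(text)
bits = encode(text, codes)
print(len(bits), "bits, versus", 8 * len(text), "bits at one byte per character")
print(decode(bits, codes) == text)
```

Note that the decoder needs the code table as well, so a real file format has to store the table (or the character counts) alongside the bits.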
um and so you could do something similar for images right you could have some kind of delta
encoding or other kind of things that take
advantage of the nature of the image. But you know, for images, really, you know, you want to
capture the sort of nature of the image, right? So it might not matter that like this single pixel
in this one part of the car is like a little bit more blue in the image than it was in real
life like it doesn't that doesn't really change the nature of the image and so now you start
getting into what are called lossy compressions or lossy encodings so for example one of the most
common uh is where you take these um uh various kind of repeating patterns so such as like a fourier
cycles or cosine cycles you know the cosine wave right you take these kind of waves
and you compose them at different frequencies and amplitudes on top of each other so um
um so for example uh if you imagine like a zebra so zebra is this like really sharp
kind of wave where it's it's white and then it's black and it's white and it's black
and so if the camera zooms in on the zebra now the sort of frequency is getting
smaller right because those stripes are getting bigger the camera zooms out or the zebra's running
away or something now the frequency starts going up and so what you do is you you you have a ton
of different uh waves of different frequencies and you have there's an algorithm
that tries to figure out how you can compose like add many of these waves together to faithfully
reconstruct the image and if you can do that then you only need to store like the definitions of those waves like what were the
amplitudes and the frequencies of those waves um instead of storing every single pixel
i mean i think just for completeness there you need infinitely many but if you have some loss
criteria like some amount of thing you're willing to give up,
then you can basically chop and only take the top X most important frequencies.
And then you'll get back most of the image.
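A small sketch of the "keep only the most important frequencies" idea, under simplifying assumptions: it works on a single row of pixels rather than a whole image, and it uses a plain discrete Fourier transform written in pure Python. Real image formats such as JPEG use a two-dimensional cosine transform on small blocks, so this is only an illustration of the principle.

```python
import cmath
import math


def dft(signal):
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def inverse_dft(coeffs):
    n = len(coeffs)
    return [
        (sum(coeffs[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def keep_top(coeffs, count):
    """Zero out all but the `count` largest-magnitude coefficients."""
    order = sorted(range(len(coeffs)), key=lambda k: abs(coeffs[k]), reverse=True)
    keep = set(order[:count])
    return [c if k in keep else 0 for k, c in enumerate(coeffs)]


def rms_error(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# One row of "zebra" pixels: 8 white, 8 black, repeated (64 pixels in total).
row = [255 if (i // 8) % 2 == 0 else 0 for i in range(64)]
coeffs = dft(row)
for count in (3, 9, 64):
    approx = inverse_dft(keep_top(coeffs, count))
    print(count, "coefficients kept, RMS error:", round(rms_error(row, approx), 2))
```

For this particular stripe pattern only a handful of coefficients are non-zero, so storing those few numbers is enough to rebuild the row; keeping fewer trades file size for error.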
Yeah, yeah, exactly.
That's a really good segue to the next part.
So now we'll jump into embeddings right so um we talked about ways to kind of explicitly represent
something so if you have a document and you want to store it you explicitly represent each
individual character um but many times what we want is to know like the overall essence of a document, like,
is this document about cars?
Uh, or is this a long document or does this, is this document at a really high reading
level or at a kindergarten reading level?
So these sort of like soft attributes, uh, those sound like kind of yes, no questions.
But really, there's a lot of ambiguity there, right?
It's like there's certain degrees of like, how much about cars is this or, or how much
of this document is at the kindergarten reading level.
So, so, you know, these kind of like soft attributes. And then doing things like give me all the documents
that are roughly a yes for this question.
So it's kind of like a lossy attribute system.
And the best way that we have today
to be able to answer questions like that
is through embeddings.
So the goal of an embedding,
it's not directly to recover the original document, as we talked about before. So you know, with Huffman encoding, the goal is to compress something so that you can later decompress it. Here the goal is not to get back the original thing, the goal is to be able to answer
questions around nearness you know if i have a query or or or a you know a concept or the query
might even be another document right what are the things near that query um semantic nearness that's
the goal and so because of that embeddings can be you know
very very lossy because they don't really have to reconstruct the original content
and so there's there's two kind of key ways that embeddings are created so one way is through
contrastive methods so the idea here is i might say, these two documents were written by
kindergartners. So I know I'm going to be asking a lot of questions about reading level and reading
skill for these documents. So I have documents created by kindergartners, I have documents created by adults, and I randomly pick
two documents. If they were both created by the same age group, then I want their embeddings to
be closer together. If they were created by a mix of a kindergartner and an adult, if one document
was a kindergartner and the other document was an adult then i want to pull those two documents
apart in this space and so uh and so at the very beginning of your embedding process all the
documents are just randomly projected and so they're just scattered all over the place but
after you ask yourself and answer this question many, many, many times,
you end up with hopefully, you'll end up with sort of two, you'll end up with many spaces,
many clusters. But each of those clusters will be either a kindergartner document cluster,
or a grown up document cluster because of all of this pulling and pushing.
Does that make sense?
Yeah, I mean, I think the pulling and pushing... Now, does that happen in a subset of the dimensions,
or are they guaranteed that even in another dimension, they may be far apart?
Yeah, so when you pull or push, you do it in all the dimensions.
So it's almost like there's little
thrusters on these two documents, they're points in embedding space, and those thrusters are
pulling them together in every dimension simultaneously, or pushing them apart. And so
when you do this embedding, you know, a single dimension has no real meaning
because the only meaning was constructed from the distances,
the pairwise distances.
So you can't say dimension three is how many times they said the word dog
or something because it's all like a composition of all these dimensions.
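A toy sketch of the pull-together, push-apart process described above. It is a deliberate simplification: it moves the embedding points directly, whereas a real system would train a model that maps documents to points. The data, the two-dimensional space, the learning rate, and the margin are all assumptions chosen for illustration.

```python
import math
import random

random.seed(0)

# Hypothetical toy data: 10 "kindergartner" docs (label 0) and 10 "adult" docs (label 1).
labels = [0] * 10 + [1] * 10
# Every document starts at a random point in a 2-D embedding space.
points = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in labels]


def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_distance(points, labels, same):
    """Average pairwise distance over same-label (or mixed-label) pairs."""
    pairs = [
        distance(points[i], points[j])
        for i in range(len(points))
        for j in range(i + 1, len(points))
        if (labels[i] == labels[j]) == same
    ]
    return sum(pairs) / len(pairs)


def train(points, labels, steps=5000, lr=0.1, margin=2.0):
    for _ in range(steps):
        i, j = random.sample(range(len(points)), 2)  # pick a random pair
        a, b = points[i], points[j]
        dist = max(distance(a, b), 1e-9)
        if labels[i] == labels[j]:
            scale = -lr                              # same group: pull together
        elif dist < margin:
            scale = lr * (margin - dist) / dist      # mixed pair: push apart
        else:
            continue                                 # already far enough apart
        # The update touches every dimension of both points at once.
        for d in range(len(a)):
            step = scale * (a[d] - b[d])
            a[d] += step
            b[d] -= step


before = (mean_distance(points, labels, True), mean_distance(points, labels, False))
train(points, labels)
after = (mean_distance(points, labels, True), mean_distance(points, labels, False))
print("same-group / mixed-group mean distance before:", before)
print("same-group / mixed-group mean distance after: ", after)
```

Because each update moves a point along the line to its partner, no single axis ends up meaning "reading level"; only the distances between points carry the information.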
So you need all of the things you're trying to do this with up front, though?
Okay, so when you're training, you provide a bunch
of training examples, and those have to be labeled, right? So you have to know this one is a kindergartner document and this one is an
adult document. Once you've trained the model, then you can give it new documents.
And you could even ask it like,
is this new document a kindergartner
or an adult document?
And then it would use a vector database
to look at like, what are the nearest neighbors?
And if most of those are kid documents,
then that's one way that you can answer that question.
Got it.
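That nearest-neighbor vote can be sketched in a few lines. The embeddings and labels below are made up, and a plain sort stands in for the vector database lookup:

```python
import math
from collections import Counter

def classify(query, embeddings, labels, k=3):
    """Label a new embedding by majority vote among its k nearest stored embeddings."""
    nearest = sorted(range(len(embeddings)),
                     key=lambda i: math.dist(query, embeddings[i]))[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical 2-D embeddings of already-labeled documents.
embeddings = [[0.1, 0.2], [0.0, 0.3], [0.2, 0.1], [2.0, 2.1], [2.2, 1.9], [1.9, 2.0]]
labels = ["kid", "kid", "kid", "adult", "adult", "adult"]

print(classify([0.15, 0.15], embeddings, labels))  # lands among the kid documents
```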
So that's one way: you give it a bunch of pairs of things that you know should be close together
or should be far apart. There's another way to do it, where it's kind of like a fill in the blank, right?
Here's an example. Let's say I have three documents written by a kindergartner,
and I give a model the first one and the third
one, and then I say, hey, generate the second one. Maybe you ask a kindergartner, you know,
write a book about dogs, write a book about goldfish, write a book about cars, and you present
the AI that you're training with two of the three books, and you say, generate the third one. And in the beginning, it won't produce what the kindergartner actually wrote, and you go to the AI and say, hey,
you were wrong, here's the actual answer. And they do the same thing with adult books,
right and and it turns out that like if all you're doing is kind of filling in the middle
whether it's the middle of a sentence the middle of a collection of work or whatever it is because
you're just filling out the middle and you already
have the middle like you know the right answer you don't actually need any humans in the loop
so like i could just take all of wikipedia and i could give every sentence to some ai that i'm
training and say like hey like fill in the beginning of the sentence or the end of the
sentence or the middle and and I know the right answer.
And so when it doesn't give me the right answer, I kind of push it, give the right answer.
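The reason no human labeling is needed can be sketched like this: the training target is just the piece you hid. The `[MASK]` placeholder and the function name are made up for this illustration.

```python
import random

def make_fill_in_example(sentence, rng, mask="[MASK]"):
    """Hide one word of a sentence; the hidden word itself is the training target."""
    words = sentence.split()
    position = rng.randrange(len(words))
    target = words[position]
    masked = words[:position] + [mask] + words[position + 1:]
    return " ".join(masked), target

rng = random.Random(0)
masked, target = make_fill_in_example("the quick brown fox", rng)
# `masked` is what the model sees; `target` is the answer we already know.
```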
And so these are called self-supervised models. Now, if I give it the end of something and tell it to reconstruct the beginning,
then when I actually try to use this model in real life, I have to give it the end of something.
But if I'm using this to do something like ChatGPT, I can't do that. You can't say,
all right, you're given the end of what ChatGPT wants to say, give me the beginning. It doesn't work
that way. And so usually you have to, you're forced to go in the other direction. So you're
like, you know, you purposely hide the ending completely. You artificially hide the middle
and you give the beginning. When the model tells you the middle, you correct the model. And so
because you're never able to look at the future, these are called forward models,
because they can only take the past and predict the future. And so GPT, without the chat part, is an example of just a pure forward model. Given a bunch of things you said, which is your context, what's the next thing that's going to be said? It's trained on terabytes and terabytes of text, and then it generates that token.
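In code, the forward setup amounts to turning raw text into (context, next token) pairs where the context never includes anything from the future. A minimal sketch, using whitespace-separated words as stand-in tokens:

```python
def next_token_examples(tokens):
    """Return (context, next_token) pairs; each context holds only earlier tokens."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

examples = next_token_examples("the dog runs fast".split())
for context, target in examples:
    print(context, "->", target)
```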
But along the way of generating that token, it constructs an embedding.
So it constructs an embedding first, and then it uses that embedding to predict the next token.
And so if you're not interested in, or don't need, that second part, you can just
take the first part.
And now you have an embedding of the words that have been said so far.
And so this is why, as a byproduct of making these chatbots, the companies can also
offer an embedding service: they need that internally,
so they can offer it externally.
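The "just take the first part" idea, as a toy sketch. Everything here is invented for illustration: the class, the fixed per-token vectors standing in for learned weights, and the averaging step. A real model is far more involved, but the shape is the same: one stage produces an embedding, a second stage turns it into a prediction, and you can stop after the first.

```python
import math

class TinyForwardModel:
    """Toy two-stage model: tokens -> embedding -> next-token guess."""

    def __init__(self, vocab, dim=3):
        self.vocab = vocab
        # Hypothetical fixed vectors standing in for learned weights.
        self.token_vectors = {
            token: [math.sin((i + 1) * (d + 1)) for d in range(dim)]
            for i, token in enumerate(vocab)
        }

    def embed(self, tokens):
        """First stage: summarize the context as one vector (here, a plain average)."""
        vectors = [self.token_vectors[t] for t in tokens]
        return [sum(column) / len(vectors) for column in zip(*vectors)]

    def predict_next(self, tokens):
        """Second stage: score every vocabulary token against the context embedding."""
        context = self.embed(tokens)
        scores = {t: sum(a * b for a, b in zip(context, v))
                  for t, v in self.token_vectors.items()}
        return max(scores, key=scores.get)

model = TinyForwardModel(["the", "dog", "runs"])
embedding = model.embed(["the", "dog"])  # stop here if all you want is the embedding
```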
And it's valuable for the reasons you're describing: now if I have text, maybe it's not a chat,
it's just text, and I want a representation of it that I can ask questions about, then I can
get the embedding and then do whatever I want with it.
Yep, yeah, exactly right. And there are vision transformers that work very similarly, where you take a picture, you hide part of the picture from the AI, and you ask the AI to recreate that part of the picture.
When it does, you have the real answer because you are the one who hid it yourself.
And you compare the AI generation with the reality
and you correct the AI.
And similar to language, a vision transformer, you know,
has a step where there is an embedding representation.
So you can take a picture and run like half of the vision transformer, and now you just have this embedding. For the pictures where you hid a car and asked the AI to redraw the car,
all those pictures are going to have very similar embeddings.
But the pictures where you deleted a person who is in the background of your family
portrait, and you wanted the AI to fill that in with something,
all those pictures will occupy a different space in
the embedding. So, yeah, I mean, this gets pretty difficult to explain
over the air, but any questions about that part of it?
No, I mean, I think it makes sense. So
the idea is you're getting a vector which i guess we talk about like it's a list of numbers and
those numbers represent like where in this space it is and the hope is that for these processes
that you learn the ways in which things can be similar or different
Right, right. Okay, so now let's talk about how to correct the AI.
Let's say the sentence was, you know, the quick brown fox jumps over the lazy... what is it, lazy
dog or something?
Yes.
Yeah, so let's say the AI generates the brown fox jumps
over the lazy dog. So it got everything right except for the word quick, right? But because it skipped
the word quick, every word after that is kind of wrong, right? It's shifted. So you have
to develop all sorts of different metrics to be able to say how wrong the AI was in a way that helps it learn well.
That's at the training side.
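To make the shifting problem concrete, here is a sketch comparing two ways of scoring that exact example. Word-level edit distance is one familiar shift-tolerant metric, used here only as an illustration; the episode does not say which metrics are actually used in training.

```python
def positional_errors(reference, generated):
    """Count positions where the word lists disagree, plus any length difference."""
    mismatches = sum(r != g for r, g in zip(reference, generated))
    return mismatches + abs(len(reference) - len(generated))

def edit_distance(reference, generated):
    """Levenshtein distance over words: fewest insertions, deletions, substitutions."""
    previous = list(range(len(generated) + 1))
    for i, r in enumerate(reference, start=1):
        current = [i]
        for j, g in enumerate(generated, start=1):
            current.append(min(previous[j] + 1,              # drop a reference word
                               current[j - 1] + 1,           # add a generated word
                               previous[j - 1] + (r != g)))  # match or substitute
        previous = current
    return previous[-1]

reference = "the quick brown fox jumps over the lazy dog".split()
generated = "the brown fox jumps over the lazy dog".split()

print(positional_errors(reference, generated))  # 8: almost everything looks wrong
print(edit_distance(reference, generated))      # 1: only "quick" is missing
```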
It turns out the embeddings, you're often trying to use embeddings for a different purpose.
For example, you might want to fill in the missing
image, the missing part of the image when you're training. But then when you're actually using the
model, you want to group all of your images together. So someone can say, hey, this is a
picture of me at the beach, find more like this one, which is different from what you did when you were training it. So we
call this training-serving skew: it's when you use a model for a different purpose than what
it was trained for. And so what that means is, for the similarity metric on the embedding,
there's a lot of unknowns. It might be that the model was just
trained to fill in parts of the image, and so it actually does a terrible job of finding similar
images. It might just be a coincidence that it does a good job. So there's not a lot
of theory at this point, and when there's not a lot of theory, the best thing you can do is try a lot of things. And so you want to have a variety of different similarity metrics on the embeddings.
So, a common one is just how close something is like in Euclidean space. But then there are a
bunch of others too. We don't need to dive into all of them, but there's a ton of different similarity metrics.
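Three of the common ones, sketched with made-up two-dimensional vectors. The point of the example is that the metric you pick can change which item counts as "nearest":

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance; smaller means more similar."""
    return math.dist(a, b)

def dot_product(a, b):
    """Inner product; larger means more similar (sensitive to vector length)."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Cosine of the angle between vectors; ignores length, larger is more similar."""
    return dot_product(a, b) / (math.hypot(*a) * math.hypot(*b))

query = [1.0, 0.0]
candidates = {"same_direction_far": [10.0, 0.0], "nearby_off_axis": [1.0, 1.0]}

closest_euclidean = min(candidates, key=lambda n: euclidean_distance(query, candidates[n]))
closest_cosine = max(candidates, key=lambda n: cosine_similarity(query, candidates[n]))
print(closest_euclidean, closest_cosine)  # the two metrics disagree here
```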
And so, you know, once you've chosen a similarity metric, then you can say,
given a point in this space where the point can have a document on it or not,
find the nearest neighbors, find the nearest documents to that point. And then you can look at those results and
see, okay, does that match what I was hoping for from a product perspective?
So the point that you're making when getting neighbors is like a query, but the query
doesn't have to be text. Like you said, it could be a picture, it could be whatever. It gets the embedding, and then the goal, well, maybe
I'm jumping the gun here, but the goal of the vector database is to help you efficiently
find the things closest to your query.
Yeah, exactly right. So just like we talked about
with the Fourier transform, as Patrick said, you'd need an
infinite number of waves to get a perfect resolution, but you're often willing to tolerate
error. Similarly, if you're willing to tolerate error in finding out which documents are the closest, then you can do what's called
sublinear approximate nearest neighbors. So the idea is, if I have a million documents,
I don't need to check the embedding of all million of them to get the nearest neighbors to a point. I can use a variety of different data structures to say,
okay, I'm extremely confident that I have the nearest neighbors, but I'm not 100% confident.
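A minimal sketch of one such data structure, assuming a simple clustering-style partition: pick some vectors as partition centers, bucket every vector under its nearest center, and at query time search only the bucket nearest the query. The function names, the 2,000 random points, and the choice of 20 partitions are all made up; production systems use more refined indexes.

```python
import math
import random

def build_index(vectors, num_partitions, rng):
    """Pick random vectors as centers and bucket every vector by its nearest center."""
    centers = rng.sample(vectors, num_partitions)
    buckets = [[] for _ in centers]
    for idx, v in enumerate(vectors):
        nearest = min(range(num_partitions), key=lambda c: math.dist(v, centers[c]))
        buckets[nearest].append(idx)
    return centers, buckets

def approximate_nearest(query, vectors, centers, buckets):
    """Search only the bucket whose center is closest to the query."""
    bucket = min(range(len(centers)), key=lambda c: math.dist(query, centers[c]))
    candidates = buckets[bucket]
    best = min(candidates, key=lambda i: math.dist(query, vectors[i]))
    return best, len(candidates)

def exact_nearest(query, vectors):
    """Brute force: compare the query against every stored vector."""
    return min(range(len(vectors)), key=lambda i: math.dist(query, vectors[i]))

rng = random.Random(0)
vectors = [[rng.random(), rng.random()] for _ in range(2000)]
centers, buckets = build_index(vectors, num_partitions=20, rng=rng)

query = [0.5, 0.5]
approx, checked = approximate_nearest(query, vectors, centers, buckets)
exact = exact_nearest(query, vectors)
print(f"checked {checked} of {len(vectors)} vectors; same answer: {approx == exact}")
```

The approximate answer can differ from the exact one when the true nearest neighbor sits just across a bucket boundary, which is the error being traded for speed. A common mitigation is to probe a few neighboring buckets as well.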
So for example, maybe there's a partitioning system. And if you fall like exactly just to
the left or right of a partition boundary you might miss what's on
the other side of the partition um and that might happen one out of every 10 million times and only
affect one result and so you're you're more than happy taking that penalty in exchange for just
massive, massive speed up. So imagine, we're talking like going from a million
seconds to search to six, or something like that. So that's basically what the
vector database does. With a vector database, you provide your own embeddings; it
doesn't do that part. But you jam a bunch of embeddings into this
vector database, it creates all of these data structures, and then you can give it vector
queries, and it will tell you the nearest neighbors. And the vector database is responsible
for managing the dynamics there. So for example, if you start removing documents and
adding different ones, at some point the data structures might
be kind of stale. I'll give you an extreme example: let's say you add a ton of vectors, but only the first dimension isn't zero.
So the vector database is basically going to create a bunch of partitions around that first
dimension and ignore the rest, right? If you take those documents out and insert a bunch of documents
where only the second dimension isn't zero, but you still use the old partitioning scheme, they're all going to fall in the same
bucket, and now you don't have that speed up anymore. And so the vector database is responsible
for knowing when to rebalance and re-index, just like your hard drive has to occasionally
rebalance the B-trees on the folders of your hard drive.
What's old is new again.
Yeah, exactly.
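The stale-partition example can be reproduced with a toy index that splits only on the first dimension. The bucket count and the data are made up; the split points are chosen from the old documents and never updated.

```python
from bisect import bisect_right
from collections import Counter

def build_boundaries(vectors, num_buckets):
    """Choose split points on dimension 0 so the given vectors spread evenly."""
    values = sorted(v[0] for v in vectors)
    return [values[len(values) * i // num_buckets] for i in range(1, num_buckets)]

def bucket_of(vector, boundaries):
    """Find which bucket a vector falls into, looking only at dimension 0."""
    return bisect_right(boundaries, vector[0])

old_docs = [[float(i), 0.0] for i in range(1, 101)]  # only dimension 0 is nonzero
new_docs = [[0.0, float(i)] for i in range(1, 101)]  # only dimension 1 is nonzero

boundaries = build_boundaries(old_docs, num_buckets=4)
old_counts = Counter(bucket_of(v, boundaries) for v in old_docs)
new_counts = Counter(bucket_of(v, boundaries) for v in new_docs)
print(old_counts)  # evenly spread across four buckets
print(new_counts)  # everything piles into one bucket
```

With the new documents all in one bucket, every query has to scan everything again, so the index needs to be rebuilt around the second dimension.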
Then beyond that, there's all of the traditional database things, you know, backups and restores,
all of that stuff that we've come to appreciate from databases. Vector
databases have to provide that as well.
So, like you mentioned, there's Euclidean distance, or you hear about cosine similarity or whatever. If you
want to change the metric, is a different clustering needed per metric,
or is one good for all of them?
Yeah, it's a good question. Well, the answer is always
going to be it depends on the data, right? But definitely, you can construct datasets where
each similarity metric needs its own clustering, and you can show
that there is a data set where any clustering can only be good in one of these
three different similarity metrics. So it's possible. Now, in practice, I wouldn't
be surprised if these database systems just use euclidean distance for the clustering and then
you pay a slight penalty if you're using something else.
I think for 99% of cases, that would be fine.
Makes sense.
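One case where the penalty goes away entirely, as a general geometric fact rather than a claim about any particular database: if every vector is scaled to length 1, squared Euclidean distance equals 2 minus 2 times the cosine similarity, so both metrics rank neighbors the same way. A quick numeric check with random made-up vectors:

```python
import math
import random

def normalize(v):
    """Scale a vector to length 1."""
    length = math.hypot(*v)
    return [x / length for x in v]

def cosine_similarity(a, b):
    return sum(x * y for x, y in zip(a, b)) / (math.hypot(*a) * math.hypot(*b))

rng = random.Random(1)
vectors = [normalize([rng.gauss(0, 1) for _ in range(8)]) for _ in range(50)]
query = normalize([rng.gauss(0, 1) for _ in range(8)])

# For unit vectors: distance squared = 2 - 2 * cosine similarity.
gaps = [abs(math.dist(query, v) ** 2 - (2 - 2 * cosine_similarity(query, v)))
        for v in vectors]

by_euclidean = sorted(range(50), key=lambda i: math.dist(query, vectors[i]))
by_cosine = sorted(range(50), key=lambda i: -cosine_similarity(query, vectors[i]))
```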
Yeah, and so this is where,
this is definitely in the category
kind of like authentication.
It's in the category of things
you don't want to write yourself.
Oh, come on.
It's going to be pretty gnarly.
You know, even some of the top vector databases are only now starting to become mature.
So, for example, I've used Milvus in the past, which is an open source vector database, and we had issues where when we tried to back up the
database, it would crash and we would lose everything. So
I would say it's still at the phase where you probably want to store your embeddings
in a regular database as well. It's still kind of early. But every month it gets
massively more mature, because so many people are using it. So Milvus is good.
Pinecone is an enterprise alternative. Pinecone is a lot better, but you're going to pay for it.
You get what you pay for there. Another one that I really appreciate is pgvector,
which is an extension to Postgres
that lets you have vector columns in an existing table.
And that's really nice, right?
Because you could have a row... imagine you're storing houses, you're Zillow or something, right?
So you could have a row, and the row has an ID, the address of the house, the number of
bedrooms, all of this explicit information. And then in that same table you just add a column
that's, you know, embedding, and that column is backed by pgvector,
and it works pretty naturally. This is actually another thing:
approximate nearest neighbors assumes that they're all valid, right? But, for example, let's say you only want three-bedroom
houses that are approximate nearest neighbors. That's actually a really hard thing to do,
because you might pull the nearest thousand neighbors, but they're all two-bedroom, so you have to
pull another thousand, another thousand, another thousand, until finally you have enough three-bedroom results to fill up the limit that you set. So there's a lot of complexity there. But, yeah,
these database systems are really good at handling all of that for you.
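The "pull another thousand" loop can be sketched as follows. This is not how pgvector or any specific database implements filtering; the row layout, function name, and batch size are made up, and a plain sort stands in for the nearest-neighbor index.

```python
import math

def filtered_nearest(query, rows, wanted_bedrooms, limit, batch_size):
    """Walk neighbors nearest-first in batches, keeping only rows that pass the filter."""
    # Stand-in for an index that can hand back neighbors in distance order.
    ordered = sorted(rows, key=lambda row: math.dist(query, row["embedding"]))
    matches, fetched = [], 0
    while len(matches) < limit and fetched < len(ordered):
        batch = ordered[fetched:fetched + batch_size]  # pull the next batch
        fetched += len(batch)
        matches.extend(r for r in batch if r["bedrooms"] == wanted_bedrooms)
    return matches[:limit], fetched

# Hypothetical listings: only every tenth house has three bedrooms.
rows = [{"id": i, "bedrooms": 3 if i % 10 == 0 else 2, "embedding": [float(i), 0.0]}
        for i in range(1, 101)]

results, fetched = filtered_nearest([0.0, 0.0], rows, wanted_bedrooms=3,
                                    limit=3, batch_size=5)
print([r["id"] for r in results], "after fetching", fetched, "neighbors")
```

Here three results cost thirty fetched neighbors. The rarer the filter value, the more over-fetching is needed, which is the complexity being described.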
Yeah, I would love to find an excuse to use this. It always sounds really interesting and cool.
And I know I'm obsessed with tinkering.
So I'd love to find a use, but just haven't had one yet.
Yeah, I mean, for me, the one use case that I've found outside of work
has been nearest neighbors on my photos.
It's not too hard to use one of these open source models,
embed all of your photos,
and then given one of your family photos,
you find all the nearest ones.
Google Photos, these other things will do it for you.
So it's really more of a fun exercise for you to do
than something that's providing a lot of unique utility.
But that's been a fun thing that I've been able to do with it.
Cool. All right, that is a ramp-up on vector databases. If you're doing anything with vector
databases, let us know, give us a shout out, go on the Discord. The Discord is starting to pick up
some steam. A lot of really interesting discussions about careers, changing jobs, should I be a
consultant after I leave my company. A lot of really interesting discussions there, so check it out.
Okay, I will take that as a personal message.
well i got you covered, Patrick.
I will be there, but it'll be a bot.
That's right.
Patrick's AI will be there and you can interact with it.
This is awesome.
I learned a lot.
So thanks, Jason.
And thanks, everyone, for listening.
Cool.
All right, everyone.
Thanks so much for supporting us on Patreon.
We really appreciate it.
And we will catch you all later.
Have a good one.