Programming Throwdown - 177: Vector Databases

Episode Date: November 4, 2024

Intro topic: Buying a Car

News/Links:
- Cognitive Load is what Matters
  https://github.com/zakirullin/cognitive-load
- Diffusion models are Real-Time Game Engines
  https://gamengen.github.io/
- Your Comp...any Needs Junior Devs
  https://softwaredoug.com/blog/2024/09/07/your-team-needs-juniors
- Seamless Streaming / Fish Speech / LLaMA Omni
  Seamless: https://huggingface.co/facebook/seamless-streaming
  Fish: https://github.com/fishaudio/fish-speech
  LLaMA Omni: https://github.com/ictnlp/LLaMA-Omni

Book of the Show
- Patrick: Thought Emporium Youtube
  https://youtu.be/8X1_HEJk2Hw?si=T8EaHul-QMahyUvQ
- Jason: Novel Minds
  https://www.novelminds.ai/

Patreon Plug
https://www.patreon.com/programmingthrowdown?ty=h

Tool of the Show
- Patrick: Escape Simulator
  https://pinestudio.com/games/escape-simulator/
- Jason: Cursor IDE
  https://www.cursor.com/

Topic: Vector Databases (~54 min)
- How computers represent data traditionally
  - ASCII values
  - RGB values
- How traditional compression works
  - Huffman encoding (tree structure)
  - Lossy example: Fourier Transform & store coefficients
- How embeddings are computed
  - Pairwise (contrastive) methods
  - Forward models (self-supervised)
- Similarity metrics
- Approximate Nearest Neighbors (ANN)
- Sub-Linear ANN
  - Clustering
  - Space Partitioning (e.g. K-D Trees)
- What a vector database does
  - Perform nearest-neighbors with many different similarity metrics
  - Store the vectors and the data structures to support sub-linear ANN
  - Handle updates, deletes, rebalancing/reclustering, backups/restores
- Examples
  - pgvector: a vector-database plugin for postgres
  - Weaviate, Pinecone, Milvus

★ Support this podcast on Patreon ★

Transcript
Starting point is 00:00:00 Programming Throwdown, episode 177: Vector Databases. Take it away, Patrick. Welcome everyone to another episode. We have a great topic today. I'm excited to learn. It's legitimately something that I hear all about, but I don't know too much about. So we're going to have Teacher Jason joining us here in a minute. That's right. We had Teacher Patrick for compilers and interpreters. And then we have now Teacher Jason for vector DBs.
Starting point is 00:00:41 I feel like we should have called ourselves professors. We missed an opportunity there. Oh, that's right. Is there anything higher? Uh, distinguished, uh, emeritus professor,
Starting point is 00:00:49 emeritus, emeriti. I, okay. What's the highest level professor? Uh, you know,
Starting point is 00:00:59 sole emperor of the knowledge universe. Oh, but what if it thinks it's the smartest? It's gonna tell you a lie, because it doesn't want you to be superior. Okay, anyway. No, sorry. Oh, you know, I have this plan now. When I get these calls from people telling me that my taxes are overdue, or I should buy car insurance, or whatever, what I've started doing is asking it questions about particle physics.
Starting point is 00:01:28 And when it gives me really good answers, I'm like, aha, this is ChatGPT. This is not even a real person. A particle physicist could work in a call center. Okay. Yeah.
Starting point is 00:01:41 I mean, I had to go looking for a car and they legitimately like you know want your phone number whatever so i just have one of the like you know uh i guess like it's a voice over ip number but it doesn't actually like ring through whatever and the place one of the places i went has called me twice a day every day for going on two weeks now and i've picked up zero of the calls and i'm just like they really want to sell me a car but it turns me off i'm just like why are you pestering me this much like i feel like just text me i will so you know my dad worked for uh a few different car dealerships over the span of about 30 years
Starting point is 00:02:18 And so I learned the trick to buying a car, and I'll share it with everybody. If you're a Programming Throwdown listener, it's free; you don't have to be a patron, although we really appreciate it. If you want to buy a car, here's what you do. You show up and test drive the car in person, so they know you're a real person and not another dealer trying to buy a car and then resell it. You're there as an individual buyer, and you prove it to them. And then you leave, and you don't come back until you have everything finalized. Everything is over email or phone: you negotiate the price, you compare with other dealers, and you should expect this to take about a month. But if you do that, you will save anywhere from 10 to 30 percent of the price. Wow. Last time I bought my car, I felt like it was pretty different than now, in that the
Starting point is 00:03:12 inventories are much worse now. But yeah, in general I did something similar to what you were saying the last time I bought a car. It was super nice. I just showed up and all the paperwork was done. It still took close to an hour, which is ridiculous, but pretty much it was all ready to go. So yeah, I agree. I'm going to attempt to do this one again. I haven't gotten to the negotiating stage yet. But showing up, I guess they're trying to do the CAPTCHA with me, so they're trying to call me back and make sure the phone number is real. Yeah, I haven't bought a car since 2020, so I've missed this shortage. But is the shortage still a thing, or is it over? I don't think it's a
Starting point is 00:03:53 thing in the same way but the inventories on like desirable stuff is still not great and um at least at the places i was like i think it varies dealer to dealer how much they've you know or brand to brand i guess how much they've recovered um so uh yeah i think it's it's still on the pricing is is tighter but i i don't know i will still die on the hill that i think we are stupid for having gotten ourselves and i guess this is us-centric people tell me in other countries it's not like this but i have no idea why you go in and the only place in life remaining is a car dealership where you have to like negotiate. I'm going to bring in my manager. I'm going to bring in the finance.
Starting point is 00:04:33 We're going to change everything on you. Like it's a total like scammy thing the whole way through. Oh, yeah. That's another thing. So the thing about finance, you ever play these mobile games where there's like 24 different currencies you know like i play uh fifa on mobile and there's like there's literally five five to ten different currencies at any given time and some of them get phased out and replaced with other currencies and all that um but there's one currency that like you could pay real money to get
Starting point is 00:05:04 and that's ultimately the only currency that really matters, because that's the one that's tied to USD. Right. And so if you start going down the finance route early, you end up in this boat where they're like, oh, your monthly payments are the same. But really they added a year of payments; each month is just the same. There are just too many variables, and it's pretty easy to get tricked. And so that's another thing that my dad told me: always negotiate the absolute dollar amount, and then come in at the very end and do financing after you're ready to pay the amount. So this has turned into a
Starting point is 00:05:46 little bit of advice from jason and a lot of whining from patrick uh but yeah buying a car i look forward to i will say from what i understand tesla does it a little different i look forward to you know some future for my children or whatever where like you literally just figure it out online maybe you show up to test drive once and then you just buy and it doesn't matter if you buy from the one on the west side of town or east side of town it's all the same like yeah every other thing we go to and do so yeah exactly can't come soon enough all right and i'm sorry for all the car dealership people that i probably you know very frustrating to or putting out of business. But you know, yeah, if you're a car dealer, message us, we'll take we'll take your hate and read it on the air for everyone to hear. Oh, no, no. All right, we're jumping into news.
Starting point is 00:06:36 All right. So my first article talking about programming things here, cognitive load is what matters. So this is actually a GitHub article, which is a kind of interesting way to do it. So the person can update it over time. And this is something that I picked, I will say a little selfishly because I say this all the time at work and there'll be a link in the show notes. But just talking about in development,
Starting point is 00:07:00 that one of the most, in software development, one of the most overlooked things is how much cognitive load you're putting on the engineers and this can come in a couple forms and they kind of call them out like sometimes that could be the business surrounding the code um i work at a place that does like lots of software engineering so it's it's a pretty like respected part of the business but a lot of businesses you may only have one, two, three, four software engineers building something. And so the bulk of the business isn't really the development or the software. And so there's all sorts of like, hey, you really need to work with
Starting point is 00:07:34 the accounting team and this team and that team and who talks to whom and this is using this database and this person is using this other one. All of that has to be loaded into your head. The other thing I always bring up that people forget is you go to college and you learn all this stuff about compilers and programming and you show up on the job and they give you like an unconfigured Linux box. And maybe you've only ever used Windows or you've only ever used Mac OS. And now it's like, hey, here's this Linux box. Congratulations.
Starting point is 00:08:00 Like configure it, install your packages. Oh, you need to like, you you know move your pc to a different location and plug in all the cables but wait a minute like is that really like a requirement listed for the job that like you know bash or zsh or any of these shell scripting and so the amount of cognitive load that goes into um programming and and then you know, I think kind of the bulk of the article is when you're selecting paradigms, abstractions, where to divide stuff, how much to copy and paste,
Starting point is 00:08:33 there's a balance to be reached. If you make, you know, stuff too small and modular or too monolithic, both introduce a lot of cognitive overload. So there's the balance there but actively thinking about how do we build this in a way that makes it easy to reason about so things that are similar are kept together um and not abstractions for abstraction's sake is an important thing the one that always gets up like we program in c++ and it's like everybody
Starting point is 00:09:03 aha, look, I wrote the ternary operator, which is where you write a conditional expression like in an if statement, say a greater than b, and then you put a question mark, and then the first thing is what happens if it's true, and then you put a colon, and then you put the second thing. Okay, just write an if-else statement. I've been programming for a long time. Yes, I can parse it, but I don't parse it as fast, and I always have to double check: wait a minute, true is the first one, false is the second one. And secondly, when we bring on a new person, maybe you say, hey, we're all
Starting point is 00:09:36 experienced people. And this comes up in a later news article, but when new people come on, maybe they aren't used to that. So, and this matters for some languages more than others, using the subset of the language that is most comprehensible and most widely used gives you benefits, in that it's very easy to bring new people on and it's very easy to teach. We were working with a code base where they put in a ton of macros, and all of the variable names are single-character names, all this stuff. And they may be efficient in programming,
Starting point is 00:10:11 but you can't work with that code base. And all this is about thinking about cognitive load, and really programming in a way that reduces it. Yeah, I've seen so much bad behavior with macros and metaprogramming in C++. It just makes it so difficult. It's like, oh, I'm using the double hashtag thing to concatenate, in this call to a preprocessor macro, to create a class, and I'm doing that call 30 times, and so none of these classes actually exist anywhere, so it messes up the IDE and everything. At some point it's like, look, if you were to copy paste this class 30 times, well, first of all there should be a better way of doing it, but let's say you just couldn't figure it out. Just copy paste the class 30 times. At least that way someone can see the point and the pattern and all that. If you hide it behind a
Starting point is 00:11:18 preprocessor macro, it just becomes like a tar pit that people get stuck in. And I think there are other languages where the same thing happens, or even the way of doing object modeling. I've seen stuff in Java with getting into dependency injection and reflection at runtime, and it makes it really hard to say: hey, I have a call site here, what code is being executed? How many hoops do you have to jump through, if you don't know off the top of your head, to find the actual line of code invoked?
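The two C++ habits described above, the ternary operator and classes stamped out with the "double hashtag" (`##`) token-pasting operator, can be sketched roughly as follows. This is only an illustration under assumed names: `max_ternary`, `max_if_else`, `DEFINE_HANDLER`, `WidgetHandler`, and `GadgetHandler` are hypothetical and do not come from any code base mentioned in the episode.

```cpp
#include <string>

// Ternary form: condition ? value_if_true : value_if_false
int max_ternary(int a, int b) {
    return a > b ? a : b;
}

// Equivalent if/else form, which the hosts argue is quicker to read.
int max_if_else(int a, int b) {
    if (a > b) {
        return a;
    } else {
        return b;
    }
}

// Hypothetical macro: ## pastes the argument onto "Handler" to form a class
// name, and #name turns the argument into a string literal.
#define DEFINE_HANDLER(name)                            \
    class name##Handler {                               \
    public:                                             \
        std::string describe() const { return #name; } \
    };

// These two lines create WidgetHandler and GadgetHandler, yet neither class
// name is written out anywhere for a reader or a text search to find.
DEFINE_HANDLER(Widget)
DEFINE_HANDLER(Gadget)
```

Calling `WidgetHandler{}.describe()` would return the string `Widget`. The class names exist only after preprocessing, which is one plausible reason such patterns are hard on IDEs and on anyone trying to locate a definition by name.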
Starting point is 00:11:49 And how confident are you that you found the actual line that is being invoked? And if that is not a trivial amount of time, you should be really, really thoughtful about trying to reduce that
Starting point is 00:12:01 as much as you can. Yep. Yep. Yeah, totally. Yeah. Dependency injection is uh in java i remember um i forgot what the tool was called but there was this tool where like you would maybe it was part of the spring framework but you would pass in um uh like a config and it would create classes from this config and feed them into your main function. And, and so it's like, oh, like, this object is already filled in. What the heck,
Starting point is 00:12:33 like, how do I change it? Oh, I it's actually in an XML file. It's like, what, you know, it just became really hard to use. Moving on to diffusion models are real-time game engines um i i don't know i don't necessarily agree with the premise of this but the thing about this that blew my mind is the video so basically what people did is they took these like image generators and text image you know that whole like a vision transformer kind of technology and they basically used it to say if i have this image of a screenshot of a video game in this case they use doom because it's open source and people use it on everything so given this screenshot of a Doom game, or maybe even like a handful of screenshots
Starting point is 00:13:29 of the last second of this Doom game, and given a command like shoot a gun or go forward or whatever, then generate the next screenshot of the game. And so they basically are trying to encode the entire doom game like the art assets the engine everything into a diffusion model um and so then they proceed to play this game and it just gets it just gets kind of weird like uh you know points of discontinuity are really difficult for these models like so for example um when you fire the weapon sometimes it doesn't fire um you get the fire animation but like you know that the world isn't affected um and other times it is um sometimes like you walk
Starting point is 00:14:20 and a door is just kind of flickering in and out of existence. The whole thing is wild. We're not doing it justice here over audio, but go to the show notes and watch the video. I've seen a lot of weird things with Doom, like they passed it through DeepDream and you have this really weird, trippy view of it. But this is really taking it to the next level. The laws of the game are not first order anymore, so sometimes the world works the way you expect, and sometimes it doesn't. And it's just kind of insane to watch. There's a couple games that kind of do something that it reminds me of.
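The loop described above (take the last few frames plus the player's command, generate the next frame, repeat) might look something like the sketch below. It is not the paper's implementation: `Frame`, `Action`, `NextFrameModel`, and `rollout` are hypothetical names, and the trained diffusion model is replaced by an arbitrary callable.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <vector>

// In this sketch a frame is just a flat buffer of pixel values.
using Frame = std::vector<float>;

enum class Action { Forward, TurnLeft, TurnRight, Fire };

// Hypothetical stand-in for the trained model:
// (recent frames, player action) -> next frame.
using NextFrameModel = std::function<Frame(const std::deque<Frame>&, Action)>;

// Autoregressive rollout: every generated frame is fed back in as context
// for the next step, and only the newest context_len frames are kept.
std::vector<Frame> rollout(const NextFrameModel& model,
                           std::deque<Frame> history,
                           const std::vector<Action>& actions,
                           std::size_t context_len) {
    std::vector<Frame> generated;
    for (Action action : actions) {
        Frame next = model(history, action);
        generated.push_back(next);
        history.push_back(next);
        while (history.size() > context_len) {
            history.pop_front();  // older frames fall out of the context
        }
    }
    return generated;
}
```

In a loop shaped like this there is no separate game state, only the retained frames, so anything that scrolls out of the context window has nowhere to be remembered. That is at least consistent with the flickering doors described here.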
Starting point is 00:15:13 So I actually have seen this one before. And like you're saying, it's kind of weird, because it looks like you're playing the game 90 percent of the time, but then occasionally something happens that is different. You're looking one way, you turn around and face the other way, and then when you turn back the world is completely different. You're not in the same place, or the world has shifted, because it's generative, so it's just doing the whole thing on the fly. So you almost convince yourself it's right. And there's a game, I think it's Superliminal, I don't know if you've played this, that has stuff like this. You can go
Starting point is 00:15:55 through a door, and then when you come back through the door you're not in the same place. It's sort of, I don't know, call it non-Euclidean or whatever. And it kind of plays with perspective. And so if something changes size in your perspective, it changes size in real life as well, in the game as well. And so there's all these things you can do where it sort of subverts your expectation of a game modeling reality. And it doesn't mean it's not fun to play.
Starting point is 00:16:23 It's just, it can be very fun. I don't know, in these cases it probably wouldn't be very fun. But it is just this weird twist to your brain. Yeah, totally. I'm pretty sure you can't get much beyond the first or second level in this. I mean, this is some academic exercise. But yeah, I just think there's something kind of wild. It's definitely something I've never seen before, where someone tries to imitate an entire engine of a game with a neural network and then see what happens. People with too much time on their hands? No, I'm just kidding. Well, you know, people with too much money in their hands, right? Because of the AI hype cycle.
Starting point is 00:17:08 Oh, there we go. Yeah, I mean, maybe that should be episode 178. So, well, get some sent our way. We'll try some of this stuff. Yeah, that's right. We're currently raising; we have a new model. It's called CLAWED, C-L-A-W-E-D. It absolutely does not just call Claude under the hood. Please give us 50 million dollars. Oh, yeah, I've seen some of
Starting point is 00:17:33 these accusations going around about people just calling other models and saying, yeah. Okay, anyway, we're moving on. This is the AI gossip, I guess. Yeah, that's right. My next article is kind of related to the previous one: your company needs junior developers. And so again, I guess a bit biased, this is something I push for all the time at work. It's a strongly held personal belief. The idea here: at a lot of places, I think you can fall into the trap of, hey, we want to hire a senior person. The inside joke that goes around is you read resumes and it's like, hey, you need 10 years of experience with a tool that came out two years ago. It's like everybody needs to hire the utmost expert in the field, and in practice that's not really true most of the time.
Starting point is 00:18:19 that isn't it's not to say that there's never a good idea to hire senior people but i feel by default most folks have this belief that like hey hey, we're high velocity, the way to like, keep this up is just keep adding people at the same level of the team or above the level of the team to bring them up. And in some cases, that may be true. But I have found that often, there's a number of first and second order effects that come from hiring junior developers you know and i'm not even talking about like budget you know money reasons i'll leave all that aside hiring junior people and being able to train them like helps them sort of uh they they're more uh plastic towards you know how stuff is being done and learning if, you know, they're inquisitive.
Starting point is 00:19:06 And it's not to say that they won't express their own opinions or bring changes. They often will and sort of say, hey, I learned it this way or why not this? And they ask a lot of questions and through those questions, I think can come learning. But they don't come in with as many preconceived notions of this is the only way to do things, or every time I've done it is this way, or hey, we first need to reconform everything you've done to my way of doing it. And then we'll start making progress. But also putting the, you know, more senior developers into the mode of teaching and by teaching, they're learning, they're having to synthesize concepts and simplify stuff, like we were talking about before cognitive burden if you're the only one coding maybe it doesn't matter so much because the context is always loaded in your head but the
Starting point is 00:19:49 minute you have to share that context with someone else who's also making changes, now you start to help each other and improve. Also, I think making sure that the expectation is known to senior engineers that there will be junior engineers, and that they will bear some of the responsibility for helping to onboard, teach, and show them. Not that they have to become managers, but that everyone on the team has a responsibility to help those who are trying to learn or asking questions. And I think that, again, these first- and second-order effects, that you have someone who is contributing, who can grow and can become the team member that you need, also give a sense of a narrative arc
Starting point is 00:20:31 to the team right like that we were here and now we're there and like look at the growth and so for just a host of reasons i think it's a really important thing to be hiring junior engineers to bring them on to have them grow and learn. And then also sometimes projects go through phases and the project of your phase may be, hey, it's not insane growth right now. We're not able to hire a ton of people. Because of that, some senior engineers may need to or want to move on to other companies or other projects or other things because there isn't a lot left for them to sort of grow, but there's still a lot to be done. And this is when junior engineers can come in. And for them, that work is super suitable. Whereas for a senior engineer, it may be difficult to justify, you know, the continued sort of like, you know Patrick and I are really invested in growing engineers. I mean, the whole point of this show when we started it so many years ago was that and sort of the future of the field. I think that engineers have been hit with some really big whammies. The first thing they've been hit by is just exploding college tuition costs, right?
Starting point is 00:21:57 Then you get hit with the pandemic and all of your mentors being, you know, not being around. And now there's all the layoffs and companies saying, I was in a meeting. There was a Austin like engineering leadership meeting that I was at. And this person was saying that, like, hey, you know, we just need to get rid of like 80% of our company like Twitter did and just keep like the 20% super productive people. And the thing is, that will work in the short term for your company. But there's a tragedy of the commons there where like you're kind of like, it's like when you buy roses from HEB or from your grocery store and they look great, but they don't have roots. And so, you know, that these roses will not live very long. They're already dead. Yeah, that's right. Like, like the cat is dead in the box. And so,
Starting point is 00:22:58 and so like, yeah, you're without roots, like it's just not, it's just a temporary facade. And so without roots, it's just a temporary facade. And so I do worry a lot about junior devs just not having a safe place to land. What do you think is going to happen, Patrick? Do you think that if people keep focusing on senior devs, do you think that people are going to get dissatisfied? There's going to be a generational gap or uh what i mean leaving aside uh fundamental pivotal shifts from you know ai or whatever um i think it's a slightly different topic i think this is where uh and and this will get somewhat i guess a little maybe controversial but i think this is one of the
Starting point is 00:23:45 benefits of having capitalist markets is like i think companies that will hire junior devs and invest in training them will have a market advantage and so you know two companies participating in the same so that company that you mentioned just as an example like i want to lay off all these junior people they're not really pulling their own weight right now. We overhired our fault, whatever. Those people are going to go somewhere. And if not, they might self-organize and kind of learn their own thing and build up a competitor or do something better because they kind of see what wasn't working at the previous company. And so this is where I'm a believer that, you know, having the competition, companies that do this will end up winning out over companies that don't. Companies
Starting point is 00:24:29 that are willing to teach and hire, again, like I said, I think they have better practices. And that's not to say that, in some extreme example, like you said, somebody couldn't do this and in the short term have it work amazingly. But then, and hopefully it doesn't happen, what if you have a team of three super high functioning people working at 110% capacity? That doesn't make sense, I know, but they're working at 100% capacity, super optimal, and you think you're a genius. And then one of them, I mean I work with people who are getting, not that old, but up in age, has a heart attack or a family emergency, and they literally stop working for six months. And there's no redundancy, there's no backup. Is your company going to close up shop
Starting point is 00:25:12 because you were that's kind of i guess the equivalent in financing is people do leverage they use all this leverage and it works great until it doesn't and so you seem really genius i guess it's equivalent what you're saying about the roses already cut. Yeah, it's really genius. You're making all this money on leverage. But as soon as, you know, the tide turns even a little bit, you crash really fast because everything is magnified. Right, right. And even if like you might say, well, I will let other companies hire and and skill up junior engineers and then i will just hire those people when they become senior engineers but i think the problem with that is you need to transfer
Starting point is 00:25:52 knowledge within your company you know i don't think it's not as fungible as people would hope and so uh and so that i don't think that strategy will work either yeah i mean you still have to teach them this specific most companies still have quite specific things so even if you say i will let other people train them and then we hire them you don't have a culture of training people because you don't do it very obviously but hire them back and then you're going to have a difficulty getting them to adapt and so you're going to go through a lot more pain and suffering yep yeah exactly um all right so um oh on to mine so mine are a couple of libraries that let you do um real-time text-to-speech which is now becoming basically an open source commodity which is kind
Starting point is 00:26:42 of amazing so so you can actually um and i'll add a third one to it that just came out uh this morning but um you can actually have a conversation with an ai um an open source ai in real time uh and uh that just blows my mind um like you can you can uh talk to it it'll talk back you can interrupt it and say hey i want to talk about something else like all these things that we saw in that um open ai demo about maybe six months ago a year ago are now becoming a commodity um and uh here's here's a few examples of libraries that do this um but yeah, I think the progress is just absolutely astounding. I mean, it's actually really hard to keep up with the technology.
Starting point is 00:27:32 It's just moving so fast. But this is a really big milestone, and a few different libraries have kind of reached it in more or less the same time. With the speed of some of this and like you were joking about earlier but i kind of find myself thinking about it sometimes companies there's a lot of untapped potential and sort of developing companies around some of these things and just you know replacing uh stuff within bounds but i think defining those bounds is really important like you know making sure that there are some i don't call them guardrails or whatever around how some of this stuff is done
Starting point is 00:28:09 but also it's got to be somewhat difficult to you know say you're going to build and do an integration with one of these you know models that came out and then to just like three or four months in potentially a completely different way of doing or a different architecture just moves the goal posts and then you got to decide like do we ship on this old thing or do we like completely redo and move to the new thing and then you end up in this constant cycle of like uh you know oh man i just got completely you know blown away by whatever the next release from whomever is yeah right i mean you know a good example of this is like the text to image stuff where you know wasn't really that commercially practical because the images were just not that good i mean they
Starting point is 00:28:52 were fun to look at but like you'd obviously know that this was computer generated and and it basically you know within the span of 12 maybe 18 months it went from from that to like you literally like maybe okay so i take that experts can tell but like you and i can't tell if you see a photo of a person like a profile photo we can't tell if that's a real person or not and and that is just wild i mean if your product um was was kind of based on one thing like you it just opens up a ton of product opportunities to your point yeah there was all that stuff like oh they're not good at text earrings won't be symmetrical you know too many fingers and you're right they were kind of funny to look at and they looked plausible and then now you're right i see pictures posted
Starting point is 00:29:40 in some of the sites like oh look at this ai trash and i'm like i i it you know if i really stare at it it looks off but there's a lot of commercial photography that's airbrushed or whatever that also can look off or not natural so i think it's like you said it's a sort of uh you know whatever maybe some experts saying these aren't good but they actually are very convincing a lot of times um and it's really only the fact that the situation is so improbable like oh a cell phone snapshot of you know an elephant parachuting into my backyard like okay probably that didn't actually happen so it must be fake but not it's not actually something in the scene that tells you that's not realistic right right and uh the thing too is yeah i mean if if if you
Starting point is 00:30:29 if you're in the mindset of i wonder if this is real or not because someone told me it's fake well then you're gonna have a very discerning eye right but if you're just on the internet browsing you're not you're not you know you're not in that type two mode of thinking, you're just consuming content. And, and I think you will just, I think many people will just look at fake images and just think they're real. Uh, like all these stock images, you go to like some random startup that just paid like someone on Fiverr to build a website. Right. And there's like stock images of people just having fun around a computer i bet if if you replace those with fake images most people wouldn't
Starting point is 00:31:10 know yeah and there are some ethics problems i will say as well looking at everything with the discerning eye causes its own problem because there's plenty of actual images taken that look implausible or seem unrealistic as well and they are totally legit for a variety of things so you could enter conspiratorial whatever like yeah i saw a video of this bad thing happening but it's it's just fake like it doesn't look real there's no way that really happened oh interesting yeah i mean that's a that's a different direction i hadn't thought about is if there's a real event, like we're at 9/12 today. Yesterday was 9/11.
Starting point is 00:31:49 So like, yeah, I mean, if I'm reading Twitter and I see like some big terrorist attack, my first reaction might be like, that's a fake picture or something. You know, it's kind of wild. All right. So moving on to book of the show. Patrick, what is your book of the show? Book of the show is not a book. Cheater. Warning, tool of the show is not a tool either. So yeah, we got to rename these sections, man. All right. I thought I must have talked about this before, but I Googled through the show notes and I didn't see it. So if I have, I apologize.
Starting point is 00:32:27 I have never heard of this, so I think you're good. So this is one of those, I know it's like fairly popular now, but I run across fairly popular, like million subscriber YouTube channels all the time that I've never heard about before. It's just a million isn't what it used to be or something. I sound like an old man now. All right, anyways, so Thought Emporium is a guy who does i guess like just call like hobbyist science research my hero like i wish i wish i could do this stuff like i wish i could
Starting point is 00:32:55 make money just poking around like random stuff in my lab and like i want to say close to probably half of the stuff he does like i have at one time or another done played with or sort of so it's very fascinating there's a bunch of uh sort of one-off projects um and long-term projects that he's working on including topics like genetic engineering um he was recently trying to grow neurons to play doom so how do you like grow neurons in a petri dish hook them up to electrodes hook that up to a computer and make them play doom and not your garage that's not exactly right because this garage is kind of fancy but not in a multi-million dollar biomedical research lab either the one i linked is just him and his friends like sometimes come up with memes i think they released one yesterday
Starting point is 00:33:45 it was like you know everybody always jokes what did you take this photo with a potato and so he actually built cameras out of potatoes um not like a potato hooked up to a camera like the whole camera is a potato um and so like the potato is a pinhole uh and you use the flesh to sort of get a negative image okay anyways the one i linked to in the show notes you can check it out is a single use emergency thermite hot dog cooker so what would you need it to do if you needed to cook exactly one hot dog like in an emergency situation in a single use container and it's just like him figuring out like in a moderately serious engineering like how would i what is the best way to do this like how would i use heat to cook the hotdog should i cook it like by itself should i cook it in a liquid
Starting point is 00:34:39 you know is this actually going to work how safe is this you know could i make the could you make a product out of it and so if you've never checked them out uh definitely do there was a lot of biohacking initially like you know i'm going to inject the genetically modifying materials into myself and do crazy stuff um nowadays a little less but another one that my kids really liked was, is mummy yummy? So let's go through the full process of making like a mummified chicken from the store, grocery store, like legit. We're going to go into reading hieroglyphics and making our own hieroglyphic statements. Do the full mummification process with authentic old recipes and then taste various parts of the mummification
Starting point is 00:35:27 fluids and things and determine if mummy is yummy is that healthy i mean or is that sanitary so nowadays it would be formaldehyde and you definitely can't eat formaldehyde but it turns out in ancient egypt they didn't have formaldehyde so you actually could use most of the stuff so i won't spoil but if you think it is intriguing to find out if mummy is yummy you're in the right place okay yeah this is awesome i think i think uh this is one of these things where you know 40 hours of my life later i'm like how did i get on this youtube channel uh mine's the opposite so i watch this and then i find myself on ebay like how could i replicate what this guy is doing like i have so many questions yeah the youtube to amazon pipeline is strong um my book of the show is a project that i started with a friend of mine um called
Starting point is 00:36:23 novel minds and the idea was there's all these public domain books. A lot of them are required reading for kids. Even the ones that aren't, a lot of them are really good, like War of the Worlds, like classics, right? So I thought, could I use AI to turn them into sort of, I think the word is graphic novels. I mean, not, not literally a comic book, but just a picture book. So there's just a lot of pictures, you know, the cover is
Starting point is 00:36:50 totally redone. And so we did this. So, um, by the time folks listen to this show, there'll be 30 books on this site. Uh, I think at the time of the recording, there's maybe seven or so. But you can basically grab them all for free. If you're on Apple, it will just open in Apple Books naturally. If you're on Android, you'll have to get an EPUB reader, but there's plenty of them for free. And you can read all these classic books with tons of pictures. We tried to do roughly a picture every like 400 words or so. And there's a bunch of technology. I actually gave a talk on how this works that's recorded. And I posted the talk on LinkedIn if folks want to watch that for more technical depth.
Starting point is 00:37:42 But if you have kids who are in school who are have to read these books, you could give your kids or your friends, your friends, kids, copies of these versions, all the text is exactly the same. We we haven't changed any of the words. We just added a ton of pictures. I did not know about this until right now. So I'm actually scrolling through one at present. This is really cool. The style is very consistent. The main character doesn't seem to be, I'm looking at around the world in 80 days,
Starting point is 00:38:10 but the main character doesn't seem to be super consistent between them. But there used to be books when I was younger, they were like, there's like every other page, there was like a hand-drawn illustration. And then they were the like synopsis. Like they were kind of, I don't know,
Starting point is 00:38:24 10% of the full book or whatever they were like concise oh yeah yeah so okay a couple of things so one uh the version that patrick you're looking at right now is is an older version oh okay we did fix the consistency um in the newer version uh we basically um used something like a vector database. Foreshadowing. But we kind of encoded a representation of each of the characters. And then when a character is referenced, we added that representation back in. And so now the version that will be out when we publish the show actually is pretty
Starting point is 00:39:06 consistent um but yeah you're right the the first the very first version um was like thematically completely inconsistent like one picture would be black and white the next picture would look like a cartoon next picture would be like a hyper real, you know, and so getting consistency. The second phase was where we got a pretty consistent art style, but the era was all over the place. So like, you know, it'd be 80 days around the world, which was record, which was set around the 1890s, I think. But there would be like a high speed rail picture or a guy with like a rocket launcher you know and so what we did there was uh we also injected the era we figured out the era by scanning the book and then we inject that into every single image prompt um so yeah it's been a really really fun exercise um my goal is ultimately to get these into the kindle store although it's pretty difficult because there's a lot of scammers
Starting point is 00:40:14 who try to push either fake ai books or public domain books onto the store and so we're running over a couple of barriers just trying to prove to amazon that we're legit um but uh yeah that's it's a fun little book project and i've had an awesome time actually reading these books some of them are a little difficult to read because they're rather old uh but uh a lot of them are awesome this is cool i'm trying to determine how many of these i've read before so can i have another request can you like run the generator to give me summaries so it's like all the pictures but only one percent of the words so that i could uh claim to have read all of these books much more
Starting point is 00:40:57 quickly than actually read all of these that's an interesting idea you know i wonder you know the reason why uh the reason why i didn't change the words is i felt like if it is required reading through someone no yeah there's a test or something bad idea but yeah but but you're right i mean for people who aren't reading it for a grade or anything um we could definitely like summarize every paragraph or do something like that there was a so what i want okay so this is really dumb okay we can move on but uh growing up there was this show wishbone where it was a little dog do you did you watch this show no i never heard of it okay oh my gosh all right this is good all right i'll make it short so just this dog who goes to same same idea like various famous you know plays or books and the the dog is like the you know he goes into the story and then he you know it's like a dog is the main character or whatever in the book
Starting point is 00:41:51 but they're just acting out the book in a 30-minute show and so you get the arc of you know the book the main idea the gist in 30 minutes for kids just like teach them about different worlds and different you know a moral story or whatever but i okay that won't make sense if you know but that's what i want is like the the wishbone like the hey i could hold a conversation with someone and unless we were doing like a book club they would believe that i had read the book and i will have benefited from the you know analogy or the you know whatever the you know book is trying to get without all of the dialogue that happens between you know characters that like you said if it's required reading it may be very important to read i'm not trying to denigrate old books but i also know that i won't commit that
Starting point is 00:42:36 time yeah it's a great idea um maybe by the time you publish a show i'll be able to do that i mean one simple thing would be you know take each chunk which is around i think it's 4 000 characters take each chunk uh that's being used to make an image and summarize that chunk down to like 200 characters all right i'm excited this is cool i don't know how you're going to become rich and and uh become the financier of the uh the biomedical research lab but uh we'll work on that i'm talking to was it tony stark all right time for tool of the show yeah if you want to help us oh sorry go next
Starting point is 00:43:20 tony stark you can subscribe to us on patreon uh patreon.com slash programming throwdown okay that was actually legit sorry i was skipping over but that is really cool i'm excited i have to move on because i feel bad about myself for not contributing to society speaking of which my tool of the show not being a tool and a way to uh not waste time to uh improve your uh your brain is escape simulator i think i mentioned before playing some escape games i recently actually picked this one up in a humble bundle i i didn't you know kind of think too much of it but it's actually like it was pretty cool and just so much content in this game um so escape simulator it's kind of what it says on the box it's an escape
Starting point is 00:44:02 room simulator and they themselves have a lot of content for levels that you can play but they also have a big fan base that is making games of various complexities for you to play and i just i really enjoyed doing it i've been i've been playing i've not come anywhere close to beating it because i refuse to use hints um and my daughter sometimes sits and we play together and uh you know i personally not my kids are pretty smart but like way to be impressed with your kids is have them shout out an answer to something and you can't figure out why that is the answer you try it and it works and then have them try to explain to you like how the puzzle had that answer. They can't like, they just intuit it,
Starting point is 00:44:46 but it happens way too much. And then you kind of feel bad and you're like trying to look at the hints to like understand why. Anyways, so Escape Simulator, definitely check it out. Works on macOS, Linux, at least in Steam Deck it worked and Windows.
Starting point is 00:45:02 Yeah, this looks super fun. I'm definitely gonna grab grab this is it like uh um like a split screen or is it internet co-op uh internet internet thing so yeah that's the only thing i couldn't figure out i just have one steam account i do have two computers but once you're playing on one steam library won't you play in another one so we just sort of sit together and like collaborate on one a one player game got it yeah that makes sense very funny could you also just buy it under two steam accounts and that would probably work fine you know there's a new thing not to digress too much
Starting point is 00:45:37 but steam has like a new feature now called family mode where um you can actually say like this computer is a family member and then what will happen is you can both play different games at the same time if you try to play the same game at the same time steam says hey uh you're gonna have to pay uh you know to buy a second copy but then when you do that that you can play together so like it kind of handles all of it this is good tips okay all right i'll be back in a few minutes all right my tool of the show is the cursor ide and you can go to cursor.com to grab this so this is basically uh github co-pilot and all of these things just on steroids at the next level. So when you start up Cursor, it asks if you want to import all of your VS Code extensions. So it's literally a fork of VS Code. I imported all my extensions.
Starting point is 00:46:38 They all imported just fine. The thing about Cursor that really blew my mind was it has like autocomplete that spans the entire file. So, for example, I had this app that was a WSGI app. It was basically a kind of a REST API app and i was converting it to flask and so one pattern that i had to go from is the old app you would return a json object and it would have a status key and then the http status code so like 200 if it was good 400 if it was not found etc etc um and then and then your your payload and so most of the time in these little handlers i was returning an object where you know json object for like status 200 payload is data data was some object i created right
Starting point is 00:47:41 in flask you don't return anything from your handler, the handler gets passed this response object, and you do response dot status 200. And then response dot data payload, right. And I changed it, there was there was a file that had a bunch of these handlers in it. So I changed it in one as i was changing it it let me tab complete to do that like it like like i was i was putting i put like response dot status like about halfway through the word status and it figured it out it's like what you want to do is status 200 data payload and then delete the return thing and i hit tab that was cool then it like zipped down like a third of the way through the file and it's like you want to do it here too don't you hit tab and like i just hit tab tab tab tab tab and it just did all of it and that blew my freaking mind like i've used you know copilot and amazon q and stuff and they they they do like they do have some of those magical moments like within you know pretty close to where your cursor is but i've never seen it like this where it's like oh you also want to do this thing at this file over there
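A minimal sketch of the before-and-after shape of one handler in the refactor described here. The `Response` class and the handler names are hypothetical stand-ins written for this illustration; they are not the API of Flask or of any other framework, and the payload is made up.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Response:
    """Hypothetical response object that a framework passes into a handler."""
    status: int = 200
    data: Any = None


def get_user_old(user_id: int) -> dict:
    """Old style: return a dict that carries the status code and the payload."""
    data = {"id": user_id, "name": "example"}
    return {"status": 200, "payload": data}


def get_user_new(user_id: int, response: Response) -> None:
    """New style: fill in the response object that was passed in; return nothing."""
    data = {"id": user_id, "name": "example"}
    response.status = 200
    response.data = data


old = get_user_old(7)
resp = Response()
get_user_new(7, resp)
print(old["status"], resp.status)
print(old["payload"] == resp.data)
```

Every handler in the file needs the same three changes (set the status, set the data, drop the return), which is the kind of repeated edit the autocomplete was chaining.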
Starting point is 00:48:56 uh it just like it just like kept chaining these things to do and uh um and it got it all perfect like you know there was weirdness around the brackets and all of that you know is is kind of complicated because i was adding this handler and so it's actually adding a bracket but then deleting the return bracket and it like figured it all out flawlessly um uh it was really impressive um again i think we've talked about this in the past if you're at a company make sure you get the right approval we were just talking about junior engineers we don't want you to get fired if you're a junior engineer and you get fired that will make us sad get the right approval but this is an amazing tool that's very cool is it in the works for it to be able to do it across files in your project as well like if you wanted to refactor all of your files to you know use flask or whatever
Starting point is 00:49:52 like is there a way to like invoke it to do that or not yet i don't think so i'm trying to remember uh it definitely jumps around in the same file. I don't think it has a way to skip other files yet, but it might. I've just started using it a few days ago. Okay. And what is the zip down? So like it zips down and then like,
Starting point is 00:50:17 if you don't want to do it, it just returns you back to where you are. I feel like I'll get a whiplash. Oh, interesting. That's a good point. You know uh yeah i guess i'm not totally sure so the times where it did this i took it took their suggestion every time i think maybe if you don't take the suggestion maybe you could hit the up arrow on the key and get teleported back i sense a new developer metric which is how often do you accept the ai suggestion oh definitely yeah i'm sure if they're not measuring that they should be and yeah i guess you're right they should pass that measurement along to the company no i'm saying you're like
Starting point is 00:51:01 as an engineer you get a score which is like the more you accept the ai thing the like worse off you're not adding anything they could just replace you with the ai so you need to like then there's this whole gamesmanship right because then we'll figure it out and we'll just write what the ai said but change one character so it doesn't count and then our metric for you know novelness goes up yeah well you know it's interesting i mean i think that um yeah i mean you know the next level up from this would be for me to say you know convert this kind of like raw lambda handler into flask and it would have just figured out because the thing that i had to provide was the knowledge that when you return things flask doesn't care like it's it's it's not going
Starting point is 00:51:53 to use that return value and so like just like running this code on flask means that you would just error 500 everything right and so like i had to figure that out and then once i started typing the solution then it was done um but yeah i mean at some point maybe it will that's that's almost like a segue uh or another thing is you know junior engineers now also have to compete with ai uh or or maybe maybe not maybe it makes them on a more level playing field. That's still kind of an unknown. I don't, yeah, I don't have a, I think this is a watch this space. No, no, that's not how they say that. This is like a watch the future.
Starting point is 00:52:35 I don't know what will happen. I think where I'm sitting now, but this has turned out to be wrong before, but like you said, it's sort of the intent still needs to be provided and and sort of guided but people who lean in on it probably see a big improvement before everyone else and then everyone else sort of adapts or not and it we call it the same thing but it's not really the same thing so you know it's like programming a website is technically programming and programming an operating system is also
Starting point is 00:53:11 technically programming uh they're not really the same thing like some parts are similar but lots of parts are different i think we'll just end up with more divergence so there's some things you're doing like you're saying hey i'm looking for another web framework that may be faster and it suggests to you to go to you know whatever um but then there's also times where you might just be straight up like this is something that's not been done before and it needs to be written for the first time yeah yeah that makes sense. It's going to be a wild time. I think these tools will save people a lot of busy work, which is good. But it also means that you will have to memorize the code base faster
Starting point is 00:53:58 because you're not going to be sitting there staring at it while you do busy work. All right, on to the topic of the show vector databases so patrick have you ever used a vector database no all right but you have used a database yes um do you uh have you ever put uh like anything other than text in a database you know like have you ever used a database for images or yes what did you use for that do you just stuff an image into my sql or what protobuf oh you put protobufs in the database oh yeah really so but that's just like garbage, right? Like, you know, it's unintelligible. Wow, bro. You're coming at me.
Starting point is 00:54:50 No, no, no, no. I don't mean it's garbage quality. I mean, when you look, when you run a query, it's unintelligible. Yeah, you can't index it sort of properly. Yeah, you're right. Yeah, because like when you query, I think if you certain databases they let you have like a column be a jpeg and the database kind of knows it's a jpeg and so when you query it it will actually show you pictures oh that's cool can you query oh no no it's not i was gonna say could you
Starting point is 00:55:19 query stuff about it like i'm looking for jpegs where the subject is red uh you could with a vector database oh okay here we go i'm ready i'm here i'm leaning in all right um so yeah my mind is still is still blown by the so okay hang on i gotta dive in a little bit so if you put a protobuf in a database right yeah then i guess most languages have protobuf support but you would have to like you know you know uh interpret that protobuf before you do anything with that data so like you know your your sql query goes straight to something that like converts the protobuf to like yeah i mean you can use just more like a key value store where the value rather than a
Starting point is 00:56:05 relational database right so it becomes more like hey i have these keys and then just have the blobs got it okay cool uh okay so let's build a scaffold up to vector databases so um okay people like starting from the beginning, how do computers represent things? So, for example, you open up Notepad, you start typing some letters in Notepad, you hit save, and it saves as a text file. And so, not dealing with Unicode or anything like that, just putting that aside. A text file is basically a bunch of bytes where, you know, each byte represents one letter. So, the letter, there's an ascii table you can look up and so i don't remember off top of my head i think isn't like 97 a lowercase a or something it's been a long time but uh but but you know each letter and each symbol you know
Starting point is 00:56:58 equal sign apostrophe they all map to numbers and in at least the ascii format there's less than 256 of these uh or uh different things and so that's one byte in a computer and so your document is you know a byte for each of these letters including the spaces and all of that um so that's how you know traditionally how you could represent text in a computer so a computer obviously doesn't have a concept really of the letter a you know in hardware um that's the way it works um and for images the same kind of thing you know there isn't like any lithography uh is that the right word? But, you know, there's no like dark room in your computer. Like there's no physical images in your computer. The way it works is, you know, each picture is broken up into pixels, which is like a really tiny square of an image, hopefully so tiny you can't even tell it's a square and then for each pixel the color
Starting point is 00:58:06 could be represented as you know the amount of our red green and blue representation in that pixel and so you can imagine um you know if you have a picture a bunch of these triples your red green blue triples and you have one for every pixel um and you know the width and the height of the image so you know like what that whole rectangle looks like and that's how pictures are represented in the computer um does that make sense anything you want to add to that patrick yeah i mean i think you're right i think what you're trying to point out is the computer doesn't have a native understanding of most of these concepts so it just has operation bytes and like data bytes and the data bytes need a an encoding they need a scheme and the humans are the
Starting point is 00:58:58 one that imbue meaning to those yeah that makes sense so um okay so then we'll get into like let's say compression so for example you know you might have a document that has the same word a bunch of times it's like chapter one chapter two chapter three and so every time you have the word chapter that's going to be uh what seven bytes right to store all of those characters um but if it's always like chapter chapter chapter section section section there's like a lot of repetition there and so you could actually have a smaller file by taking advantage of all of this repetition and so one example that most people are taught in undergrad and I've long since forgotten is Huffman encoding.
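The byte-level picture described so far can be checked directly in Python. This is only an illustration: the tiny all-red image and the repeated "chapter section" document are made up, and `zlib` is used as an off-the-shelf compressor to show the effect of repetition, not as the specific scheme discussed in the episode.

```python
import zlib

# Text: each ASCII character is stored as one byte.
text = "chapter"
encoded = text.encode("ascii")
print(list(encoded))   # one number per letter
print(len(encoded))    # 7 bytes for the 7 letters of "chapter"
print(ord("a"))        # 97, the ASCII code for lowercase a

# Image: a width x height grid of (red, green, blue) triples, one byte per channel.
width, height = 4, 2
pixels = [(255, 0, 0)] * (width * height)   # an all-red image
raw_image = bytes(channel for pixel in pixels for channel in pixel)
print(len(raw_image))  # width * height * 3 bytes

# Repetition: a document that keeps repeating the same words compresses well.
document = ("chapter section " * 200).encode("ascii")
compressed = zlib.compress(document)
print(len(document), len(compressed))
```

The numbers only mean "letters" or "colors" because the program reading them agrees on the encoding, which is the point made above about humans supplying the meaning.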
Starting point is 00:59:49 I don't, do you remember Huffman coding? Yes, yeah. It's something to do with like trees, right? Like you build a tree where- And the prefixes. Yeah, maybe you explain it, because I've totally- No, no, no, no, that was good.
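For reference, the tree-building idea behind Huffman coding can be sketched in a few lines. This is a simplified per-character version written for illustration; the function names are made up, and real compressors add details such as storing the tree alongside the data.

```python
import heapq
from collections import Counter


def huffman_codes(text: str) -> dict:
    """Build a prefix code in which frequent symbols get shorter bit strings."""
    counts = Counter(text)
    if len(counts) == 1:
        # Edge case: a single distinct symbol still needs one bit.
        return {next(iter(counts)): "0"}
    # Heap entries: (frequency, tie_breaker, {symbol: code_so_far}).
    heap = [(freq, i, {sym: ""}) for i, (sym, freq) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        # Merging two subtrees puts one more bit in front of every code in them.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]


def encode(text: str, codes: dict) -> str:
    return "".join(codes[ch] for ch in text)


def decode(bits: str, codes: dict) -> str:
    reverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:   # no code is a prefix of another, so this is safe
            out.append(reverse[current])
            current = ""
    return "".join(out)


sample = "aaaaaaaabbbc"
codes = huffman_codes(sample)
bits = encode(sample, codes)
print(codes)
print(len(bits), "bits instead of", 8 * len(sample))
```

The two least frequent symbols are merged first, so they end up deepest in the tree with the longest codes, while the most common symbol stays near the root.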
Starting point is 01:00:01 Yeah, yeah. So you like look in your file and you have to look at the whole file at once and basically decide like how am i going to assign bits to the most common prefixes of numbers and each branch of the tree represents sort of like going down that way and so it can represent a bucket of the prefixes that get concatenated together and ultimately that way most the most common pieces use the fewest bits and then the longer deeper parts of the tree encode the the less common pieces yep that makes sense and so um and so you could do something similar for images right you could have some kind of delta
Starting point is 01:00:43 encoding or other kind of things that take advantage of the nature of the image. But you know, for images, really, you know, you want to capture the sort of nature of the image, right? So it might not matter that like this single pixel in this one part of the car is like a little bit more blue in the image than it was in real life like it doesn't that doesn't really change the nature of the image and so now you start getting into what are called lossy compressions or lossy encodings so for example one of the most common uh is where you take these um uh various kind of repeating patterns so such as like a fourier cycles or cosine cycles you know the cosine wave right you take these kind of waves
Starting point is 01:01:38 and you compose them at different frequencies and amplitudes on top of each other so um um so for example uh if you imagine like a zebra so zebra is this like really sharp kind of wave where it's it's white and then it's black and it's white and it's black and so if the camera zooms in on the zebra now the sort of frequency is getting smaller right because those stripes are getting bigger the camera zooms out or the zebra's running away or something now the frequency starts going up and so what you do is you you you have a ton of different uh waves of different frequencies and you have there's an algorithm that tries to figure out how you can compose like add many of these waves together to faithfully
Starting point is 01:02:36 reconstruct the image and if you can do that then you only need to store like the definitions of those waves like what were the amplitudes and the frequencies of those waves um instead of storing every single pixel i mean i think just for completeness there you need infinitely many but if you have some loss criteria like some amount of thing you're willing to give up, then you can basically chop and only take the top X most important frequencies. And then you'll get back most of the image. Yeah, yeah, exactly. That's a really good segue to the next part.
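The "keep only the most important frequencies" idea can be sketched on a single row of pixels. This is an illustrative toy, not how any particular image format works: it uses a naive discrete Fourier transform in pure Python, a made-up striped row, and an arbitrary choice of 7 kept coefficients.

```python
import cmath


def dft(signal):
    """Naive discrete Fourier transform: one complex coefficient per frequency."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def inverse_dft(coeffs):
    """Rebuild the samples by adding the waves back together."""
    n = len(coeffs)
    return [
        (sum(coeffs[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def keep_top(coeffs, count):
    """Zero out everything except the `count` largest-magnitude coefficients."""
    order = sorted(range(len(coeffs)), key=lambda k: abs(coeffs[k]), reverse=True)
    keep = set(order[:count])
    return [c if k in keep else 0 for k, c in enumerate(coeffs)]


def rms_error(a, b):
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5


# One row of "zebra stripes": 8 white pixels, then 8 black pixels, repeated.
row = ([1.0] * 8 + [0.0] * 8) * 4
coeffs = dft(row)

exact = inverse_dft(coeffs)                   # all 64 coefficients
lossy = inverse_dft(keep_top(coeffs, 7))      # only the 7 strongest waves
print(rms_error(row, exact))
print(rms_error(row, lossy))
```

With every coefficient kept, the row comes back essentially unchanged; with only the strongest few, the stripes are still recognizable but the edges are softened, which is the trade a lossy format makes.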
Starting point is 01:03:18 So now we'll jump into embeddings right so um we talked about ways to kind of explicitly represent something so if you have a document and you want to store it you explicitly represent each individual character um but many times what we want is to know like the overall essence of a document, like, is this document about cars? Uh, or is this a long document or does this, is this document at a really high reading level or at a kindergarten reading level? So these sort of like soft attributes, uh, those sound like kind of yes, no questions. But really, there's a lot of ambiguity there, right?
Starting point is 01:04:09 It's like there's certain degrees of like, how much about cars is this or, or how much of this document is at the kindergarten reading level. So, so, you know, these kind of like soft attributes. And then doing things like give me all the documents that are roughly a yes for this question. So it's kind of like a lossy attribute system. And the best way that we have today to be able to answer questions like that is through embeddings.
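A rough sketch of what "roughly a yes" can look like once every document has an embedding vector. The three-dimensional vectors, the document names, and the 0.8 cutoff are all made up for illustration; real embeddings come from a trained model and have far more dimensions. Cosine similarity is one common similarity metric, used here as an assumption.

```python
import math


def cosine_similarity(a, b):
    """Similarity of direction: close to 1.0 for vectors pointing the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up embeddings; a real system would compute these with a model.
documents = {
    "sedan review": [0.9, 0.1, 0.0],
    "engine repair guide": [0.8, 0.3, 0.1],
    "banana bread recipe": [0.0, 0.2, 0.9],
}
query = [1.0, 0.2, 0.0]   # stands in for "is this document about cars?"

scores = {name: cosine_similarity(vec, query) for name, vec in documents.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
roughly_yes = [name for name in ranked if scores[name] >= 0.8]
print(ranked)
print(roughly_yes)
```

Instead of a hard yes or no, each document gets a score, and the cutoff decides how much ambiguity to tolerate.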
Starting point is 01:04:40 So the goal of an embedding, it's not directly to recover the original document, as we talked about before. So you know, with Huffman encoding, the goal is to compress something so that you can later decompress it and get back the original thing. With an embedding the goal is not to get back the original thing the goal is to be able to answer questions around nearness you know if i have a query or or or a you know a concept or the query might even be another document right what are the things near that query um semantic nearness that's the goal and so because of that embeddings can be you know very very lossy because they don't really have to reconstruct the original content and so there's there's two kind of key ways that embeddings are created so one way is through contrastive methods so the idea here is i might say, these two documents were written by
Starting point is 01:05:49 kindergartners. So I know I'm going to be asking a lot of questions about reading level and reading skill for these documents. So I have documents created by kindergartners, I have documents created by adults, and I randomly pick two documents. If they were both created by the same age group, then I want their embeddings to be closer together. If they were created by a mix, if one document was by a kindergartner and the other document was by an adult, then I want to pull those two documents apart in this space. And so at the very beginning of your embedding process, all the documents are just randomly projected, and so they're just scattered all over the place. But after you ask yourself and answer this question many, many, many times,
Starting point is 01:06:46 you end up with, hopefully, you'll end up with sort of two, well, you'll end up with many spaces, many clusters. But each of those clusters will be either a kindergartner document cluster or a grown-up document cluster because of all of this pulling and pushing. Does that make sense? Yeah, I mean, I think the pulling and pushing... Now, does that happen in a subset of the dimensions, or could they still be far apart in another dimension? Yeah, so when you pull or push, you do it in all the dimensions. So it's almost like there's little
Starting point is 01:07:25 thrusters on these two documents. They're points in embedding space, and those thrusters are pulling them together in every dimension simultaneously, or pushing them apart. And so when you do this embedding, you know, a single dimension has no real meaning, because the only meaning was constructed from the distances, the pairwise distances. So you can't say dimension three is how many times they said the word dog or something, because it's all a composition of all these dimensions. So you need all of the things you're trying to do this
Starting point is 01:08:07 with up front, though? You need, OK, so when you're training, you provide a bunch of training examples, and those have to be labeled, right? So you have to know this one is a kindergartner's, this one is an adult's. Once you've trained the model, then you can give it new documents. And you could even ask it, like, is this new document a kindergartner or an adult document? And then it would use a vector database to look at, like, what are the nearest neighbors?
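The pull-and-push training described above can be sketched in a few lines of Python. This is a toy under loud assumptions, not how a real system does it: it moves the embedding points directly instead of updating a model's weights, and the `rate`, `margin`, group sizes, and three-dimensional space are all made-up values for illustration.

```python
import math
import random

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_step(a, b, same_group, rate=0.1, margin=4.0):
    """Nudge two points in every dimension at once.

    Same group: pull them toward each other.
    Different groups: push them apart, but only while closer than `margin`.
    """
    d = distance(a, b)
    if d == 0:
        return
    if same_group:
        scale = rate
    elif d < margin:
        scale = -rate * (margin - d) / d
    else:
        return
    for i in range(len(a)):
        delta = (b[i] - a[i]) * scale
        a[i] += delta
        b[i] -= delta

# Hypothetical labeled documents, each starting at a random point.
random.seed(0)
labels = ["kid"] * 10 + ["adult"] * 10
points = [[random.uniform(-1, 1) for _ in range(3)] for _ in labels]

# Ask the "same age group or not?" question many, many times.
for _ in range(5000):
    i, j = random.sample(range(len(points)), 2)
    contrastive_step(points[i], points[j], labels[i] == labels[j])

def mean_distance(same):
    """Average distance over all same-group (or all mixed-group) pairs."""
    pairs = [
        (i, j)
        for i in range(len(points))
        for j in range(i + 1, len(points))
        if (labels[i] == labels[j]) == same
    ]
    return sum(distance(points[i], points[j]) for i, j in pairs) / len(pairs)

print("same-group distance: ", round(mean_distance(True), 3))
print("mixed-group distance:", round(mean_distance(False), 3))
```

Every update touches all coordinates together, which matches the point that no single dimension carries meaning on its own; only the pairwise distances were ever trained.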
Starting point is 01:08:38 And if most of those are kid documents, then that's one way that you can answer that question. Got it. So that's one way: you give it a bunch of pairs of things that you know should be close together or should be far apart. There's another way to do it where it's kind of like a fill-in-the-blank, right? So here's an example. Let's say I have three documents written by a kindergartner, and I give a model the first one, and I give a model the third one, and then I say, hey, generate the second one. Maybe you ask a kindergartner, you know,
Starting point is 01:09:34 write a book about dogs, write a book about goldfish, write a book about cars. And you present the AI that you're training, you know, two of the three books, and you say, generate the third one. And in the beginning, [...] kindergartner, and you go to the AI and say, hey, you were wrong, here's the actual answer. And they do the same thing with adult books, right? And it turns out that if all you're doing is kind of filling in the middle, whether it's the middle of a sentence, the middle of a collection of work, or whatever it is, because you're just filling in the middle and you already have the middle, you know the right answer, you don't actually need any humans in the loop. So I could just take all of Wikipedia, and I could give every sentence to some AI that I'm
Starting point is 01:10:38 training and say, hey, fill in the beginning of the sentence, or the end of the sentence, or the middle, and I know the right answer. And so when it doesn't give me the right answer, I kind of push it to give the right answer. And so these are called self-supervised models. If I give it the end of something and tell it to reconstruct the beginning, then when I actually try to use this model in real life, I have to give it the end of something. But if I'm using this to do something like ChatGPT, I can't do that. Like, you can't say, all right, you're given the end of what ChatGPT wants to say, give me the beginning. It doesn't work that way. And so usually you're forced to go in the other direction. So you're
Starting point is 01:11:32 like, you know, you purposely hide the ending completely. You artificially hide the middle and you give the beginning. When the model tells you the middle, you correct the model. And so because you're never able to look at the future, these are called forward models, because they can only take the past and predict the future. And so GPT, without the chat part, is an example of just a pure forward model. So, you know, given a bunch of things you said, this is your context, what's the next thing that's going to be said? And it's trained on, you know, terabytes and terabytes of text. And then it generates that token. But along the way of generating that token, it constructs an embedding. So it constructs an embedding first, and then it uses that embedding to predict the next token. And so if you're not interested in that second part, or you don't need it, you can just take the first part.
Starting point is 01:12:47 And now you have an embedding of the words that have been said so far. And so this is why, as a byproduct of making these chatbots, the companies can also offer an embedding service: they need that internally, so they can offer it externally. And it's valuable for the reasons you're describing. Now if I have text, maybe it's not a chat, I just have text and I want a representation of it that I can ask questions about, then I can get the embedding and do whatever I want with it. Yep. Yeah, exactly right. And there are vision transformers that work very similarly, where you take a picture, you hide part of the picture from the AI, and you ask the AI to recreate that part of the picture. When it does, you have the real answer because you are the one who hid it yourself.
Starting point is 01:13:42 And you compare the AI generation with the reality and you correct the AI. And similar to language, a vision transformer has a step where there is an embedding representation. So you can take a picture and run, like, half of the vision transformer, and now you just have this embedding where the picture is going to be [...] car and asked the AI to redraw the car, all those pictures are going to have very similar embeddings. But the picture where you deleted a person who is in the background of your family portrait, and you wanted the AI to fill that in with something,
Starting point is 01:14:42 all those pictures will occupy a different space in the embedding. So, yeah, I mean, this gets pretty difficult to explain over the air, but any questions about that part of it? No, I mean, I think it makes sense. So the idea is you're getting a vector, which I guess we talk about like it's a list of numbers, and those numbers represent where in this space it is. And the hope is that through these processes you learn the ways in which things can be similar or different. Right, right. And so, OK, so now let's talk about how to correct the AI. So let's say the sentence was, you know, the quick brown fox jumps over the lazy, what is it, lazy
Starting point is 01:15:40 dog or something? Yes. Yeah. So let's say the AI generates "the brown fox jumps over the lazy dog." So it got everything right except for the word quick, right? But because it skipped the word quick, every word after that is kind of wrong, right? It's like shifted. So you have to develop all sorts of different metrics to be able to say how wrong the AI was in a way that helps it learn well. That's on the training side. It turns out that with embeddings, you're often trying to use them for a different purpose. For example, you might want to fill in the missing part of the image when you're training. But then when you're actually using the
Starting point is 01:16:32 model, you want to group all of your images together. So someone can say, hey, this is a picture of me at the beach, find more like this one, which is different than when you're training it. So we call this training-serving skew. It's when you use a model for a different use than how it was trained. And so what that means is, for the similarity metric on the embedding, there are a lot of unknowns there. Like, it might be that the model was just trained to fill in parts of the image, and so it actually does a terrible job of finding similar images. It might just be a coincidence, right, that it does a good job. So there's not a lot of theory at this point, and when there's not a lot of theory, the best thing you can do is try a lot of things. And so you want to have a variety of different similarity metrics on the embeddings.
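To make "a variety of different similarity metrics" concrete, here is a small Python sketch of three widely used ones: Euclidean distance, dot product, and cosine similarity. The two-dimensional vectors and their names are invented for the example; real embeddings typically have hundreds or thousands of dimensions.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance: smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dot_product(a, b):
    """Unnormalized similarity: larger means more similar."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Angle-based similarity that ignores vector length: 1.0 means same direction."""
    return dot_product(a, b) / (math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b)))

# Hypothetical embeddings chosen so the metrics disagree.
query = [1.0, 1.0]
docs = {
    "same_direction_far": [5.0, 5.0],
    "different_direction_near": [1.0, 0.0],
}

nearest_by_euclidean = min(docs, key=lambda name: euclidean_distance(query, docs[name]))
nearest_by_cosine = max(docs, key=lambda name: cosine_similarity(query, docs[name]))

print("Euclidean picks:", nearest_by_euclidean)
print("Cosine picks:   ", nearest_by_cosine)
```

The two metrics return different nearest neighbors for the same query, which is why it is worth trying several and checking the results against what the product actually needs.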
Starting point is 01:17:32 So a common one is just how close something is in Euclidean space. But then there are a bunch of others too. We don't need to dive into all of them, but there's a ton of different similarity metrics. And so, you know, once you've chosen a similarity metric, then you can say, given a point in this space, where the point can have a document on it or not, find the nearest neighbors, find the nearest documents to that point. And then you can look at those results and see, OK, does that match what I was hoping for from a product perspective? So the point that you're making and getting neighbors for is like a query, but the query doesn't have to be text. Like you said, it could be a picture, it could be whatever. It gets the embedding, and then the goal, well, maybe
Starting point is 01:18:29 I'm jumping the gun here, but the goal of the vector database is to help you efficiently find the things closest to your query. Yeah, yeah, exactly right. So just like we talked about with the Fourier transform, as Patrick said, you'd need an infinite number of waves to get a perfect resolution, but you're often willing to tolerate error. Similarly, if you're willing to tolerate error in finding out which documents are the closest, then you can do what's called sub-linear approximate nearest neighbors. So the idea is, if I have a million documents, I don't need to check the embedding of all million of them to get the nearest neighbors to a point. I can use a variety of different data structures to say, OK, I'm extremely confident that I have the nearest neighbors, but I'm not 100% confident.
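One way to picture approximate search is a bucket index: assign every vector to its nearest "centroid", and at query time scan only the few buckets closest to the query. The Python sketch below is a toy version of that idea under stated assumptions. The class name `BucketIndex`, the random choice of centroids, and the `probes` parameter are all hypothetical, and production systems use far more refined structures.

```python
import math
import random

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class BucketIndex:
    """Toy partition index: each vector lives in the bucket of its nearest centroid."""

    def __init__(self, vectors, num_buckets, seed=0):
        rng = random.Random(seed)
        self.vectors = vectors
        # Assumption: centroids are just randomly sampled vectors, not trained clusters.
        self.centroids = rng.sample(vectors, num_buckets)
        self.buckets = [[] for _ in range(num_buckets)]
        for idx, vec in enumerate(vectors):
            self.buckets[self._nearest_centroids(vec, 1)[0]].append(idx)

    def _nearest_centroids(self, vec, count):
        order = sorted(range(len(self.centroids)), key=lambda c: dist(vec, self.centroids[c]))
        return order[:count]

    def search(self, query, k, probes=1):
        """Scan only the `probes` closest buckets; neighbors across a boundary can be missed."""
        candidates = [i for c in self._nearest_centroids(query, probes) for i in self.buckets[c]]
        ranked = sorted(candidates, key=lambda i: dist(query, self.vectors[i]))
        return ranked[:k], len(candidates)

def exact_search(vectors, query, k):
    """Brute force: compare the query against every stored vector."""
    return sorted(range(len(vectors)), key=lambda i: dist(query, vectors[i]))[:k]

rng = random.Random(1)
vectors = [[rng.random() for _ in range(4)] for _ in range(2000)]
index = BucketIndex(vectors, num_buckets=40)
query = [0.5, 0.5, 0.5, 0.5]

approx, scanned = index.search(query, k=5, probes=3)
exact = exact_search(vectors, query, k=5)
overlap = len(set(approx) & set(exact))
print(f"scanned {scanned} of {len(vectors)} vectors, {overlap} of 5 true neighbors found")
```

Raising `probes` scans more buckets and recovers more of the true neighbors at the cost of speed; setting it to the number of buckets degenerates into the exact brute-force search.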
Starting point is 01:19:34 So for example, maybe there's a partitioning system. And if you fall exactly just to the left or right of a partition boundary, you might miss what's on the other side of the partition. And that might happen one out of every 10 million times and only affect one result. And so you're more than happy taking that penalty in exchange for just a massive, massive speedup. So imagine, we're talking about going from, like, a million seconds to search to six, you know, or something like that. So that's basically what the vector database does. So with the vector database, you provide your own embeddings, it doesn't do that part, but you jam a bunch of embeddings into this
Starting point is 01:20:26 vector database, it creates all of these data structures, and then you can give it vector queries, and it will tell you the nearest neighbors. And the vector database is responsible for managing sort of the dynamics there. So for example, if you start removing documents and adding different ones, at some point the data structures in the vector database might be kind of stale. So I'll give you an extreme example. Let's say you add a ton of vectors, but only the first dimension isn't zero. So the vector database is basically going to create a bunch of partitions around that first dimension and ignore the rest, right? If you take those documents out and insert a bunch of documents where only the second dimension isn't zero, but you still use the old partitioning scheme, they're all going to fall in the same
Starting point is 01:21:25 bucket, and now you don't have that speedup anymore. And so the vector database is responsible for knowing when to rebalance and re-index, just like your hard drive has to occasionally rebalance the B-trees on the folders of your hard drive. What's old is new again. Yeah, exactly. So, yeah, and then beyond that, just all of the traditional database things, you know, backups and restores, all of that stuff that we've come to appreciate from databases, vector databases have to provide that as well. So you mentioned Euclidean distance, or you hear about cosine similarity or whatever. If you want to change the metric, is a different clustering needed per metric,
Starting point is 01:22:16 or is it like one is good for all of them? Yeah, it's a good question. You basically need a different... Well, the answer is always going to be, it depends on the data, right? But definitely, you can construct datasets where each similarity metric needs its own clustering, and you can show that there is a dataset where any clustering can only be good in one of these three different similarity metrics. So it's possible. Now in practice, I wouldn't be surprised if these database systems just use Euclidean distance for the clustering, and then you pay a slight penalty if you're using something else. I think for 99% of cases, that would be fine.
Starting point is 01:23:11 Makes sense. Yeah, and so this is definitely in the category, kind of like authentication, of things you don't want to write yourself. Oh, come on. It's going to be pretty gnarly.
Starting point is 01:23:27 You know, even some of the top vector databases are only now starting to become mature. So, for example, I've used Milvus in the past, which is an open-source vector database, and we had issues where, when we tried to back up the database, it would crash and we would lose everything. And so, you know, I would definitely say it's still at the phase where you probably want to store your embeddings just in a regular database as well. It's still kind of early. But, you know, every month it gets massively more mature because so many people are using it. So Milvus is good. Pinecone is an enterprise alternative. Pinecone is a lot better, but you're going to pay for it. You get what you pay for there. Another one that I really appreciate is pgvector,
Starting point is 01:24:29 which is an extension to Postgres that lets you have vector columns in an existing table. And that's really nice, right? Because you could have, you know, a row, like, imagine you're storing houses, you're Zillow or something, right? So you could have a row, and the row has an ID, the address of the house, the number of bedrooms, all of this explicit information. And then in that same table you just add a column that's, you know, the embedding, and that column is backed by pgvector,
Starting point is 01:25:06 and it works pretty naturally. This is actually another thing. You know, with approximate nearest neighbors, it assumes that they're all valid, right? But for example, let's say you only want three-bedroom houses that are approximate nearest neighbors. That's actually a really hard thing to do, because you might pull the nearest thousand neighbors, but they're all two-bedroom. You have to pull another thousand, another thousand, another thousand, until finally you have enough to fill up the three-bedroom limit that you set. So there's a lot of complexity there. But yeah, these database systems are really good at handling all of that for you. Yeah, I would love to find an excuse to use this. It always sounds really interesting and cool. And I know I'm obsessed with tinkering.
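The filtered-search problem raised above (only three-bedroom houses among the nearest neighbors) can be sketched in Python as "keep pulling the next batch of neighbors until enough of them pass the filter". Everything here is illustrative: the `houses` rows, the `filtered_nearest` helper, and the tiny `batch_size` are assumptions, and the full sort stands in for whatever index a real database would use to hand back neighbors in order.

```python
import math

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def filtered_nearest(rows, query, k, predicate, batch_size=2):
    """Pull neighbors in batches until k rows pass the filter or the rows run out.

    Returns the matching rows (nearest first) and how many rows had to be pulled.
    """
    # Stand-in for an index that yields neighbors in nearest-first order.
    ranked = sorted(rows, key=lambda row: dist(query, row["embedding"]))
    matches, fetched = [], 0
    while len(matches) < k and fetched < len(ranked):
        batch = ranked[fetched:fetched + batch_size]
        fetched += len(batch)
        matches.extend(row for row in batch if predicate(row))
    return matches[:k], fetched

# Hypothetical table: explicit columns plus an embedding column on every row.
houses = [
    {"id": 1, "bedrooms": 2, "embedding": [0.10, 0.10]},
    {"id": 2, "bedrooms": 2, "embedding": [0.20, 0.10]},
    {"id": 3, "bedrooms": 3, "embedding": [0.90, 0.80]},
    {"id": 4, "bedrooms": 2, "embedding": [0.15, 0.20]},
    {"id": 5, "bedrooms": 3, "embedding": [0.50, 0.50]},
    {"id": 6, "bedrooms": 3, "embedding": [0.30, 0.20]},
]

query = [0.1, 0.1]
results, fetched = filtered_nearest(
    houses, query, k=2, predicate=lambda row: row["bedrooms"] == 3
)
print([row["id"] for row in results], "after pulling", fetched, "rows")
```

Because the closest houses are all two-bedroom, the loop has to pull every row to find two matches. That is the cost being described: the tighter the filter, the more neighbors get fetched and thrown away.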
Starting point is 01:26:07 So I'd love to find a use, but just haven't had one yet. Yeah, I mean, for me, the one use case that I've found outside of work has been nearest neighbors on my photos. It's not too hard to use one of these open source models, embed all of your photos, and then given one of your family photos, you find all the nearest ones. Google Photos, these other things will do it for you.
Starting point is 01:26:37 So it's really more of a fun exercise for you to do than something that's providing a lot of unique utility. But that's been a fun thing that I've been able to do with it. Cool. All right, that is a ramp-up on vector databases. If you're doing anything with vector databases, let us know. Give us a shout-out. Go on the Discord. The Discord is starting to pick up some steam. A lot of really interesting discussions about careers, changing jobs, should I be a consultant after I leave my company? A lot of really interesting discussions there, so check it out. OK, I will take that as a personal message.
Starting point is 01:27:22 well i got you covered, Patrick. I will be there, but it'll be a bot. That's right. Patrick's AI will be there and you can interact with it. This is awesome. I learned a lot. So thanks, Jason. And thanks, everyone, for listening.
Starting point is 01:27:39 Cool. All right, everyone. Thanks so much for supporting us on Patreon. We really appreciate it. And we will catch you all later. Have a good one.
