Programming Throwdown - Image Processing

Episode Date: April 1, 2019

If you use ASCII encoding, the entire Oxford dictionary is about 5 million bytes. A single 4K image contains 25 million bytes. If you watch a 4K video running at 60 frames per second, over 300 dictionaries' worth of data are going through your TV every second. Let that sink in for a moment. One of the most magical areas of engineering is image processing. Everything from the way images are stored to advanced AI techniques like face recognition has mind-boggling complexity. In this episode, we scratch the surface of image processing, but if an area from this show interested you and you would like to learn more, let us know! Show notes: https://www.programmingthrowdown.com/2019/04/episode-88-image-processing.html ★ Support this podcast on Patreon ★

Transcript
Starting point is 00:00:00 Programming Throwdown, Episode 88: Image Processing. Take it away, Patrick. One of the things I've noticed has become a hot new... well, it's not that new. A motif. Is the Netflixification of everything. Yep. That's my twenty-letter word for the Netflix model. Yeah, I remember when Steam first came out, I said there is no way I'm going to buy a game on Steam where they could take it away from me.
Starting point is 00:00:41 That seemed like an absolutely insane thing to do. And now that's all I use, so what a huge change. Well, now it seems like the world has changed even more, and I guess I have one thing in mind we'll talk about here in a second. So one thing is owning digital assets, and I'm not trying to get into a blockchain discussion, but with things like Steam or Kindle, once you buy something, assuming the service stays up, you can continue to use it. I actually find myself reasonably okay with that. I understand the risks are probably not as well understood as we think they are, but I kind of understand it, and things are much, much more stable now
Starting point is 00:01:20 than they were. I mean, when Steam first came out, I wasn't confident that it would last. But at this point, I think we can feel safe that Steam will be there in our old age or whatever. Okay, well, you have more confidence than me. Really? Okay. Yeah, well, the thing I want to talk about, though, is the next step, the one I've not adjusted to as much, which is the renting of everything. On Netflix, if I stop paying, I lose everything. And now there's the same thing from both Sony and Microsoft. I think I've tried the Xbox one; what is it called? I can never remember. I'm not a big video gamer, so I'm probably not the best person to talk about it, but they have... Xbox Gold, I think? Where you get games that you can rent. Yeah. The one where you pay is Game Pass or something, isn't that what it's called? Oh, okay.
Starting point is 00:02:13 And you pay a flat monthly fee, and then you're able to, yeah, like Netflix, rent. It's even described as Netflix for video games. You can rent as many games as you want; whatever game you want to play, it installs down. But as soon as you stop paying your monthly fee, you lose all of those. And they can lose access to games too, right? So if their licensing deal goes away,
Starting point is 00:02:36 then they can no longer offer it. And to me, I don't know, I guess it just depends on the kind of gamer you are. For some people, I'm sure this is a great deal. I don't remember the exact price, but say it's ten dollars a month: if you routinely spend well in excess of $120 a year on video games, money that would instead go to this kind of service, it probably makes a lot of sense. But I only play video games which are typically many years old and less than $20, and I don't buy them but maybe two or
Starting point is 00:03:10 three times a year. Then all of a sudden I'm spending more, but I'm getting a lot more. Yep. I think there's also a psychological part of this. Think about a photo album, right? You take the photos, you put them in the album, and in that moment you really enjoy it, and then maybe you take it back out 20 years from now just to look at it. And there are some games like that, where you just go back and look at the achievements, or maybe you just go back and play Pac-Man for fun. And I guess when you go into that rent mode, you kind of go into it saying, okay, this isn't
Starting point is 00:03:50 permanent. This achievement isn't permanent, because eventually I'm going to stop this service and it's going to disappear. But now the current trend is about to go further. I'm thinking about the Google Stadia announcement; previously, both Microsoft and Sony have said they're going to do it as well, and there have already been existing services and pilots to try this. The one I always think about is the OnLive service, which went belly up and then got bought, I think ultimately by Sony, for their technology. But this is where, instead of having a powerful video game console in your house, you have a cluster farm of powerful video game systems.
Starting point is 00:04:31 And then all you're doing is streaming the graphics from those to your house, and the controller input to the cloud. The so-called thin client, I guess, is the paradigm. And there, owning things becomes even more tenuous, because even if I simply bought a game, there are recurring costs of running the servers in the cloud, right? And this becomes kind of a weird arrangement. And so, if this
Starting point is 00:04:56 really is going to be the future, you almost have to go to a purely rental model, in my opinion. Because, like you said, if in 20 years you decide to play your Pac-Man game, but Pac-Man needed this specialized hardware, which only ever lived in a server farm, what are you going to do? You need to pay a monthly recurring fee or whatever to have them keep running those servers and keep the lights on. And so that has to sort of be bundled in with the thing.
Starting point is 00:05:22 What is it? It's exciting, but with a little trepidation as well, because you're subscribing to keep paying this fee, and as soon as you stop, you don't really have anything. You can't decide to save your budget this month and just play whatever you already have. Yep. I think one really good thing about this, though, is that the economics of being a video game developer have just been getting worse and worse, right? Think about it: a video game costs the same price it did then. I remember I bought Mario 3 for the Nintendo (my parents bought it; I was, I don't know, eight years old) and it was something like 60 bucks. And now a game costs 60 bucks.
Starting point is 00:06:06 And think about zero percent inflation on the price of video games after 30 years or something. Right. So I don't see how these companies can really stay in business like that, and I think moving to a recurring model might be able to sort of save the industry. I will say, the technology of it seems really fascinating. Again, I've not tried this myself, but everyone reports it works, in their opinion, pretty well. But the idea that I use my controller on Wi-Fi, and it goes into the cloud, and it makes decisions based on that input, and then sends the video back to me... if you can really get that working smoothly, that seems crazy. It sounds physically impossible. I mean, I've not sat down and
Starting point is 00:06:57 done the physics, I'm not sure. Maybe for certain kinds of high-speed-reaction video games it's not going to work. But people have tried it and said it looked pretty cool. There is always going to be this risk of unpredicted lag, where all of a sudden everything just acts really badly, but you experience that today in online games anyway. Yeah, that is a good point. But the difference with online games is that they do something called dead reckoning. You can even see this: you can unplug your router, and the game will still let you walk around as if you're connected, but everyone else is
Starting point is 00:07:37 frozen. So even in networked games, where you might have a lot of lag, the feedback you get for your own movement is still instant. And now that's gone, right? The one thing that I think might make this work is if the server is actually running in your ISP. So if these companies have a deal with, in our case Comcast, but you might be on Vodafone or Bright House or whatever it is, and all you have to do is reach your ISP and back, then I think it can be really low latency, maybe even 10 milliseconds. Yeah, I guess we'll see, but there have been more and more announcements. Everybody,
Starting point is 00:08:22 or at least all the major players, it sounds like, are going to make a pass at doing this. We'll see what happens. I mean, all the TV manufacturers made a pass at making 3D TVs, and I was thinking the other day, can you even buy a 3D TV anymore? Yeah, I remember when that was a big deal. I think 3D TVs died relatively recently, because I remember about a year ago that was still a thing, and now
Starting point is 00:08:46 they're just completely gone. I know you can do 4K TVs; that's pretty standard now. Yeah, that's just a straight upgrade. But yeah, I think 3D TVs are DOA. Should we have a moment of silence? Getting on a tangent here, but have you tried the self-contained VR, like the Oculus Go? Yes, or I think there's one from Google. Yes, I have tried the Oculus Go. I got it as a Christmas gift, combined in with the rest of my family, for our father. The only thing I don't like about the Go is that it doesn't have translation, which is actually the most important thing. So just to explain what that means: you can turn your head from side to side, and it makes sense.
Starting point is 00:09:40 That feels natural. But if you take a step forward, the system doesn't know you did that, and so nothing will happen. But it's very close. I don't know anything about this kind of physics or how this would work, but if they could make a self-contained system, maybe with antennas sticking out or something, that would track your movement as well, that would be really cool. Because then you could, let's say, map your house, and then you could run around your house playing laser tag or something wild. Right. Well, I think the new HoloLens demos I've seen and the Magic Leap work like that, though you're still tethered. But if I recall correctly, I don't think there's an external camera.
Starting point is 00:10:28 There are cameras on the goggles, and your first step is to look around the room so that the cameras can figure out the pose and your position from looking at the room and identifying features. Yeah, that makes sense. I think that would be really, really fun. Maybe a little dangerous, like you fall over the couch or something, but it could be really fun.
Starting point is 00:10:51 Well, those two are augmented reality. I always mess up the names. There's augmented reality, and then what's the other one? Virtual reality. Mixed reality. Oh, I see. Yes.
Starting point is 00:11:01 And then there's also virtual reality. Yeah, I haven't actually tried anything other than just pure virtual reality, where you're on the moon or something. So wait, bringing your tangent back to the original topic: the Oculus Go doesn't plug into your computer, which to me is a huge difference. One, it's a lot cheaper than a computer rig plus the VR goggles; and two, since it doesn't plug into the computer, that obviously limits the current graphics capabilities. But has anyone talked about thin-client VR? Or does any lag, I imagine, just make that kind of horrendous? I don't know, to be honest. Well...
Starting point is 00:11:41 One thing I've been wondering is whether the thin client is maybe too extreme, but there could be some middle ground there. Right. So, for example, something could pre-render parts of the image and get that down to you within, say, 20 or 30 milliseconds, and all the client has to do is handle that one rotation. So basically you get all the geometry; the only thing the client has to do is handle the player movement in the very short term, and everything else can be done off-site. Interesting. So, yeah, it's like you can just roughly distort the image for the couple meters of movement. Yeah, yeah. But that's some really cool stuff coming down the pipe.
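That last idea (ship a frame larger than the viewport and let the client warp it to the freshest head pose) is roughly what VR runtimes call reprojection, or timewarp. Here is a toy sketch in Python, with the frame as a 2D grid and head rotation reduced to a simple pixel offset; the function name and setup are invented for illustration, not taken from any real runtime.

```python
def reproject(frame, view_w, view_h, dx, dy):
    """Crop a view out of an over-rendered frame, shifted by the head
    movement (dx, dy) that happened after the frame was rendered remotely."""
    # Clamp the shift so the crop stays inside the over-rendered frame.
    max_dx = len(frame[0]) - view_w
    max_dy = len(frame) - view_h
    dx = max(0, min(dx, max_dx))
    dy = max(0, min(dy, max_dy))
    return [row[dx:dx + view_w] for row in frame[dy:dy + view_h]]

# A 4x4 "frame" rendered server-side; the client only shows a 2x2 view.
frame = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11],
         [12, 13, 14, 15]]

# Head moved by (1, 1) since the frame was sent: shift the crop locally.
print(reproject(frame, 2, 2, 1, 1))  # [[5, 6], [9, 10]]
```

A real implementation warps by rotation on the GPU rather than cropping, but the latency-hiding idea is the same: the client applies a cheap local correction while the heavy rendering stays remote.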
Starting point is 00:12:27 All right, time for news. I think I've got the first article. This is a kind of interactive game for explaining multi-threading issues. I played the first couple levels and I passed them. Woo! Nice. Not that hard. But it's a series of tutorials going through various issues with deadlocks and multi-threading problems. When I saw the article come across, I thought, how would you even do that? If you've ever done these capture-the-flag-style
Starting point is 00:13:04 competitions or online puzzles, you usually have to download something and run something, so I thought it was going to be a fairly complicated setup. But I was pleasantly surprised: it just shows you two pieces of code, with a thread running on each, and asks how you would step through them in such a way as to cause a problem to occur. And it tries to teach you a little bit along the way. So you might have a critical section or a lock or a mutex, and you step one thread up to it, then the other, and come to understand that some statements are actually multiple instructions in reality. So in C++ you might say counter++, but counter++ is actually several instructions:
Starting point is 00:13:58 first you get the value of the counter, then add one to that value, and then write it back. So it's actually not a single atomic operation, and this game walks you through understanding and expressing that. I thought, wow, I wish I'd had this before. I'm going to go do some more of it afterwards, because now that I've actually sat down and tried it, I'm kind of interested to see how far I can get and how hard it becomes. It seems a little like those games we've recommended before, the video games where you do programming tasks. What was the most recent one... Human Resource Machine was one. Yeah, and there are others where you're doing these sorts of programming tasks, and they seem really easy at first, but
Starting point is 00:14:43 then it's like, oh man, this is just like my day job, only with a very weird programming language. Yeah, that's the only game where I've actually gotten all the achievements. Oh, really? Yeah, the last achievement is so hard. There's one where you have to build a sorter, but the rules are so tight, I think I had to unroll a for loop. Or even, it's worse than that: you have to unroll part of a for loop.
Starting point is 00:15:10 But it was a lot of fun. I highly recommend that game, Human Resource Machine. I feel like we talked about it once before, but it's always worth talking about again. Yeah, definitely, we've mentioned it. I'm not an achievement kind of guy, so I don't think I've ever done that. Me too; it's the only time I've even paid attention to the achievements, let alone got all of them. But
Starting point is 00:15:29 it was a very, very satisfying game. But anyway, check this out. I haven't said what it is: it's called Deadlock Empire, and my guess is it's probably searchable by that term. It's on GitHub, and yep, if you search Deadlock Empire, it'll show up. And then, of course, it'll be in the show notes. The game is actually hosted on GitHub. Is this like JavaScript or something? Or do you download an app? No, there's no download.
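The counter++ race those puzzles revolve around is easy to play with outside the game. Here's a minimal sketch in Python rather than C++ (Python's `counter += 1` decomposes into the same load, add, store steps described above), showing the mutex fix the puzzles teach; the dict-based counter is just an illustration.

```python
import threading

def worker(counter, n, lock):
    # counter["value"] += 1 is really three steps: load the value,
    # add one, and store it back. Without the lock, two threads can
    # both load the same value, and one increment is silently lost.
    for _ in range(n):
        with lock:
            counter["value"] += 1

counter = {"value": 0}
lock = threading.Lock()
threads = [threading.Thread(target=worker, args=(counter, 100_000, lock))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter["value"])  # 400000 every time with the lock held
```

Remove the `with lock:` and the final count can come up short, which is exactly the interleaving the game has you construct by hand.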
Starting point is 00:15:56 It's just a web thing. Yeah, sorry, poor explanation. Yeah, that makes sense. When I detoured to Human Resource Machine, I definitely lost the context there. Yeah, so it's hosted on GitHub; it's a series of web pages. I don't know the exact technological magic they did to make that. Yeah, probably some JavaScript engine or something. All right, my news is: the godfathers of AI win Turing Awards. This is pretty cool. A little bit of backstory here. There was a time, believe it or not, when
Starting point is 00:16:35 neural networks were really, really shunned. Neural networks actually went through what we call the AI winter, and there were actually two AI winters. Basically, people really hyped them up, and then they just failed to deliver, twice. Both times there were some pretty serious limitations, and it gets technical pretty quickly. But basically, people who had really good reasoning said: we need to be doing everything with neural nets. Image processing can be done with neural nets,
Starting point is 00:17:15 translation can be done with neural nets, et cetera, et cetera. And I was actually one of those people. I came into it much later, but my advisor, I guess I should say, was really one of those people, saying everything can be done with neural nets, they're so powerful. But after years and years of them not delivering, people just gave up, which is very natural. But not everyone gave up. And so Yoshua Bengio, Yann LeCun, and Geoffrey Hinton are the three people, although there are others, who continued to advance the state of the art in neural nets, and they finally got that technology off the ground. And part of that was a rebranding.
Starting point is 00:17:56 They switched from saying neural nets to deep learning, which really just served as a way for people to try them again and not be bogged down with all the baggage of the failures of the past. But under the hood there were also some real technological advances and things like that. And so these folks won a Turing Award. For people who don't know, the Turing Award is the most prestigious computer science award; it's basically the Nobel Prize of computer science.
Starting point is 00:18:24 It's very well deserved, and honestly, I never thought I would see something like that. I think it's amazing how far the technology has come. It's really cool, congratulations. Yeah, totally. Breaking news: I just saw, clicking through this article, that Valve just announced they're making their own VR headset, the Valve Index. Oh man, I'm not surprised, because they have SteamVR already. It won't be breaking news by the time the episode comes out, but it's breaking news right now for the people who are watching the show live. So by the way, if you're on our Discord, I make a little announcement
Starting point is 00:19:07 before we go live. If you're on Discord, it sends a little push notification to your phone. Obviously there's a bit of serendipity there; you might be asleep, or you might not be available to jump on Discord at that moment. But some people are, and if you can watch the show live, then you can find out about this stuff right as it happens. So my next article, news, whatever it is, is a little device I came across reading some Reddit threads, called the ODROID-GO. The ODROID-GO, okay. I think we've talked about using a Raspberry Pi for doing emulators before, and I actually came across it in that context: hey, there's this thing that looks vaguely like a Game Boy, a nondescript pocket console. Oh yeah, I see it
Starting point is 00:20:02 now. You build it up from a kit; I think it's like 40 or 50 US dollars, and it looked like there were some on eBay. Like with all these things, it's sometimes a little hard to find out exactly how you would acquire one. And it has ODROID in the name, which, I'm not exactly sure, yeah, I don't know what that is either. It has a sort of Android-looking logo, but I don't think it has anything to do with Android. What the GO does allow you to do is run as an emulator, and I guess that's cool, having played with that, where it's legal. But the thing that caught my eye is that the chip on it is one I had come across before: the ESP32, a dual-core 32-bit processor that's famous for being a really cheap Wi-Fi device.
Starting point is 00:20:50 And I don't know if they have the Wi-Fi ability enabled in here; I assume they do. But what they have for it is Arduino IDE support. So you're able to use the Arduino IDE and the Arduino libraries, which act as kind of a wrapper around a lot of things that are pretty difficult to do normally with embedded programming. And what got me interested was that with this hardware you get a screen, plus example code for all the wiring that usually takes so long to figure out how to hook together. This becomes, if you've heard
Starting point is 00:21:23 Jason and me on many episodes in the past, we've talked about how we got into programming, and a lot of it, and this is common with a lot of people I work with, involves programming on a calculator, or programming in QBasic or some form of BASIC, where there was a pixel, and making some form of graphical, interactive, you want to call it a game; maybe it was a game, maybe just playing around. With cool procedural graphics simulations, yeah, exactly. And here's a device like that. But I don't think people can do that anymore, because you can't really do that from your phone; hooking up your phone and getting the whole
Starting point is 00:22:00 Android development environment, or Xcode, is actually a very large task, even to do relatively simple things. But here, if I had to guess, this seems pretty close to how I felt about developing little interactive games on my calculator, where there are just really simple, low-level ways of writing out text.
Starting point is 00:22:24 And you have just the sort of Game Boy-style buttons. And the good thing here, it looks like (I don't have one myself yet; I'm thinking about getting one to try it out) is that there are enough examples. For instance, they've implemented Flappy Bird in the Arduino environment, and that feels like something you could go in and modify, and sort of
Starting point is 00:22:45 have enough of a start there that you could get going without worrying about the tons and tons of overhead setup. If you have a cell phone, I think you can do that too. For me, though, this is one of those things: I'm not an artist, but you hear artists talk about how sometimes creativity comes from restrictions, from what you don't have. The more freedom you have, it almost becomes more difficult. It's a focus; every restriction is some type of focus. Yeah. And so I feel this is a restrictive enough environment, whereas on a phone you might be tempted to try to do something insane with multiplayer gaming. And here
Starting point is 00:23:18 it's like, look, that's ridiculous, you're not going to do that. And so by that, you're limiting what you are able to do, and I think it would be a pretty cool thing, because I know for me, looking back at how I got started, and for other people too, having access to something like this would have been amazing. Yeah, totally. So I looked it up, and the ODROID is, it looks like, a competitor to the Raspberry Pi. Okay. So it's basically running an ARM processor; it looks like it supports Arduino and all of that, but it's way heavier weight than an Arduino. It's like 4 gigs of RAM.
Starting point is 00:23:52 It's basically the specs of a Raspberry Pi, with an ARM... actually the Raspberry Pi is ARM too, yeah; an ARM instead of x86. Also, the ODROID looks like it has a much better GPU than the Raspberry Pi. But, so, running Linux on it and doing something,
Starting point is 00:24:10 yeah, maybe. But to me, this ODROID-GO one, they're showing examples here in MicroPython where you can blink the LEDs and put text onto the screen that's included and bundled with it. Here's a way to have an access point server, where you can basically go to the device and toggle an LED on the device from your computer.
Starting point is 00:24:31 I mean, this seems like pretty cool stuff. Yeah, totally. So pretty easy way to get started. That is awesome. Yeah, it's not too expensive either. The base model is about, it looks like $49. So it's not too bad. You know, I mean, worst case,
Starting point is 00:24:44 if you're tinkering out there and you fry it, you're not losing hundreds and hundreds of dollars. So, all right, my news is... this actually blew my mind. And I feel like I know a lot about terminal programming; I wrote Eternal Terminal. It's not that I think I know everything, but I know a lot about terminals and how they work, and this completely blew my mind. So there's something called sixel, S-I-X-E-L. Those are escape codes you can emit to draw images on the terminal. So right now, on your terminal, you could just draw a picture of a cat.
Starting point is 00:25:23 That blew my mind. Just the idea of being able to draw a picture. It actually shocks me that more console apps don't do that. Why don't console apps just draw pictures instead of doing ASCII art, right? It is very cumbersome and kind of weird looking, so that's probably part of it.
Starting point is 00:25:47 By weird looking, I mean the API. But this GitHub repo is pretty cool: it's gnuplot through sixel. Gnuplot is an open-source program for generating plots; you can plot line graphs, bar graphs, pie charts, et cetera. And now you can literally just run this command and you get a bar chart in your terminal. Not a bunch of pipe characters that look like a bar chart; you get a picture of a bar chart, which is really cool. I haven't tried it yet; I'm probably going to try it tonight. But if it works, that's pretty amazing.
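For the curious, those "black magic characters" have a simple shape: a sixel image is a DEC escape sequence that starts with DCS (`ESC P ... q`), packs six vertical pixels into each printable character (offset from `?`), and ends with ST (`ESC \`). Here's a rough sketch that builds a small purple square as a raw string; the palette and geometry handling are heavily simplified, and whether anything actually renders depends on your terminal (many emulators don't support sixel at all).

```python
def sixel_square(size=6):
    """Build a sixel escape sequence for a small purple square."""
    ESC = "\x1b"
    seq = ESC + "Pq"            # DCS ... q: enter sixel mode
    seq += "#0;2;50;0;50"       # define palette slot 0 as purple (RGB, 0-100%)
    seq += "#0"                 # select color 0 for what follows
    rows, rem = divmod(size, 6)
    for _ in range(rows):
        # chr(0x3F + 0b111111) == "~": one column of six lit pixels
        seq += chr(0x3F + 0b111111) * size
        seq += "-"              # "-": move down to the next sixel row
    if rem:
        # A partial row: light only the top `rem` of the six pixels.
        seq += chr(0x3F + (1 << rem) - 1) * size
    seq += ESC + "\\"           # ST: terminate the sequence
    return seq

print(repr(sixel_square()))
```

Print the string without `repr` in a sixel-capable terminal (mlterm, or an xterm built and launched with sixel support) and you should get the purple dot from the conversation.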
Starting point is 00:26:26 Yeah, I'm wanting to try this. Isn't that mind-boggling? How did we not know this? You just have to type a bunch of weird black-magic characters. And I'm sure not every terminal supports it, but I would bet most of the common terminals do, because this technology, as foreign as it sounds, I'm sure has been around forever. You just type
Starting point is 00:26:51 these characters in a certain order, and you end up with, like, a purple dot right on the terminal. It's pretty wild. Cool. It's time for book of the show. Book of the show! My book of the show is Influence: The Psychology of Persuasion. Someone actually recommended a different book about persuasion, which I just finished reading. That book was okay, but even there, the author suggested getting this book, and this book seems to be the gold standard of understanding persuasion. My guess is this one's going to be a lot more in depth; it's definitely
Starting point is 00:27:37 longer than the book I read. So I figured I'd go straight to recommending the good thing; why would I make everyone else have to read two books? I found the first book I read really, really fascinating. It's also kind of weird, persuasion in general. Well, one thing is, the listeners, you folks at home, don't have to worry: I'm not persuasive at all.
Starting point is 00:28:05 Reading this book made me realize I have zero persuasion skills; I'm not actually subconsciously controlling anything. I'm sure everyone is surprised to hear that. But I realized it's a profound skill. And it's also kind of weird to think about. Let's use a car salesman, for example: this person doesn't want to buy a car, and maybe they shouldn't buy a car, but you could talk to them and convince them to buy a car, and now they have a car that maybe they didn't need. It's just kind of weird. So I think it's one of these things where, like anything, you have to use your code of ethics and use it sparingly.
Starting point is 00:28:47 It's kind of like a weapon that I didn't really realize people had. But the book talks about how to speak persuasively, how to convince people, and covers different scenarios. There are scenarios where somebody is completely against your position, and how do you start with some common ground and build that up. There are scenarios where somebody basically wants something, but maybe not enough; maybe the person wants to buy a car, and you're a car salesman, and you need to get them across the finish line. A whole bunch of different scenarios. Some of them didn't really apply to me.
Starting point is 00:29:26 There are dating scenarios, and those days are over for me; things like selling that probably will never happen. But there's also how to give persuasive presentations. They also go over a lot of historical anecdotes. One thing in particular was this idea that there are some stories that get told and told and told, like Aesop's fables, or the Bible; a lot of these texts have a lot of the same stories in them, right? And so there are some stories that either everyone knows, or enough people know, and it's kind of like
Starting point is 00:30:15 the idea behind the story has spread, and so it's sort of a subconscious thing; there are people who don't even know they know it. But if you have these certain undertones when you speak and when you write, you can tap into some of these motifs. The analogy was, it's kind of like riding a current or riding a wave. You could paddle by yourself, or you could ride this wave of a narrative that's been going for thousands of years and use that momentum. I thought the whole thing was super interesting. Super shout out to, I don't know if I should mention his name, but a co-worker of mine who recommended that I read these books on persuasion.
Starting point is 00:31:03 He had read them, and he thought they were really good. Well, that's why you ended up reading them: he had already read them, and he persuaded you. Oh man, he persuaded me. Oh, I feel so cheated. I wonder if he got a cut or something. Did you click on a link he sent you in chat? This was in person.
Starting point is 00:31:19 But in person, he persuaded me, and I'm pretty sure I bought that book either right while we were standing there or the next day. That's how good he is. That's how good this book is. So Jason has persuaded you to also go read this book. Let him know. That's right. So if I persuaded you, you can go to Audible. We'll plug it right now: audibletrial.com/programmingthrowdown. And pick up a copy of this book for free. Well, same for my book.
Starting point is 00:31:49 I will attempt to persuade you that if you would like to not learn about persuasion, but instead learn about a world where magic and guns mix, my book is Promise of Blood, book one of the Powder Mage trilogy. Which is a trilogy, I confirmed, because I've messed this up before. The trilogy is complete and out, so I also checked on that. So I'm really trying to get my ducks in a row, because I've not always done so well at knowing how many books are in the various series that I'm reading.
Starting point is 00:32:27 I mean, it's hard. You might not even know every time. And this book, I am listening to it. Oh, I did finish listening to it. It's 19 hours long. And like I said, this is a book where there are actually several kinds of magic. They're not called wizards; there are several different whole forms of magic, and different people have access to kind of one
Starting point is 00:32:50 of the forms of magic, and they're able to do various things. And some books jump in and try to explain everything that's happened in the world and get into it; this one sort of just jumps into the middle of things. And for a while, you're kind of just left trying to figure out what the various things they're talking about are. But I thought it was an interesting departure. I haven't read a book like that in a while. The other one I've recommended, the Lightbringer series, also has guns in it.
Starting point is 00:33:21 But it's kind of an unusual thing. You don't often see sort of both magic and guns and so so is it is it is it just guns or is it is it is it more broad like there's there's there's you know is it like steampunk plus magic kind of yeah or is it just like total medieval fantasy no no no definitely kind of like i would say yeah like steampunk magic that's a good call yeah okay so there are some references to steam engines but i people don't go they're still riding around in carriages so it's not like medieval per se yeah right and this one is interesting because not only does it kind of
Starting point is 00:33:56 have just guns in it but actually gunpowder itself is uh used in sort of kind some some of the kinds of magic like it's a kind of magical substance. But as far as I hear it, they just... It reminds me of Full Metal Alchemist. Oh, I have not... Oh, maybe it is. Maybe it's like, maybe they just lifted that as an idea. I don't know. I don't know what Full Metal Alchemist is. I'm sure that
Starting point is 00:34:17 that trope has been around. But yeah, that's cool. But yeah, I thought it was pretty interesting. Again, I don't want to spoil so much of the plot. But it wasn't... I personally didn't find this an especially deep read or a particularly uh this sounds really dumb a realistic magic system like sometimes when no i mean one that like fits newtonian law in a sense like you shoot ice and and then ice could turn into water or something oh sure yeah that'd be a good example. Yeah, like, oh, I make magic ice, and therefore it doesn't melt.
Starting point is 00:34:46 And it's like, oh, come on now. Yeah. But not referencing any Disney movies. Yeah, I guess, like you said, it kind of has what seems to be a little bit plausible. But yeah, I found this one to be good and interesting. But to me, some magic systems are explored in the sort of socio-economic impact that it would have like oh if you had this skill
Starting point is 00:35:11 you know you would enter into this kind of trade or you know you would be desired because you'd be really good at these kinds of tasks so like in a world where there were magicians who could remove friction between two objects like you would never use oils or bearings if that magic was common because you would just have magicians decrease the friction of the thing you need um yeah i'm not that's not in this i'm just making up a imagine example it's like oh okay this is sort of a i would consider that kind of like a realistic or a you know kind of hard magic but where you're trying to kind of really think through the implications of everything, not just have like a good story where there's magic.
Starting point is 00:35:49 Yeah, it's kind of like in, you know, the Final Fantasy video games, it is always interesting how, you know, everyone has access to a life spell, but then the major characters will die as part of the plot. And it's kind of like, you really have to suspend disbelief to absorb that. Oh, I never thought about that. Wow, you're right. Yeah, it's just like, why don't you just cast Life? They can die in battle, but then you could bring them back.
Starting point is 00:36:13 But then they could die outside of a battle, and you can't bring them back. Yeah, there was actually one Final Fantasy game. I think it was Final Fantasy III where somebody dies, and the person casts life and then life 2, life 3 and re-raise and none of them work and I thought okay well that's at least an homage to this. Cool, alright
Starting point is 00:36:34 if we didn't persuade you to buy our books you can still help us out by joining our Patreon and being a patron you get a super fast RSS feed. The downloads are much better. And also we, you know, kind of take that money and we figure out ways to get more people to listen to the show, you know, which helps them out. I actually talked to there's a person who
Starting point is 00:36:59 just happened to be in the Bay Area and they reached out to me and they said, hey, you know, did you want to meet up? I said, sure. I said, you know, so we ended up meeting up and I had a really great conversation. And it was it was actually just really touching that, you know, this person was not a CS major or anything like that. Completely different discipline, still in the sciences, but, um, you know, a totally different science. And, um, you know, they, they're listening, you know, they were kind of, um, what's the word they're dabbling in different fields or doing contract work and things like that. Um, but after they listened to our podcast, they realized that, that really what they want to do is, is do is programming. And so they kind of taught themselves.
Starting point is 00:37:49 I don't know exactly. I should have asked what sort of, you know, lessons they took and things like that. But they taught themselves from scratch. And now they're working at a tech company. And they, you know, said a lot of it was because of our podcast, which is awesome. So we're going to try and reach as many people as possible. And that Patreon patronage helps us do that. So I appreciate it. All right.
Starting point is 00:38:17 Well, moving on to tool of the show. Tool of the show. I have two tools of the show. Hey, that's cheating. These are pretty geeky um one is called pulp um and the other is called pi b and b so pulp is uh actually i don't know what the u is but it's python uh linear programming and pi b and b is python branch and bound and so i'll explain what these are it's they're pretty cool. So there's a lot of problems
Starting point is 00:38:48 where it's kind of like solving a maze, right? So you start at the beginning of the maze and you just kind of, there's not like a formula to solve a maze. You just have to kind of try it, right? So you have to just, and you can imagine even if you do this in one of those placemats
Starting point is 00:39:04 they give you at a restaurant or something something you're just kind of drawing the maze that you kind of get stuck you backtrack you get backtracked again it's like okay now i found my way out and there's just a lot of things in computer science that have that phenomena where you're kind of searching right so imagine you're doing, you know, a common one is the traveling salesman problem where you have to, you know, kind of visit different nodes and things like that. I won't get into too many details, but, you know, Sudoku.
Starting point is 00:39:34 Sudoku is one of these problems where you want to put in these numbers and at some point you get stuck. You know, you put in, let's say, two ones in the same row and it's like, that's not allowed. Or you're in a position where you can't solve it without putting two ones in the same row. And so you have to back up, erase some of your numbers, and start over, right? There's many things that fall into this category and usually they have one of two forms. They
Starting point is 00:40:01 either have a linear form, which means you can represent it as a linear equation. So I'll give you an example. Let's say you could have horses and you could have chickens. And the horses you can sell them for $20. The chickens you can sell for $15. But there's other limitations. Maybe you can only have four horses, so on and so forth. So you could describe this as a set of linear equations. So you can say, you know, X is my horse, number of horses, Y is my number of chickens. And you can say, okay, X has to be less than four and X times 20 plus Y times 15 is how much money I'm going to make, right? And you can plug that into a linear solver and it will say, well, it will say infinite
Starting point is 00:40:47 chickens. So you have to have the limit on the chickens. Let's say chickens less than 10 or something. And it would say, oh, well, you know, let's get as many horses as we can and then start filling in chickens. So you could also say the size of the animals and add those together, right? And so as long as you can keep everything sort of linear, like a set of just these linear equations,
Starting point is 00:41:08 then you can use Pulp very, very fast. And so you can represent so many different problems using linear programming. You can actually represent Sudoku using linear programming. So it involves some tricks, and it's cool to read about that. But then there are problems like navigating a maze where it just doesn't make sense. Like you can't really represent that using, you know, an equation. And so for that, you need branch and bound. And so what branch and bound does is it searches, but it does what's called a best first search. So the idea is at every point you
Starting point is 00:41:48 say, okay, I don't know if this, you know, is going to take me to the goal or not. Like I don't have a guarantee of that, but I have a hunch. So maybe, you know, if there's two directions you could go and one direction is heading towards the exit and the other directions you're heading away from the exit, you might say, well, I'm going to head towards the exit because that just generally seems like a good thing to do, right? And that's branch and bound. And so the trick here is if you can get, you can get the linear programming solver and the branch of bound solver really, really, really fast, then you can kind of change the game a bit. Like you can just say, okay, I have this problem.
Starting point is 00:42:35 If I can represent it in this way, then I can take advantage of this solver this person spent so much time on. And that's basically how these work. So there's people who their whole job, their entire job is to figure out how to take some problem and turn it into a linear programming problem. So I was talking with a friend of mine
Starting point is 00:42:55 who works at Uber and they actually model the decision to match drivers to riders using a linear program. So I don't think they use Pulp exactly, but they've been able to take that problem and represent it using LP. So that's just one example. But these are really, really good technologies to learn. They're definitely on the more advanced side,
Starting point is 00:43:24 but these two libraries in particular are in python which makes them pretty accessible and they have good tutorials and things like that mine is not nearly so geeky i'm on theme i'm on theme though mine is uh an ios app i'm sure there are probably equivalent kinds of apps for android but this one on ios is really nice i've used this one personally and it's called halide and what this is is an alternative to the built-in um application the camera application on the iphones and oh okay the built-in uh camera application is really good i mean i like it the iphones takes good pictures i think almost all like i used to call like top of the line phones take really good pictures now um but there's a fair amount of i don't know so like guesswork that they do when you take a picture
Starting point is 00:44:14 they may you have to make a lot of assumptions right because there's a lot of trade-offs and uh and taking the pictures and you don't want to bombard you want to whip out your phone and be taking the picture within a like second or two and if you had to configure 13 different parameters well that's not really good um yeah and so the default apps you know that that's sort of their prioritization 100 makes sense um but it turns out like the sensors have been getting better and better and so there's actually a lot of whatever you want to call latitude that you can do. And this highlight application is one you can bring up instead of the normal one where you are able to take raw images from the camera. So the camera API support giving this sort of full bit depth image, at least on iPhone.
Starting point is 00:45:07 I'm sure something, again, exists on Android. But on iPhone, it allows you to take the full raw images. And when you take them and you look at the picture that's been taken using the Halide app, it looks kind of bland. It doesn't look like the colors don't really pop. It doesn't look that good. But it's okay. I'll spend a little bit talking about this because it's actually on topic um that when the image is when the image is taken that uh a normal jpeg is uh has what we call like a bit depth of eight bits which is there are at
Starting point is 00:45:38 each location in the image there's a red green and blue value and that can be 0 to 255 8 bits for each of the three but in reality the image sensors can sense more color than that they can often sense 10 bits which is four times more colors than that than 8 bits or sometimes even 12 bits which would be 16 times more than than the 8 bits and the problem though is the sort of common image formats people's displays. Those are all almost universally 8 bits and Or sometimes you hear called 24 bit color, but that's not 24 bits per color. It's 24 bits total 8 red 8 green 8 blue and then sometimes in computer graphics you'll hear 8 bits for alpha as well so to make a 32-bit number rgba um and so these rgb if you hear 24-bit color which is what most screens kind of do that's eight bits per color channel and so this one can do more
Starting point is 00:46:38 but it can't display all those 12 bits at once so you have to if you effectively think about like a number line the number line is really big you either have to sort of normalize it down where you get what you want to call like quantization errors where two if you're only one bit apart in the original image you just collapse to the same number right so you collapse the entire range down to a smaller range but you can also do which is if you've ever seen this sort of high dynamic range imaging kind of stuff, you can also bring down various parts like, oh, they're really, really bright pixels, I want to bring those down. But I want to leave all the rest of the image the same. And so if you imagine taking a picture like of a forest and a sunset, the forest
Starting point is 00:47:21 would probably be in like shadow, but the sunset is going to be very bright so what you could do is say oh i want to raise up the dark pixels and then move the whole image to just be more in the brighter range um and it allows you some more it makes some more flexibility there because you're preserving the whole bit depth and when you take your um when you use the default camera app it's making those decisions for you It's trying to do and they're really smart at it now. So they'll do things like actually take multiple images and stitch them together, use some machine learning that they've done to kind of like figure out what are the dark parts of the scene, the light parts, how best to combine them for an aesthetically pleasing
Starting point is 00:47:58 image and to make kind of beautiful colors. But it ends up being kind of your pictures look like everyone else who's taking a picture of the same scene because everyone's roughly doing the same thing instead if you're like oh i really care a lot about the clouds in this picture or i really care a lot about the textures of the trees and i don't care about the clouds you want to make your kind of own decision then having an app like this highlight one and this raw imagery allows you to if you ever kind of played with the sliders and editing the photos and you notice like, oh, if I make it too bright, it looks really bad.
Starting point is 00:48:30 I make it too dark. You get the graininess to the photo very quickly. This allows you to kind of have a little more room to the slider before you start getting that really nasty graininess pop in. But it's a more manual effort. But if you ever use, so like I'm not very good at it, but I do have a digital SLR camera, I kind of know how to shoot it in manual mode. I understand
Starting point is 00:48:54 the complexities, I know how to edit it in sort of Lightroom on my computer. This brings back to me without that full setup, some amount of the pleasure that comes from that flow of being able to make very artistic or more realistic to kind of how i remember a scene pictures rather than just sort of the standard flat image because if you ever take a picture of a sunset on your camera you're like it's beautiful let me take a picture and then you look at later like oh this is like some boring yeah this doesn't look right right that's because your eye doesn't perceive it the same way but you can sort of bring that back um and you're trying to make it look like you remember it um and so i do you think that is there a lot of variance there but among people or do you think that i'm sure they're probably eventually the the camera will just will just do everything automatically i mean
Starting point is 00:49:41 i think it's hard from the camera's perspective because without them asking you, which is, so to be fair, this application is a little harder to use because you can do things like I want to set where my focus is versus if you bring the camera out and just take a picture, it's going to try to get as much of the scene in focus as possible because it doesn't know any better. It doesn't know what you're trying to take a picture of that bear in the distance, the person in the front, the clouds of a beautiful sunset. Like it doesn't know what you're trying to take a picture of that bear in the distance, the person in the front, the clouds of a beautiful sunset like it doesn't know. So that's why you'll see things like they'll now put boxes around people's face under the assumption that if a person's in your image and close, you're probably wanting them in focus. Yeah, that makes sense. And so you'll make they'll make all these hacks.
Starting point is 00:50:23 And so I don't think they'll ever converge because it requires way too much iterative back and forth process, right? But this highlight app, it sort of re-exposes a lot of those knobs to you if you want to fiddle with them. So it's a much slower picture-taking experience. But I also sometimes shoot like old-school film. And that's an even much slower process with a very long feedback cycle before you sort of can develop the images and see what looks like you literally have a dark room and you you do
Starting point is 00:50:51 yeah yeah yeah wow that's amazing yeah i never did it growing up i just like it part to me it's just the chemistry of it is really cool uh i guess i'll have to show this one point yeah i actually did one step further where i did um like silver chloride raw chemicals to make my own sort of uh paper that was photosensitive and like exposed into yeah yeah just because i was like i want to know how this works that is unbelievable but i'm by no means good at it or sort of consider myself an artist but no but it's amazing it's sort of fun do that it's fun to me like i don't know yeah yeah i mean you just built something uh that everyone else is just dependent on yeah so i like know how
Starting point is 00:51:30 to make pictures the way they did like a hundred years ago for you know if the whole world has armageddon or whatever happens and everything goes yeah away and like if we go into nuclear winter i'm going to your house rebuild the world from scratch i guess i'll bring the photography equipment it'll be great yeah i'll bring the dog food you bring the photography uh photography equipment will be all set so um but yeah anyway so how light how light is the name of the app uh check it out you you would want to pair it with you know some ability to edit these files it's really good at taking them um but you need something which i can't i can only share one tool to show unlike jason who cheats um there are many but you know you also want one that does good editing yep that makes sense like but you wouldn't edit on the phone would you uh i do yeah oh wow yeah i mean you could export onto computer but look that to me is a different workflow if i'm
Starting point is 00:52:20 on my computer i probably might as well just have taken it with the SLR. Yeah, that's a good point. Because the image chip is so much better. Okay, let's on to image processing. Processing. I could get started here. So, you know, I think one thing a lot of people don't realize is just physically what's kind of happening, you know, when you take an image, right? So, you know, I mean, everyone knows about, you know, I'm sure everyone knows about photons. You have the sun, it's a big ball of light. It's sending out tons of photon energy. Those photons are bouncing, you know,
Starting point is 00:52:58 all over the place. You can have, you know, light bulbs doing the same thing. And some of these photons bounce and hit you in the eyeball. And so what happens is you have these photoreceptors that, let's say, integrate or accumulate these photons. And so if more of the photons are hitting these receptors at a faster rate, then they are just more excited, right? And so that ends up turning into some electrical charge which then your brain kind of post processes and turns into an image right so a camera doesn't actually work that differently the way the the camera works is photons are also coming in and hitting the camera um uh photoreceptors and then there's these uh kind of like wells think of it as like little u's
Starting point is 00:53:48 and and photons are coming in from the top and they're hitting this u and because it's kind of a u shape it's sort of trapping the photons almost like a crab uh crab trap or something these photos are getting kind of trapped in this well and and statistically it's just very difficult for them to be able to bounce out. And so then what you can do is you can measure basically how many photons are trapped there. So that's at a really high level, you know, what's kind of going on. So now you have sort of the, I guess, it's probably even more raw than the raw format in the iPhone it was extremely raw just this is like the amount of energy that I've captured by with this photo receptor and then typically what you're going to do is you're going to you know kind of turn that
Starting point is 00:54:36 into you know a more simple kind of integer you know discretized. So you're going to, you know, take these, you know, quantities of energy, this 2D field of energy, and you're going to say, okay, this spot right here is relatively really bright, or you maybe have some absolute point of reference. You say, okay, I want that to be, you know, 1.0, or let's so say 255 if you're doing You know 8-bit and then say okay this point right here. There's very little energy So it's gonna be my zero everything else gonna be somewhere in between and boom now. I have some image That's so now so now that's that's sort of the raw image But there's you know a lot of regularity in that I mean mean, there might be, let's say, a wall, which is completely white.
Starting point is 00:55:33 So if we have to store 255, 255, 255, 255 next to each other, that's going to take up a lot of space, but there's not really a lot of information there. I mean, imagine having a huge file on your computer that's just all zeros because you haven't filled any of it in yet I mean it's kind of a waste so there's a whole bunch of different compression techniques they take into account you know it could be some of them are you know draw inspiration from the way files are compressed and things like that but they also you take into account the fact that your eye you know is limited in how much it can perceive
Starting point is 00:56:06 and what sort of irregularities are really important. And so they'll actually use understanding of the human eye and the human brain to make compression very specific for images. That won't keep all of the values exactly intact, but will keep most of them and we'll make the images way, way, way smaller. And you can imagine doing something rather similar for video. So that's basically the high level of what's going on in image processing. So do you want to jump into how to actually process these images?
Starting point is 00:56:42 Yeah. So I think we're going to go kind of fast because I think this is a broad range of topic. We might come back and revisit some of these. Yeah, if people are interested, let us know. If there's something that you want more depth on, just shoot us an email or ping us on Discord or something. Yeah, I'm pretty sure Jason just glossed over several textbooks worth of... Yeah.
Starting point is 00:57:07 So once you have you know some in some digital form you have the data that's been captured from your your photo sensor or you have a file that's been stored and you've recovered it you've loaded it back up again um you get into this sort of world of image processing there we were talking about you know at the in my tool of the show we were talking about sort of raw image versus already having it in eight bits what is that process i mean all that is part of image processing how do you do white balance so it turns out you know our eyes see white that if you measure it in a sort of strictly photon color energy sense is not white the same white in every case um and so but cameras are more sensitive to that and so they'll have a tint where certain scenes will be green or yellow um and how would you adjust it to make it look like i was sort of
Starting point is 00:57:59 expressing before um one of the things is like you want it to look like the person thinks it looked like so if you just showed hey here's the scientifically measured thing the camera gave people are gonna go oh there's a really crappy picture like why is my picture outside you know really red well there was a lot of ir content from the sun people are going to be upset right and be like well but it wasn't red and so there's all these techniques for like Jason was pointing out, just like with compression for even just other things where you want to adjust how a picture aesthetically looks so that it's what people sort of perceive, not just what is scientifically measured. And so image processing, we're going to talk about other things, but in part is about understanding how to reconcile the real world to the what you call that. Is that physiology? The sort of like how human brains perceive things um yeah yeah totally and so that's one thing another thing and and all these things intertwine and use each other but at a high level you get this sort of where you want to do filtering on an image so people will talk about
Starting point is 00:59:01 i need to sharpen an image or um you can give me things like, I wanna find all the edges in the image. So Jason talked about for compression. Maybe you wanna say, hey, I wanna preserve the edges, but in areas where there's no edges, maybe I just like make it all a single color. Not a great idea, but an idea. And so you wanna find, you know, edges in an image, or you wanna, what if you wanted to blur the background
Starting point is 00:59:23 of an image, but not the foreground? These are common operations people sort of think about doing and some of them have higher level goals of why you would do those filtering techniques but there's a couple ways of sort of accomplishing them one of them is applying you could call them sort of kernels they're sort of like a set of values where you want to kind of do an element wise multiplication and then sum it up and a little hard to explain, I guess, with words over a podcast and not spend a ton of time. But roughly you're taking at each physical location on the sensor, you have the values that were recorded there after some processing and you want to go
Starting point is 01:00:05 through each of those locations but as you move if you think about sort of moving along the row moving through columns in a row you're sort of moving left to right in the image which is a representation of left and right in the real world when you move down columns then you're moving down in the real world right you're you're sort of moving in the same spatial sense as the picture itself was and so for instance if i blurred the first 50 percent of rows but not the last 50 percent of rows the top of the scene would be blurred but the bottom wouldn't be um doing those kinds of processing are things that take place in the spatial domain and there's a whole slew of various
Starting point is 01:00:45 techniques for doing things uh like we're talking about sort of like edges exist in the in the spatial they're like you know you can sort of say hey if i have that something was all one color and then it changed to another color that's an edge and i could sort of you know try to detect that spatially but that's not the only way of doing things and in fact when you get to some of these images and cameras like on the the phones now can be you know many megapixels 10 megapixels i don't actually know what the current phones are yeah it's about 10 at least yeah but then high-end slrs can get much higher and scientific application images could get even much much bigger um and so as the megapixel count goes, well, just as the pixel count goes up, every time you're moving in this sort of spatial sense,
Starting point is 01:01:29 you're slowing further and further down. And so you get a lot of GPU involvement in doing this processing. But sometimes you want to do hacks by not moving through every pixel location. And one of the things you can do is switch to the frequency domain. We've talked in the past about the Fourier transform and roughly how that works, I believe. But once you
Starting point is 01:01:51 convert from the spatial domain into the frequency domain, there's a whole different, I don't know if language is the right word, but a whole different technique for how to apply the filters. So now we talk about things like what Jason was saying before: if you have an area that's relatively uniform, you could just make that all a single color. Well, if you reinterpret it, an area with a sort of homogeneity to it, a sort of consistency, would be a lot of low-frequency information. The information isn't changing very rapidly in that area. Versus an area like I talked
Starting point is 01:02:30 about with edge detection: those are areas with high-frequency content, where something is changing very rapidly. So I guess there's a duality there. And sometimes things are easier to do in the frequency domain. So if I go into an image and I did a high-pass filter, that is, I removed a lot of the low-frequency content and just set those frequencies to zero, at certain frequencies you probably wouldn't even really notice that. But it's less information I would need to store, and so I would achieve some form of image compression by reducing the amount I needed to store. But then also, if we revisit, if we go back from the frequency domain into the spatial domain,
Starting point is 01:03:14 which is what I would show on my computer screen after doing that high-pass filter, what you would notice is the image would look different. In this case, areas with slow-changing values would sort of smooth out and just become consistent. You could also do the reverse: you could do a low-pass filter and get rid of high-frequency content, and that roughly works like a blur, where things get averaged out over an area because the values aren't able to change very quickly. They can only change over long distances. Yeah, I think the one thing about the frequency domain that's kind of hard to grasp, and I'll try to explain it over audio as best as I can, is it's really hard
Starting point is 01:04:03 to understand what you actually get when you convert to the frequency domain. So I can talk about, let's say, a cosine transform. Let's say you have an image. What is that at the end of the day? Let's just take a black-and-white image or a grayscale image. So you have basically a lattice or a field of numbers. And in the top right you have, let's say, 255, and that means that in that area of the picture there was something really bright. So you can imagine looking at some picture and breaking that picture down into pixels, and each pixel having some brightness. So that seems pretty intuitive, right?
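The kernel filtering Patrick described earlier operates directly on this spatial lattice of brightness values. Here's a minimal NumPy sketch (not from the episode; the function name and the toy 5x5 image are just for illustration) of sliding a kernel over an image, doing the element-wise multiplication, and summing at each location:

```python
import numpy as np

def apply_kernel(image, kernel):
    """At each location, multiply the kernel element-wise with the
    underlying image patch and sum the result (a 'valid' convolution)."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for r in range(out.shape[0]):        # move down the rows (down in the scene)
        for c in range(out.shape[1]):    # move across the columns (left to right)
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

# A 3x3 box-blur kernel: each output pixel becomes the average
# of the 3x3 neighborhood around it.
blur_kernel = np.ones((3, 3)) / 9.0
img = np.arange(25, dtype=float).reshape(5, 5)   # toy 5x5 "image"
blurred = apply_kernel(img, blur_kernel)
```

The nested loops also show why big images get slow: you visit every pixel location, which is exactly the cost that real libraries push onto the GPU.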
Starting point is 01:04:46 But then what's happening with the frequency domain? You still have an image, so you still have a field of pixels, but those pixels aren't arranged spatially and they don't really represent the same thing, right? In the original, you have this whole geometry of the photo. Right. But you could have a 2D field of anything. So in other words, let's say you watch television and there's different channels on the remote.
Starting point is 01:05:17 So there's channel one, channel two, channel three. And let's say you were to just put a number based on how much you enjoy that channel. Okay, so you really love channel 24, and you really love channels 32 and 58, right? So you can imagine just having this array and having a set of numbers there. And it's kind of a weird thing to think about, but there are some TVs that have almost two dimensions of channels, in a sense. So you can have a channel two and then a 2-2. And so now you have this lattice or
Starting point is 01:05:56 this field where you have channels 1, 2, 3, 4 going across, and then going down you have channels 1-1, 1-2, 1-3, 1-4, almost like an Excel spreadsheet of channels. And at each one, you have some number for how much you like the channel. So pretty much anything can be represented in, let's say, a spreadsheet or matrix or lattice, right? So what's going on with the frequency domain is that at every single location, what you have is a description of a specific wave, right? Because you can describe a wave by, what is it, its amplitude and its wavelength, right? So
Starting point is 01:06:43 you have these two numbers that you can vary. And as you pick different amplitudes and wavelengths, you get different waves, right? And so at every point in this lattice or in this Excel spreadsheet or whatever analogy you want to use, at every point there is one of these waves, right? And so then the question is, okay, you know, how important is this one wave in this image? So if I take all of these waves and I give them different strengths and then I add them all together, I'll actually reconstruct that original image, right? But at any given pixel, every pixel is representing the strength of an entire wave that's making its way from left to right across the whole image.
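This wave picture can be made concrete with a one-dimensional cosine transform, hand-rolled here purely for illustration: each coefficient records the strength of one cosine wave, and summing all the waves back, each scaled by its strength, rebuilds the original row of pixels exactly.

```python
import numpy as np

n = 8
idx = np.arange(n)
row = idx.astype(float) + np.cos(np.pi * idx)   # toy pixel row: slow ramp plus fast wiggle

# The 8 cosine "waves" of the DCT-II basis: basis[k] is a wave that
# oscillates k half-cycles across the row.
basis = np.cos(np.pi * (idx[None, :] + 0.5) * idx[:, None] / n)

# Forward transform: how strongly is each wave present in the row?
coeffs = basis @ row

# Inverse: add every wave back, scaled by its strength, and the
# original pixels reappear.
scale = np.full(n, 2.0 / n)
scale[0] = 1.0 / n
recon = (scale * coeffs) @ basis
```

Each entry of `coeffs` is exactly the "how important is this one wave" number described above: it carries information about the whole row, not about a single spatial location.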
Starting point is 01:07:32 And so that's something that's kind of really hard to understand, but it's really foundational because when you realize that now every pixel has an entire row's worth of information, that's the jumping point from which you can now do some really, really clever things, like operate on a whole row just by changing a pixel. I think that was a pretty decent pass at doing that without illustrations. I tried my best.
Starting point is 01:07:59 So, just trying to speed it up a little so we don't get this too long: that's kind of the various approaches to filtering. You also get, roughly, things which I just call computer vision, which is not just filtering or aesthetic improvements, but where you might use some of those filtering techniques to perform image classification or recognition. I think a lot of people think about this as the stereotypical image processing case: how does a computer know a picture of a cat is a cat? Yeah, that's crazy hard. And there's many, many different ways of doing it, whether you're recognizing something you've seen before or recognizing other things. There's also tracking. So I have some object in an image, and maybe I don't know exactly what it is, but I want to track how it moves over time. That's another task that uses a lot of these
Starting point is 01:08:54 techniques of filtering and, you know, cropping down the image and looking for changes over time. Another broad class that I'll mention here, and maybe it's not the best word for it, but the word that came to mind for me was, oh man, I'm probably not even going to say it correctly, but it's photogrammetry, which is... Photogrammetry? Yeah. So this word, which I had seen written but I don't think I've ever heard anyone in real life use, is about using photos to measure things. So, you know, in the classic crime scene photo, they lay down the ruler next to the shell casing or the puddle of blood and then take a picture of it, so later they can compare the lengths on the ruler to how big the puddle was or how big the shoe print was. So it's just
Starting point is 01:09:46 making measurements off of photos. But what I meant by this, probably butchering the usage, is a lot of things where, say, I strap a camera to the front of my bicycle and I take a picture, then one second later take another picture. If I compare those two pictures, I can see things that have moved, and then I can begin to understand the shape of the world around me. That's called structure from motion, with a single camera. Put two cameras next to each other and then you can begin to try to reconstruct depth. So when you have two cameras next to each other, just like your two eyes, you get an effect called parallax, and based on how much something is experiencing parallax, you have an estimate for how close it is.
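The parallax idea turns into actual distances via the standard pinhole stereo relation, depth = focal length x baseline / disparity. A quick sketch (the rig numbers below are made up for illustration, not from the episode):

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Pinhole stereo: depth = f * B / d. A feature that shifts a lot
    between the two camera views (large disparity) must be close."""
    return focal_px * baseline_m / disparity_px

# Hypothetical rig: 700-pixel focal length, cameras 6 cm apart.
near = depth_from_disparity(700, 0.06, 42.0)   # big shift: close object
far = depth_from_disparity(700, 0.06, 3.0)     # small shift: far object
```

This is the finger experiment in code: the same baseline produces a big shift for near objects and a tiny one for far objects, which is why the stars don't move at all.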
Starting point is 01:10:29 So things closer to you have more parallax. You can try it: hold your finger close, close one eye, then close the other eye, and it moves a lot. Hold it further out and it moves less. Stars in the night sky don't move at all when you close one eye or the other. And so by measuring this parallax, you can get an estimate of how far things are from you in the world. That's a very interesting thing. You'll see these projects where people do automatic stitching of camera photos, where I take a picture
Starting point is 01:10:56 and then I move my camera a little and take another picture, and later they can stitch it together and make a panorama. To me, that's not exactly photogrammetry, but it kind of falls into the same bucket, where you're trying not to identify what something is, but to identify features in an image and then track those features over time or match those features, building up higher-level understanding from lower-level
Starting point is 01:11:21 features. Yeah, totally. I mean, there's just an unbelievable amount of depth that we could go into here, but it would take hours and hours and hours. I do think that there's a lot of super interesting topics, a variety of different methods for doing object detection, object recognition, tracking. We can't cover all of them. If there's something in particular
Starting point is 01:11:45 that people are interested in, we'd be more than happy to cover it. We've both done a decent amount of image processing in our careers, so it would definitely be something that we'd be happy to revisit. But hopefully this gives people an overview of the tips and tricks and what's really happening when you snap that photo. Yeah, like Jason said, I think we could take this in any number of directions, and I actually think we probably will. I know, Jason, you have a lot of experience with the machine learning aspects, and you could probably talk a little bit about how machine learning and image processing interact with each other. Yeah, totally. We could do like a part two.
Starting point is 01:12:25 Yeah, I think that'd be really good. And I do want to mention that the Swiss army knife of machine vision is OpenCV. I don't even know how many years now I've used OpenCV at various points, for one thing or another, but almost anything you would want to do in computer vision, they probably have a function or an implementation of it in OpenCV
Starting point is 01:12:57 for just all sorts of things. It also has abilities to show images and open images, and it's not the only thing for doing that, but it does have many, many advanced algorithms as well as a lot of basic algorithms. And that's available, I know, for both Python and C++, and I'm sure it has bindings for other languages. Yeah, OpenCV, if you're willing to invest the time, is amazing. I mean, it's got everything. It's insanely fast. They have all these specializations for if the image is floating point, for different formats of images,
Starting point is 01:13:26 it handles them in the most efficient way possible, and all of that. Two libraries that are a little more for beginners: there's the Python Imaging Library, which is quite good. You can do some basic things like edge detection, dilation, things like that. There's also SciPy's ndimage, which is really interesting. I don't know why they made this such a focus, but it supports arbitrary-dimensional images, so you can have a 27-dimensional image. The only thing I've ever found that useful for is videos, because you can think of a video as a cube, where it's just a set of images that are stacked up in a third dimension. But if you really are going to be serious about image processing, grabbing a good book on OpenCV or diving into the tutorials is definitely the way to go.
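As a taste of the beginner-friendly route, here's a small sketch using SciPy's ndimage (`uniform_filter` and `sobel` are real functions in that library; the tiny image and its values are just for illustration):

```python
import numpy as np
from scipy import ndimage

# A tiny grayscale "image": dark left half, bright right half,
# with a sharp vertical edge in the middle.
img = np.zeros((6, 6))
img[:, 3:] = 255.0

# Low-pass-style blur: each pixel becomes the average of its 3x3 neighborhood.
blurred = ndimage.uniform_filter(img, size=3)

# Edge detection: the Sobel filter responds where values change quickly,
# so it lights up along the vertical edge and stays zero in flat regions.
edges = ndimage.sobel(img, axis=1)
```

Because ndimage works on plain NumPy arrays of any dimensionality, the exact same calls would run on a 3D video cube, just with a different `axis`.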
Starting point is 01:14:30 OpenCV really is fantastic. Just a random bit of trivia: it was basically productionized at Willow Garage, which is a hardware hacking place that shut down, I guess, about four or five years ago, but was pretty famous for doing all sorts of crazy robotics. All right, well, till next time. Yeah, it's been great. As I said, there was a chance to meet with some fans and things like that, which was awesome. Keep writing in, keep giving us your ideas, your show ideas and other things. We'd love to hear from you. Our next episode, you know, thinking about interview episodes, it's never a guarantee.
Starting point is 01:15:18 You know, it could always fall through, but fingers crossed, it should be fine. We're going to have a really cool interview coming up next month. Really excited to get to it, and we'll catch you all then. Programming Throwdown is distributed under a Creative Commons Attribution-ShareAlike 2.0 license. You're free to share, copy, distribute, and transmit the work, and to remix and adapt the work, but you must provide attribution to Patrick and I and share alike in kind.
