Python Bytes - #243 Django unicorns and multi-region PostgreSQL

Episode Date: July 21, 2021

Topics covered in this episode:
MongoDB 5
Python 3.11: Enhanced error locations in tracebacks
fly.io multi-region PostgreSQL and last mile Redis
django-unicorn
Blue: The somewhat less uncompromisin...g code formatter than black
Organize and Index Your Screenshots (OCR) on macOS
Extras
Joke
See the full show notes for this episode on the website at pythonbytes.fm/243

Transcript
Starting point is 00:00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds. This is episode 243, recorded July 21st, 2021. And I'm Brian Okken. And I'm Michael Kennedy. And I'm Simon Willison. Welcome, Simon. Thanks for agreeing to show up today. No problem at all. I've been looking forward to this. If anybody doesn't know who you are, can we do a quick, who's Simon?
Starting point is 00:00:23 Sure. So yeah, my name's Simon Willison. I've been doing Python bits and pieces for around about 20 years now. So I'm a co-creator of the Django web framework from many, many years ago. I think Django has definitely celebrated its 15th birthday now. But more recently, I've been working on a set of open source tools around this project I have called Datasette, which is a web application for exploring a relational database, a SQLite database, but it also has tools for publishing those databases online, building those databases out of lots of different
Starting point is 00:00:54 sources of data. I'm trying to bootstrap an entire ecosystem of data and analytics tooling around SQLite because it turns out everyone in the world has SQLite, even though they don't necessarily know that they have it. And there's some really cool stuff that you can do with it. Yeah, it's a really cool project. Yeah, it is. If you wanted to create your own personal search engine that would let you just go and say, search your Gmail, your Twitter, your Instagram, and your file system all at once.
Starting point is 00:01:19 Yep. That's pretty much it, right? That's part of the tooling. Yeah, there's a whole side of it, which I've called Dogsheep for ridiculous reasons. But the Dogsheep project is about personal analytics. It's about getting your personal tweets and messages and all of the personal data about yourself into one place. So you've got essentially a little mini data warehouse on your laptop that you can use
Starting point is 00:01:39 to query aspects of your own life. And that's been a really fun way of driving features in the software, which can then be applied to like company databases and so forth as well. Yeah, super cool. Well, if I didn't want to do SQLite, I might want to use Mongo. What do you think?
Starting point is 00:01:55 You may want to. And so there's some big news around MongoDB. MongoDB 5 is out, which, you know, I'm all about MongoDB, which makes me super excited. Probably won't switch right away because I don't actually need the features that are there, but I'm super excited to see things going strong.
Starting point is 00:02:13 So some of the things that are relevant, and I think they're really relevant to Python people, especially the data science side. So basically there's two important things. One has to do with working with time series and the other has to do with stability of the app that you don't want to keep changing so that you can upgrade your database, right? Like if the database API slightly changes, you don't want to have to deal with those incompatibilities until you're ready to take advantage of the benefits of making those changes. So one of the things that it comes with, in the database, is native time series schemas and collection types. That's incredible. Yeah. So you can do really interesting things like a moving average as a query across like data and stored
Starting point is 00:02:58 data in a format that's meant to make that incredibly fast and low latency. But you can also do like, I would like the numerical derivative over time as a moving average, as a query, or the integral of this collection. So you can do like math as part of your query and get it to calculate those things in really interesting ways. So the time series has things like clustered indexes
Starting point is 00:03:21 and window functions and all sorts of interesting things. So that's one. It automatically optimizes your schema for high-efficiency storage, which is pretty cool. That's, I think, independent of the time series, but not 100% sure. The other big thing is the versioned API for future-proof apps. So suppose you build against version, I guess 5 is the one that has it. So you build against version 5 of MongoDB. And then eventually, at some point, version 7 comes along and it's like, oh, you can do this new way of querying, but it's going to break some stuff. So if you want to use it, you've got
Starting point is 00:03:53 to fix your app. You can just say, I want the database to look like version 5 forever. And no matter what version is in production, it'll behave the way you said you wanted it to behave. So you could say, I want version 7 to be like 5 for me, but it can be version 7 for someone else, that kind of thing. Yeah. The other thing: the way that you talk to it, the way that you interact with it, is through just a terminal app. You fire it up, or a command prompt app, and you talk to it. And traditionally this thing has been gross. It's been like, it's fine, but it has zero syntax highlighting. It has zero
Starting point is 00:04:26 autocomplete, those types of things, right? So they're introducing a new shell. So traditionally you would have typed mongo, Enter, connected. Now you type mongosh, because the old one is still there for compatibility reasons. But that one now has syntax highlighting, better error checking, pretty printing, autocomplete, things like that. So if you're going to do stuff on the shell, then you really should just run the new one. That's pretty cool. I'm going to go with Mongosh as the...
Starting point is 00:04:52 Mongosh. Oh, Mongosh, what are you doing? Yeah, I'm running the shell, the new one. I know, that's pretty awesome. And then also they're talking about having serverless instances. So like Lambda-type functions where you don't actually have to manage the database or things like that.
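To make the time series part of the MongoDB discussion concrete: a moving average and a numerical derivative are both windowed computations over timestamped samples. The following is a minimal pure-Python sketch of what such a query computes. It is not MongoDB's API; the function names and the sample temperature readings are invented for illustration.

```python
from datetime import datetime, timedelta


def moving_average(values, window):
    """Trailing moving average: each point averages itself and up to window-1 earlier points."""
    result = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start : i + 1]
        result.append(sum(chunk) / len(chunk))
    return result


def derivative_per_second(timestamps, values):
    """Finite-difference rate of change between consecutive samples."""
    rates = []
    for i in range(1, len(values)):
        elapsed = (timestamps[i] - timestamps[i - 1]).total_seconds()
        rates.append((values[i] - values[i - 1]) / elapsed)
    return rates


# Hypothetical sensor readings, one every 10 seconds.
start = datetime(2021, 7, 21, 12, 0, 0)
timestamps = [start + timedelta(seconds=10 * i) for i in range(5)]
temperatures = [20.0, 22.0, 21.0, 25.0, 24.0]

smoothed = moving_average(temperatures, window=3)
rates = derivative_per_second(timestamps, temperatures)
print(smoothed)
print(rates)
```

The appeal of doing this inside the database is that the windowing runs next to the stored data instead of pulling every sample into the application first.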
Starting point is 00:05:10 So I didn't know a whole lot about it. You can also watch the keynote and actually the whole conference. The keynote is probably most relevant here. Turns out that, for a public billion-dollar company or whatever they're worth, it's incredibly amateurish and more like
Starting point is 00:05:25 a talent fair at a high school or something like that. But whatever, you'll still learn. I mean, you'll see, it's super... I have to check it out now. Yeah, it's worth watching for the blush-worthy, like, oh, come on. Okay, well, let's just move on now, please. But nonetheless, they do demo some interesting things and whatnot. So that's probably enough on that. But if you're into MongoDB, MongoDB 5 has a lot of cool things to talk about there. You know what else is cool and coming up? Python 3.11. We don't even have Python 3.10 yet. So, well, I do. The beta is available for 3.10.
Starting point is 00:06:07 You can run it. But the alpha is around for 3.11, which is neat. Nice. And what I wanted to highlight here was, highlight, was enhanced error locations in tracebacks. I'm so excited about this. This is so cool.
Starting point is 00:06:23 So, I mean, Python's not been that bad for tracebacks. I've dealt with worse tracebacks, but it points out what line is going on. But sometimes there's weird stuff, like None not being dereferenceable or something, and you don't know what's going on. But now in 3.11, it will point to exactly what part of the line has the error
Starting point is 00:06:44 with little carets underneath pointing exactly where it's at. That is actually super cool. So like the example you got on the screen here on the announcement, you've got multiple objects accessing their fields, like point_1.x, point_2.x. And the error is 'NoneType' object has no attribute 'x', which is probably the most common error that you'll ever find in Python. But what I like about it that you're pointing out here is like the second object is the one that is None. And it actually highlights, no, no, not the first one, the second one, because there's nothing about the error message that would tell you which of these two things was the problem. Yeah, that's awesome.
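The ambiguity being described is easy to reproduce. Here is a small sketch, modeled on the kind of example discussed (the `Point` class and function names are illustrative): the line touches two objects, and the error message alone does not say which one was None.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Point:
    x: float
    y: float


def manhattan_distance(point_1: Optional[Point], point_2: Optional[Point]) -> float:
    # Two objects are dereferenced on one line, so a line number alone
    # cannot say which of them was None.
    return abs(point_1.x - point_2.x) + abs(point_1.y - point_2.y)


def failure_message() -> str:
    """Trigger the error with a None second argument and return its message."""
    try:
        manhattan_distance(Point(1, 2), None)
    except AttributeError as exc:
        return str(exc)
    return "no error"


message = failure_message()
print(message)
```

The message names the missing attribute but not the operand. If the exception is left uncaught on 3.11, the traceback should additionally underline the `point_2.x` part of the line with carets, which is the improvement being praised here.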
Starting point is 00:07:23 Yeah. And it goes deep, so if you have a deep stack trace, it'll show you exactly where in it. And there's even another example where it shows deep into a dictionary. A four-level-deep dictionary dereference or something, right? And it points out exactly which index
Starting point is 00:07:41 is the one that's messing up. So that's pretty amazing. Also, even math, arithmetic expressions, like a division by zero, you've got multiple divisions, which one is the problem? And it'll show you exactly which one it is. The thing I love about this change is this is one of those things, this is absurdly difficult. This is like acres of computer science and a bunch of people working together on this for I couldn't even imagine how long it took them to make something which is just a beautiful little incremental improvement to our lives as Python developers. But the release notes actually talk about some of the internal changes they had to make to get this to work. This is like really
Starting point is 00:08:20 deep stuff and it's totally worth it for what you get out of it. But I think it's easy to look at this and think, okay, that's a reasonably sensible small change. And this was not a small change at all. And I think it's going to dramatically increase the on-ramping of new people into Python because being able to figure out what's wrong with your code, that's basics. I mean, some of us old hatters are used to digging into confusing tracebacks, but some new people are not. So if we can make them less confusing, that'll be great. Right. When I work with new programmers, it's so common. They get a traceback and they freeze because this utter, utter meaningless junk has just shown up on their
Starting point is 00:08:59 screen. And what are they supposed to do with that? And here it feels like this is just such a huge improvement because at least it's pointing to the bit in the giant blob of text that they should be paying attention to. Yeah, lovely. I want it in 3.10 though, but we have to wait till 3.11. From __future__ import nice stack trace, or traceback.
Starting point is 00:09:17 Yeah, very cool. All right. So Simon, you got the third one. Tell us all about it. Okay. So Fly.io are a hosting provider who, I think, launched about a year ago. I've been following along because they're doing some really interesting stuff around hosting Docker containers, and all my stuff is in Docker containers,
Starting point is 00:09:33 so I'm always looking for things where I can throw a Docker container at them and have it hosted online. Their secret sauce is that they do geographic hosting. So you can ask them to run your container in, like, Tokyo and San Francisco and London, and they will do that and they will direct traffic to the closest version of that app. So it's this thing, I worked at Eventbrite for many years. And one of the things I was always trying to figure out was, okay, could we run Eventbrite close to our users? Could we have a database in Europe and a database in New York and give people a faster experience that way? Incredibly difficult to do.
Starting point is 00:10:06 Right. What a lot of people do is they do CDNs, so the static content, but then there's one server somewhere that is really the one. It's the database. It's the application code and then it's the database server especially.
Starting point is 00:10:19 And so what Fly.io are doing is making it so much easier to do this that you could start a project and have it geographically distributed from day one without having to think particularly hard about it. I like that about them. This article came out within the last week, I think. It talks about their plan for multi-region databases.
Starting point is 00:10:37 In that case, they're talking about Postgres and this desire to have Postgres databases distributed around the world. And so when you're doing that, having writes go to multiple places remains incredibly difficult. But a very common pattern is you say, okay, we're going to have the leader database in, I don't know, New York, and all of the writes go to that. And then the reads get spread out to a replica database
Starting point is 00:11:03 that's running in different places around the world. And that's still a really difficult thing to set up with the geographic load balancing. So what they propose is basically run your application all the way around the world and set it up so that if anyone tries to write to the database and they're not talking to the leader database server, the error gets caught and the application server replies to the Fly CDN and says, hey, rerun this request against the leader database in New York. And so the user doesn't see anything at all.
Starting point is 00:11:31 The user attempts to do something, and it works. And what's actually happened is they tried to do a write against Tokyo. Tokyo said, oh, we can't handle writes. Fly invisibly internally redirected to New York. The write happened against New York, and the result came back. And so this takes geographically distributing your database reads, which used to be... I mean, I was thinking it was going to be a team of engineers for six months to get this working, and it's just baked into their platform.
Starting point is 00:11:57 It's this incredibly elegant piece of systems engineering design that they've done. And I was fascinated. I've banged my head against this problem for so long and they just solved it. You know, they just said, hey, here's a way it will work. We've shipped it. Try it out. As something of an architecture nerd, this really fascinated me.
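The write-forwarding flow described above can be sketched in a few lines. This is a simulation under stated assumptions, not Fly.io's implementation: the `Database` class, the error type, the region codes, and the response shape are all hypothetical, and the `fly-replay` header name and the 409 status are assumptions that should be checked against Fly's documentation.

```python
import functools

PRIMARY_REGION = "nyc"  # hypothetical region code for the leader database
REPLAY_HEADER = "fly-replay"  # assumed header name; verify against Fly's docs


class ReadOnlyDatabaseError(Exception):
    """Stand-in for the driver error raised when a replica is asked to write."""


class Database:
    """Toy database: only the instance in the primary region accepts writes."""

    def __init__(self, region, primary_region=PRIMARY_REGION):
        self.region = region
        self.primary_region = primary_region
        self.rows = []

    def insert(self, row):
        if self.region != self.primary_region:
            raise ReadOnlyDatabaseError("cannot write in a read-only transaction")
        self.rows.append(row)


def replay_writes_on_primary(handler):
    """Turn a failed replica write into a 'replay this request elsewhere' response."""

    @functools.wraps(handler)
    def wrapper(db, *args, **kwargs):
        try:
            return handler(db, *args, **kwargs)
        except ReadOnlyDatabaseError:
            # Nothing was written locally; ask the proxy to re-run the request
            # in the region that hosts the leader database.
            headers = {REPLAY_HEADER: f"region={db.primary_region}"}
            return {"status": 409, "headers": headers, "body": ""}

    return wrapper


@replay_writes_on_primary
def create_event(db, name):
    db.insert(name)
    return {"status": 201, "headers": {}, "body": name}


tokyo = Database(region="nrt")
new_york = Database(region="nyc")

replica_response = create_event(tokyo, "PyCon meetup")
primary_response = create_event(new_york, "PyCon meetup")
print(replica_response)
print(primary_response)
```

The replica never applies the write; it only signals where the request should run, so the application code stays the same in every region.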
Starting point is 00:12:16 This is fascinating. Yeah. And I can see just, you know, we've got like the retry decorators and stuff for various Python functions. Like I could see almost a, you know, like retry-the-write decorator that you put on them. It catches the error and it just goes, nope, we're going to send it where it goes and then return the result. Right.
Starting point is 00:12:35 Like basically put decorators anywhere you're ever going to do a write and you're good to go. Exactly. And in fact, they've even got example code for Ruby on Rails. We don't even have to do that. They catch the database error that says you tried to do a write in a read-only transaction, and they turn that into an HTTP header that replays it against the lead region. And that's it. On the one hand, it's kind of an awful, kludgy hack, but it's also genius. This is taking six months of engineering work and turning it into add these
Starting point is 00:13:05 five lines of code and now your application works all the way around the world. It fascinates me. Yeah, this is pretty interesting. Yeah. Also, there's one other link in the show notes. There's a second article they put out a few days ago, which is just doing something... It's more about using Redis as a cache in your geographical data centers. So you can have a local Redis, because their argument is people in London tend to be interested in other things that people in London are interested in, ditto for Tokyo. So actually distributing your cache by city normally gives you really good cache hit rates.
Starting point is 00:13:42 But they also pointed out that, and I didn't know that Redis could do this, Redis can be set up to allow writes to supposedly read-only replicas. So you can have a local cache that you're writing to and reading from, but still have that leader Redis in your main data center that can send writes out to all of those replicas. So that gives you cache invalidation from a central point. You can, in your sort of lead Redis, you can say, okay, everyone delete the cache entry for whatever this thing is. And all of those replicas around the world will then delete that cache entry, even though normally they're acting independently. And yeah, it's again, this is for, if you're a systems architecture design nerd, the stuff that they're doing is so interesting. I think it's interesting. And I'm not one of those, but maybe you are and you didn't realize
Starting point is 00:14:25 you will be next year. You will be next year. Fantastic. Yeah. This is super cool as well. And yeah, it seems really useful, you know, and it's perfectly in line with like, let's take our app and put the logic in multiple places because that person is unlikely to move from Tokyo to Virginia during a session. Right. Once they start in one place, they're going to stay in that place. So the cache would reasonably just have like their local data on that one instance, right? Yeah. Yeah, cool.
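The cache layout described here, independent regional caches plus central invalidation from a leader, can be sketched with plain dictionaries. This is an in-memory stand-in to show the data flow, not Redis commands or configuration; the class names and keys are hypothetical.

```python
class RegionalCache:
    """Cache local to one city; it accepts its own reads and writes."""

    def __init__(self, region):
        self.region = region
        self.entries = {}

    def set(self, key, value):
        self.entries[key] = value

    def get(self, key):
        return self.entries.get(key)

    def delete(self, key):
        self.entries.pop(key, None)


class LeaderCache:
    """Central node whose deletes are pushed out to every attached replica."""

    def __init__(self):
        self.replicas = []

    def attach(self, replica):
        self.replicas.append(replica)

    def invalidate(self, key):
        for replica in self.replicas:
            replica.delete(key)


leader = LeaderCache()
london = RegionalCache("london")
tokyo = RegionalCache("tokyo")
leader.attach(london)
leader.attach(tokyo)

# Each region caches what its own users ask for.
london.set("page:/events/london-pyday", "<html>London</html>")
tokyo.set("page:/events/tokyo-pycon", "<html>Tokyo</html>")
# Some content is global and ends up cached everywhere.
london.set("page:/about", "<html>v1</html>")
tokyo.set("page:/about", "<html>v1</html>")

# The global page changes, so the leader tells every region to drop it.
leader.invalidate("page:/about")
print(london.get("page:/about"), tokyo.get("page:/about"))
```

Local entries stay untouched by the invalidation, which is why the regions can act independently most of the time and still be corrected from one place.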
Starting point is 00:14:58 But maybe your CDN, or not your CDN, your CMS has generated a page and everybody needs that always to be in sync, right? There's that global data as well. Yeah. So very cool. I like this. Check it out. Indeed. Well, let's talk about unicorns.
Starting point is 00:15:11 I love the unicorns. So unicorns, the magical creature. And Simon, I'm so glad that you're here because we can get your thoughts on this, even if you maybe haven't been like deep down in it. So not too long ago, we talked about HTMX, and I'm still a big fan of HTMX. It's a cool sprinkling of JavaScript magic onto your page to make it more interactive. And if you're doing Django, HTMX is very relevant. But there's also this thing called django-unicorn at django-unicorn.com. It's a magical full stack framework for Django. So the
Starting point is 00:15:44 idea is that you can create these templates, these interactive templates without going and rewriting everything in like some front end framework like React or something like that. You can skip the JavaScript build tools because, you know, you got a lot less of that. And you can skip a bunch of serializers and just use Django for like the API bits. So you install Unicorn, you create a component. And then at the top of your template, you put, you know,
Starting point is 00:16:09 {% load unicorn %}. And then you can just give it one of these names. So for example, here's a little task. Task one is tell people about Unicorn. I can add that as too many. I'll tell people about Unicorn. And you can see like this cool little thing is interacting and it's not refreshing the page, right? It's like a front-end framework type of thing. But the way that you write it is you just put some extra template pieces on there, like unicorn:submit.prevent,
Starting point is 00:16:38 and you're going to do this add function instead. And if somebody hits the escape key, we're going to change the value. And that's not JavaScript. Those are just HTML attributes, but they turn into JavaScript, right? Which is very cool. So, and then you just put your regular Django
Starting point is 00:16:54 template business down and off it goes. And it turns it into basically something that's way more front-end framework friendly. Simon, what do you think? So as far as I can tell, the real magic here is that they're using, they're doing the trick
Starting point is 00:17:08 where you render the HTML on the server, in this case, reusing your Django template. And then they send back JSON with a blob of HTML in it, which you then essentially write into an innerHTML
Starting point is 00:17:18 to update the page. And I love this pattern. Like this is sort of fun. I've always been a big fan of the progressive enhancement method of writing JavaScript, where you get the stuff to more or less work without any JavaScript at all. And then if there's JavaScript, then you get in-page updates and all of that kind of thing. But one of the problems I've seen with lots of engineering shops that try and do that is that you end up writing your templates twice. You have the Django templates that know how to do something, and then you have
Starting point is 00:17:47 front-end templates using React or Handlebars or whatever that know how to do something, and you have to keep those in sync, which is an enormous waste of time for everyone involved. So what they're doing here then is they're cleaning up that inconsistency for you. You write a Django template. They can use that template in Python code to generate just that fragment of HTML, send that back and have that displayed on the page. So yeah, I think this is a really interesting approach. I've not spent much time with Django Unicorn itself, but it also reminds me a bit of the, I think it's called Hotwire.
Starting point is 00:18:21 The Ruby on Rails community built this very exciting framework, again, against these kinds of principles, just shipping blobs of HTML back and forth. I feel like it's something like the mad rush towards single-page applications over the past 10 years has mostly resulted in applications that load slower and take longer for people to build. And they're so inconsistent, and they make me so crazy. For example, I'll go to like a bank or something and I'll say, all right,
Starting point is 00:18:50 I'm going to run my 1Password, pre-fill the page. And you'll see it fill out the page. And then you try to submit it. It goes, please fill out this field. And there's clearly like an email address or something in there. What do you got to do? Go put a space, delete the space.
Starting point is 00:19:03 So the JavaScript event triggers because they're like, not really, not really HTML. It's all that junk. And it's just like, yeah, you know what I mean? But it turns out, what people actually want is they don't want a full page reload. Like anyone who's getting into single page apps and so on, really, they just don't want that flicker when the browser reloads everything. So using this trick where if JavaScript is available, you update a section of the page using stuff that came back from an AJAX API totally works.
Starting point is 00:19:30 And that feels like the model here and also the Hotwire model from Rails. Exactly. Yeah. So the HTMX, the Hotwire, and this, it's all about, let's not write new stuff. Let's just take the views and the templates already doing their magic. And let's just put the little pieces in there to make them dynamic, which I'm all about this. This is great. What I've missed is why is this a Django thing?
Starting point is 00:19:51 Is it because it uses the Django templates or is that? It looks like it. Yeah. It looks like the magic here is that it's using Django templates. And the view as well. It provides its own views because it needs to provide views that provide a JSON API where you can send it data from a form. It then renders that Django template in Python code and then sends you back the stuff.
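The pattern described here, one server-side template used for both the full page and the JSON-wrapped fragment, can be sketched with the standard library alone. This is not django-unicorn's API: the template, function names, and payload shape are assumptions chosen to show the idea.

```python
import json
from html import escape
from string import Template

# One template for the task list, shared by the full page and the fragment.
TASK_LIST_TEMPLATE = Template('<ul id="tasks">$items</ul>')


def render_task_list(tasks):
    """Render just the fragment of HTML that changes."""
    items = "".join(f"<li>{escape(task)}</li>" for task in tasks)
    return TASK_LIST_TEMPLATE.substitute(items=items)


def full_page(tasks):
    """Initial page load: works with no JavaScript at all."""
    return f"<html><body><h1>Tasks</h1>{render_task_list(tasks)}</body></html>"


def add_task_endpoint(tasks, request_body):
    """JSON endpoint: apply the form data, re-render the fragment, send it back."""
    payload = json.loads(request_body)
    tasks.append(payload["task"])
    return json.dumps({"html": render_task_list(tasks)})


tasks = []
response = add_task_endpoint(tasks, json.dumps({"task": "Tell people about Unicorn"}))
print(response)
```

On the browser side, a small script would take the `html` field from the response and assign it to the matching element, so the template logic lives in exactly one place.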
Starting point is 00:20:13 So there's two sides to this, right? There's the Python Django view functions they've written, but they've also written a sort of eight kilobytes, I think, of JavaScript that hooks it up on the front end. Cool. Nice. Yeah. Yep. Very neat. So not very much code at all to get your Django to become more dynamic, which is great. Yeah. So, um, I don't think unicorns are blue.
Starting point is 00:20:36 I'm not really sure what color unicorns are. I feel like they could be any color, like they might be rainbow. But that's actually not a rainbow. It's not a rainbow. I want to talk about Blue, and I think I'm ready to have tomatoes thrown at me or something for bringing this up. But so Blue is an alternative to Black. Anyway, so I love Black. I think Black's awesome. But there are times where you can't use it for specific reasons.
Starting point is 00:21:12 And I'm thinking here basically about the decision that Black made to, not default to, but enforce double quotes on strings instead of single quotes. There are some code bases where there's already a standard to use single quotes. And then there's also code bases where there's so many strings that actually have mixed quotes. So you've got single quotes and then double quotes inside. And you know, mine end up mixed sometimes because if I want to quote something in the actual string, I'll use single quotes on the outside. But if I'm going to say it's a good idea, I'll put double quotes on the outside so I don't have to escape the single quote. You know, like if you're going to have one of the quotes in the string, then just go with the other one is often something I'll end up doing. Oh, actually, Black does that for you.
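The "pick the other quote so nothing needs escaping" habit can be written down as a small rule. The sketch below is a simplified approximation of that idea, with a configurable preferred quote to mimic a double-quote or single-quote default. It is not Black's or Blue's actual implementation, and the function name is made up.

```python
def choose_quotes(text: str, preferred: str = '"') -> str:
    """Wrap text in the preferred quote unless the other quote needs fewer escapes."""
    other = "'" if preferred == '"' else '"'
    # Switch away from the preferred quote only when it appears more often
    # inside the text than the alternative does.
    quote = other if text.count(preferred) > text.count(other) else preferred
    escaped = text.replace("\\", "\\\\").replace(quote, "\\" + quote)
    return f"{quote}{escaped}{quote}"


print(choose_quotes("it's a good idea"))       # double-quote preference
print(choose_quotes('say "hi"'))               # falls back to single quotes
print(choose_quotes("hello", preferred="'"))   # single-quote preference
print(choose_quotes("it's", preferred="'"))    # falls back to double quotes
```

Either default produces the same result for strings that contain only one kind of quote, which is why the disagreement is mostly about plain strings with no quotes in them.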
Starting point is 00:22:02 If you've got a string with a double quote in, that's the one time that Black will use single quotes, which is kind of neat. Okay, that's good. Good to know. I do like that. So if the sticking point is really just the quotes, then maybe try Blue. So Blue is actually...
Starting point is 00:22:19 I was worried there was going to be a fork of Black. It's not a fork. It sort of includes Black and overrides some of the functionality, specifically just a few things. So the differences are: it defaults to single-quoted strings, except for places where we love double quotes, like docstrings and triple-quoted strings. For some reason those look weird with single quotes, so I'm on board with that. It defaults the line length to 79, and I don't really care because I always override that to like 120 or something like that. And I like that
Starting point is 00:22:57 Black allows that overriding. And then the other thing that I didn't even think about, which is kind of nice: one of the things Black does is take the hash, like if you have hash comments on the right side of your code, you've got a block of them, like maybe you're talking about an entire block of code, so you have a block of comments. Black will remove the white space in front of the hash, whereas Blue will leave those alone. So you can have block comments on the side. That's really it. Those are the only differences. And I think having this around is a neat thing. Interesting quote from the docs is that they actually don't want to keep this project alive very long. They'd really like these to just be options in Black. I don't know how far
Starting point is 00:23:43 they'll get, but I don't think that's going to happen. I think Black is pretty hardcore, and they're very into not adding configuration where they can avoid it. Yeah. In researching this, one of the things I somehow missed about Black, maybe I haven't read the documentation in a long time, but a couple of years ago it added the ability to have fmt: off and fmt: on. So one of the things, for instance: occasionally, not very often, I've got a large chunk of data set up in a list or dictionary or something, and I have them aligned with comma alignment, like an old-style CSV table. Like a 1980s C programmer. Yeah, oh, sure. But Black totally tears that apart. But for that you can turn formatting off, and I appreciate that. That's cool. That's a good feature. See, it does have a little bit of give.
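For the aligned-table case mentioned here, the `# fmt: off` and `# fmt: on` comments fence off a region that Black leaves as written. A short sketch follows; the table contents are invented for illustration.

```python
# fmt: off
# Hand-aligned columns: room name, temperature (C), humidity (%).
SENSOR_ROWS = [
    ("kitchen",      21.5,   40),
    ("living room",  22.0,   38),
    ("attic",        27.25,  55),
]
# fmt: on

# Code after "fmt: on" is formatted normally again.
warmest_room = max(SENSOR_ROWS, key=lambda row: row[1])[0]
print(warmest_room)
```

Without the two comments, the formatter would collapse the extra spaces after each comma and the column alignment would be lost.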
Starting point is 00:24:46 But yeah. Yeah, that's cool. Yeah, very good one. Very good one. What do we got next? Oh, okay. So there's a link in the show notes to this. This is an article that somebody wrote about using Tesseract OCR to build yourself a searchable index of your screenshots.
Starting point is 00:25:08 And I got really excited about this because Tesseract is like, Tesseract's been around since 1995, I think. It started off at Hewlett-Packard. And it's pretty much still the leading light of OCR in the open source space. But I've never managed to get it to work. And I've always wanted OCR that I can just run. And thanks to this article, I can actually use Tesseract now. So I've got a couple of demos here. Can we see this? Yeah. So I grabbed a screenshot just of the random slide from our conversation earlier, and I can run, let's see, I think it's Tesseract
Starting point is 00:25:35 screenshot.png. I'll put it in a file called screenshot. Dash l, you have to tell it the language that you're using because that affects how it does these things, and it supports like 70-odd languages, I think. And I'm going to say I want that as a txt file. And you run it, and now if I cat screenshot.txt: this is the launch today, MongoDB 5.0. This is the screenshot I took of our conversation earlier. A better example even would be, I took a screenshot of Python documentation just now, so I can run that same command, except I'll do it against Python docs.png. I'll call it pscreenshot. There we go. Okay. And now if I cat this, this is pretty decent OCR against a screenshot of a pile of documentation. The really fun thing though, is that you can say you want it as a PDF file. And if you do that, it will give you a PDF, which is visually identical to the screenshot,
Starting point is 00:26:29 but has selectable text on it. So you can copy and paste out of that PDF. So the chap whose article is linked in the notes, his trick is he has a folder on his computer that he saves screenshots to, and he has an automated script that then turns those screenshots into these annotated PDFs, which means that Spotlight on his Mac can now search them. So anything that he drops into that folder, a few seconds later becomes available to global search on his computer. I think that's a really neat trick. I love it. That's great. So yeah, there's so much stuff I want to do with this. Yeah, it was Alexandru Nedelcu, I don't know if I'm pronouncing that correctly, who wrote all of this up. But yeah, you can install
Starting point is 00:27:16 it with Homebrew. It's brew install tesseract. There's actually a Python library called pytesseract, which I thought was doing complicated things with C modules. Actually, if you read the source, it's just shelling out to this command. So apparently that's the state of the art in Python OCR: shell out to the tesseract command line tool, which I'm perfectly happy to do. You know, yeah, it's neat. I really like this. You know, if you've got a bunch of image data and you want to be able to do interesting things with it, like here's a really quick and easy way to do it, right? Right. It's super simple. This article also, I didn't know that you could use the Mac launchd, I think. You can add a launch agent,
Starting point is 00:27:57 which automatically runs a script when a file is saved in a certain folder. So in this case, he's got a launch script that runs the Tesseract OCR stuff. But this is great, right? Now I can automate any folder on my Mac to do basically anything using this system that's built into the operating system that I didn't know how to use. I didn't know you could do that either.
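Shelling out to the tesseract command from Python, as described above, needs very little code. The sketch below assumes the `tesseract` binary is installed and on PATH, and it mirrors the command from the demo (input image, output base name, `-l` language, output format). The helper names are hypothetical, and only the command-building step runs in the example.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image, output_base, lang="eng", output_format="txt"):
    """Build the argument list: tesseract <image> <output base> -l <lang> <format>."""
    return ["tesseract", str(image), str(output_base), "-l", lang, output_format]


def ocr_image(image, output_base, lang="eng", output_format="txt"):
    """Run tesseract and return the path of the file it is expected to write."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract is not installed or not on PATH")
    command = build_tesseract_command(image, output_base, lang, output_format)
    subprocess.run(command, check=True)
    # tesseract appends the extension to the output base name itself.
    return Path(f"{output_base}.{output_format}")


# Only build the command here; calling ocr_image() requires tesseract and a real image.
command = build_tesseract_command("screenshot.png", "screenshot", output_format="pdf")
print(" ".join(command))
```

Asking for `pdf` instead of `txt` is what produces the searchable PDF that a folder-watching script could drop next to each screenshot.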
Starting point is 00:28:15 That's great. That's cool. Yeah, that's awesome. I feel like this is right up your alley, Simon, you know, with Datasette, Dogsheep, and like, oh, here's this data we got from this automation, and yet I just can't dig into it. And now you can. I'm really excited about this. Although, so Apple Photos, in the next version of macOS, Apple Photos is going to do
Starting point is 00:28:36 OCR on all of your photographs for you. So you can search for text in pictures that you've taken. And, um, if it's anything like the current version of OSX Photos, all of that data is going to be stored in SQLite databases on your computer. I've been having a huge amount of fun building things against my Apple Photos library because they already run machine learning labeling against your photos. They know when you take a photo of a dog and they tag it with dog, and the word dog is in a SQLite database on your computer. So once you've figured that out, you can run SQL queries against photos you've taken and say, show me every photo I've taken of a dog
Starting point is 00:29:12 that was in San Francisco in the month of May. And you get results back, which is crazy interesting. Yeah. That's pretty cool. Yeah, that's super cool. I love the stuff that you're doing with that. Is it just local or are they caching that in their own databases as well? Oh, well, so they synchronize it all.
Starting point is 00:29:32 So if you're using iCloud, your photos are synchronized up to their servers. You take a photo on your phone, it shows up on your computer automatically. But the actual local data storage is all SQLite database files. Apple are really big into SQLite. So yeah, there are just these files littering your computer with your address book in there and all of your iMessages and all of your photo metadata. It's just sat there waiting for you to dig in and play with it. Nice.
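To make the "dog photos in San Francisco in May" query mentioned a moment ago concrete, here is a small sketch using Python's built-in sqlite3 module. The `photos` and `labels` tables, their columns, and the sample rows are invented for illustration; Apple's real Photos database has a different and far more complicated schema.

```python
import sqlite3

# Hypothetical, simplified schema: one row per photo, one row per ML label.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE photos (id INTEGER PRIMARY KEY, taken_at TEXT, city TEXT);
    CREATE TABLE labels (photo_id INTEGER REFERENCES photos(id), label TEXT);
    INSERT INTO photos VALUES
        (1, '2021-05-03', 'San Francisco'),
        (2, '2021-05-10', 'San Francisco'),
        (3, '2021-06-01', 'San Francisco'),
        (4, '2021-05-12', 'Portland');
    INSERT INTO labels VALUES (1, 'dog'), (2, 'cat'), (3, 'dog'), (4, 'dog');
    """
)


def photos_with_label(conn, label, city, month):
    """Return ids of photos with the given label, city, and two-digit month."""
    rows = conn.execute(
        """
        SELECT photos.id
        FROM photos
        JOIN labels ON labels.photo_id = photos.id
        WHERE labels.label = ?
          AND photos.city = ?
          AND strftime('%m', photos.taken_at) = ?
        ORDER BY photos.id
        """,
        (label, city, month),
    ).fetchall()
    return [row[0] for row in rows]


dog_photos = photos_with_label(conn, "dog", "San Francisco", "05")
```

With the sample rows above, only photo 1 matches all three conditions.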
Starting point is 00:29:57 With Datasette, probably. Right? Yep. I've got a script called, I'll add it to the show notes. I've got a script called dogsheep-photos, which uploads your photos to your own S3 bucket so that you can actually link to them, embed them on webpages. And it extracts all of that SQLite data into a more usable format. So yeah, I've got an online database of all of my photographs
Starting point is 00:30:20 that I update every now and then with the script. And it works. It's phenomenal what you can do with it. Out in the live stream, Brandon, hey Brandon, says this is fantastic, definitely excited. And also taking a step back to yours, Brian, David Colton, hey David, says I'm using double quotes now in Black, but my typing has not evolved yet to double quotes. So you just pass it through the single quote to double quote compiler process called Black and then you got it all adapted. That's nice.
Starting point is 00:30:49 I'd say Black has given me back, I estimate 5% of my program typing time used to be worrying about indentation and such like. And I got all of that back. Like thanks to Black, I never even think about how I indent or style my code at all. I just say, I'll literally
Starting point is 00:31:06 write horrible run on lines that go on for ages and then run Black and it formats it nicely and I forget about it. It's wonderful. It's fantastic. That's cool. Yeah, great. Got any extras for us, Michael? You know I do. I always do. Unless I have an extra, extra, extra. You're all about it. Then I guess I still do. So we talked about strongtyping last time, which lets you do cool stuff like go and put a decorator onto a function and say, well, this one, you know, if it has type annotations or type information like Python itself just does, if you put the @match_typing decorator on there, it'll verify at runtime that you said it took an integer and you actually passed an integer, not a list or whatever to that parameter, right?
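For readers who want to see the idea of runtime checking against annotations, here is a minimal standard-library sketch. It is not the strongtyping implementation: `check_types` is a hypothetical decorator written for this note, and it only handles plain classes such as `int` or `str`, skipping anything like `List[int]` or `Optional[str]`.

```python
import functools
import inspect
from typing import get_origin, get_type_hints


def check_types(func):
    """Raise TypeError when an argument does not match its annotation."""
    hints = get_type_hints(func)
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            # Only plain classes are checked; parameterized generics are skipped.
            is_plain_class = isinstance(expected, type) and get_origin(expected) is None
            if is_plain_class and not isinstance(value, expected):
                raise TypeError(
                    f"{name} should be {expected.__name__}, "
                    f"got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper


@check_types
def repeat(word: str, times: int) -> str:
    return word * times
```

Calling `repeat("ab", 2)` works as usual, while `repeat("ab", "2")` raises a `TypeError` before the function body runs.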
Starting point is 00:31:52 Well, Felix, who maintains this project, reached out and said, hey, that actually does a whole lot more. You should check some other things out. I just wanted to highlight a couple of things that he pointed out. One, if we, you know, we're all familiar with the named tuple
Starting point is 00:32:03 and you say the type name in a quote, and then you say the fields or the elements attributes in a list, either space or comma separated, like spell, mana, effect, and so on. So this one has a typed named tuple where you can put the type information in very similar ways to what Python would have, like colon, str, colon, list, and so on. And then you get actual type runtime validation that your data going into your named tuple is actually the type of data you expect in your named tuple. Oh, nice. That's neat. Yeah. Yeah. So there's that. And then also, I love this about our show. It kind of blows my mind that this is how the world works. And I really appreciate this. Everyone who plays along will say things like, oh, I wish we could specify indexes in Beanie.
Starting point is 00:32:50 And then the next episode, we're like, hey, look, Roman added a way to do indexes in Beanie. And I said, this is awesome that it applies to functions, but why couldn't it apply to classes? It's basically the same thing. And so now, six days ago, we have a new feature: you can also apply strongtyping to classes as well, or something like that. So well done, well done. Is it because you asked for it? Because, I mean, I asked for single quotes in Black and I didn't get that. But, well, I mean, it also may depend on the size of the project. The more input they get, the less influence any individual statement may have on it, right? Yeah. Anyway, Felix, thanks for working on that and the extra information there.
Starting point is 00:33:35 Yeah. Actually, one other thing. Yes. I have finally, I've been working to make sure that we don't have to have one of these completely useless, dreadful talks on technology. Our site uses cookies. Here's our cookie policy.
Starting point is 00:33:51 Do you accept our cookie policy or do you not accept our cookie policy? AKA, would you like our website to work or would you like to go away? Like that's kind of what the button so often means, right? And so I thought I removed all the analytics. I removed anything else that we might be doing third party. We're good.
Starting point is 00:34:08 And I went to Python Bytes and I'm like, wait, there's DoubleClick. There's Facebook. There's Google. There's like, what is all this stuff? And we started including the live stream YouTube embed and it started bringing back. And I'm like, why would Google be putting in Facebook? That sucks. And there was also the Disqus conversation stuff
Starting point is 00:34:27 that people have really stopped using. They all just go and chat on the YouTube streams now if they want to have a live comment type of thing. So I'm like, well, I'll just take that out. That got rid of the Facebook one. And then, but what do you do about that? So instead of embedding the YouTube player, I said, I'm going to figure out a way to get the picture
Starting point is 00:34:44 automatically from YouTube, the poster. And then when you hover over it, it just has a play icon. It says play on YouTube and it opens up a new window. And I thought I was all clever by just putting the image there, but serving it from Google. No, there's now like the YouTube image servers putting tracking cookies on our site. I'm like, well, come on. Why is this so hard? So now on the server, we use requests. We download the image anytime it has to be shown on a page, put it in MongoDB.
Starting point is 00:35:12 And then if you pull it, we serve it back out so we can like strip the cookies, the tracking cookies out. Nice. And now, when you look at the tracking content, none detected on the site. But why, world, does it have to be so hard? I just want to- Isn't it amazing how it used to be YouTube embeds were the absolute gold standard
Starting point is 00:35:31 for embedding video on a webpage. Like why would you do anything else? And now actually I'm beginning to think, you know what, host the video, the .mp4 or .mov file or whatever, yourself and stick on an HTML5 video embed. And that's probably a better experience
Starting point is 00:35:44 for your users as well. Because, you know, when they click the video on their mobile phone, it'll play full screen and they won't have to hop through to the YouTube app and all of that kind of thing. Yeah, absolutely. Yeah, so anyway, just quick shout out,
Starting point is 00:35:55 like this is taking several passes, but I think it's finally 100% no tracking. I mean, we weren't putting it there before, but like it was seeping in from just like what we might include on the page as content, right? So anyway, there you have it, Brian. That was my weekend.
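The server-side poster image trick Michael describes (download the YouTube thumbnail once, store it, and serve your own copy so the visitor's browser never talks to YouTube) can be sketched roughly as follows. This is a hedged illustration, not the Python Bytes site code: the show uses requests and MongoDB, while this sketch substitutes the standard library's `urlopen` and any dict-like object as the cache so it stays self-contained. The function names are hypothetical, and the `img.youtube.com/vi/<id>/hqdefault.jpg` pattern is YouTube's commonly used thumbnail URL.

```python
from typing import Callable, MutableMapping
from urllib.request import urlopen


def thumbnail_url(video_id: str) -> str:
    """Commonly used YouTube poster image URL for a video id."""
    return f"https://img.youtube.com/vi/{video_id}/hqdefault.jpg"


def fetch_bytes(url: str) -> bytes:
    """Download raw bytes; stands in for requests.get(url).content."""
    with urlopen(url, timeout=10) as response:
        return response.read()


def get_poster(
    video_id: str,
    cache: MutableMapping[str, bytes],
    fetch: Callable[[str], bytes] = fetch_bytes,
) -> bytes:
    """Return the poster image, downloading it only on the first request."""
    if video_id not in cache:
        # The server makes this request, so no third-party cookies reach visitors.
        cache[video_id] = fetch(thumbnail_url(video_id))
    return cache[video_id]
```

A web view would then return `get_poster(video_id, cache)` with an image content type, where `cache` could be backed by a database collection instead of an in-memory dict.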
Starting point is 00:36:10 That was nice. Well, thanks. I appreciate you doing all that work for us. Yeah. David Coles has the wash hands emoji. There we go. We're all better. Well, I've got no extras.
Starting point is 00:36:22 Simon, do you have anything extra you want to share? I've got one. Um, so Textual is the, you know, Will McGugan, who's working on Rich, has been building Textual, which I know you've talked about on the podcast before. What I would encourage people to do is pay close attention, because I've never seen a piece of open source software develop this quickly. Like every day he's posting this video where he's like, oh, and here's the new feature. Today he posted a video of it doing a full tree view on a file system, which you could interact with with your mouse in the terminal. And when you clicked on a file, it would open it in a separate panel with syntax highlighting. It's absolutely astonishing.
Starting point is 00:37:01 It's like turning into one of the better ways of building a GUI application. It's running in text in the terminal. We could almost have just a section of the show called What's Will Up To? You really could. Absolutely. Yeah, he's re-implemented CSS Grid, the CSS Grid mechanism for terminal applications. It's brilliant. And yeah, I'm just having such a great time watching him do all of this stuff.
Starting point is 00:37:25 And he seems to be live streaming it? I don't think so, but he posts like little five minute videos on Twitter every day of the stuff that he's doing. But I feel inadequate watching him work this fast, but just saying. It's such a delight though. It's like he was born to build this piece of software and now he's building it and we all get to watch him do it. Yeah, it's great. Yeah.
Starting point is 00:37:47 Henry Schreiner, hey, out in the live stream says, Textual is amazing. Indeed, it's quite something. Yeah. And I remember when he was trying to name it and Textual didn't even come up on my radar as something that might be possible, but it's so obvious now, like graphical and textual.
Starting point is 00:38:04 Yeah, it makes sense. It's cool. So hey, how about a joke, maybe? Oh man, I got some jokes for us. Two jokes. The one I'm not really sure how to convey, but I guess I'll do my best. I want you to sing. No, man, this is you. This is you. All right. So the first one here, I could definitely do this one. This one is from John on Twitter, but pointed out to us by Nick Moore, who was previously on the show not too long ago. Thanks, Nick. And this one poses a question. I think also this is perfect for when Simon is on the show. It says, what do you get when you SELECT * FROM goblins, dragons, elves, unicorns?
Starting point is 00:38:42 A query tale. Oh, my goodness. A fairy tale. A query tale. It's bad. Oh, wow. Um, well, I wanted to share one that people could actually share with their, uh, this isn't in the list, but one that I just read recently, um, people might be able to share with their kids. Um, in the Northwest we've got Sasquatch, right? So, you know what they call Bigfoot in Europe? Big Meter. Oh. It's pretty bad.
Starting point is 00:39:16 Quick tip. If you're ever near Santa Cruz in California, there is a Bigfoot museum in a log cabin in the woods outside of Santa Cruz called the Bigfoot Discovery Experience. And it is not a joke. It is very serious. And there was a man there who will take you through all of his evidence for Bigfoot. And it takes about an hour. He's got maps and plaster casts of footprints, and a map with pins on it. And it's fascinating. I could not recommend it more. Yeah. I wonder if the COVID pandemic has affected the Bigfoot population. Oh, you should. Well, you can call him up and ask him.
Starting point is 00:39:50 While I was talking to him, he got a phone call to answer questions about Bigfoot. So he will answer your calls. Yeah. All right. Hey, Brian, your joke got a groan all the way from Australia. Nice. Or was it mine? I'm not sure.
Starting point is 00:40:05 It could have been either, honestly. Yeah. I'm going to go with the meter one. They were both pretty bad. All right. Okay. I'll see what I can do with this next one here. So if you're a kid of the 90s, I guess,
Starting point is 00:40:18 it's probably the time. There's Pinky and the Brain. Yeah. And apparently on one of the 10 places I have to write your name, I typed it too quickly and wrote Brain. Yeah, and Brett Cannon caught it. So he did a take on Pinky and the Brain, and it starts out, what do you want to do today, Brian?
Starting point is 00:40:40 Same thing we do every Wednesday, Michael. Help Python take over the world. It's Michael and the brain. Yes, Michael and the brain. One's into testing, other's into GUIs. They're both into making Python seem sane. They're Michael. They're Michael and the
Starting point is 00:40:57 brain, brain, brain. Fantastic. I love it. Phenomenal. Thank you. We need to have somebody that's got musical talent to actually put this together as something. So anyway. Yes. Someone who is not me because it won't come out well. So we'll put it in this, with the lyrics in the show notes, I think we should leave them
Starting point is 00:41:15 there so that- We are accepting submissions. Yes. And if they are, if they pass, we may actually play them on one of the next episodes. Oh, I'd love it. Yeah. Could be the new theme song, Brian. Yeah.
Starting point is 00:41:27 The dawning of an era. I'm getting tired of the old theme song. Yeah, exactly. Which is no theme song. All right. Well, thanks. Thanks a lot for showing up, Michael. And thanks, Simon.
Starting point is 00:41:39 Yeah, thanks for having me. Yep. You bet. Bye, everyone. Thanks for listening to Python Bytes. Follow the show on Twitter via at Python Bytes. That's Python Bytes as in B-Y-T-E-S. Get the full show notes over at PythonBytes.fm.
Starting point is 00:41:53 If you have a news item we should cover, just visit PythonBytes.fm and click submit in the nav bar. We're always on the lookout for sharing something cool. If you want to join us for the live recording, just visit the website and click live stream to get notified of when our next episode goes live. That's usually happening at noon Pacific on Wednesdays over at YouTube. On behalf of myself and Brian Ocken, this is Michael Kennedy. Thank you for listening and sharing this podcast with your friends and colleagues.
