Python Bytes - #216 Container: Sort thyself!

Starting point is 00:00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds. This is episode 216, recorded January 13th, 2021. I'm Michael Kennedy. I'm Brian Okken. And Brian, we have a special guest, Jousef. Welcome. Hi. Great to have you here. You want to just take a quick moment and tell folks about yourself, maybe about your podcast real quick?

Starting point is 00:00:22 Yeah, sure. Thanks for even being able to participate in this podcast. So my name is Jousef. I might not be as well known as you guys are, for sure. I'm a mechanical engineer from Germany and based in Germany as well, and I'm working for an IT company called SimScale, who are providing cloud-based simulation technology. On the side, I'm hosting a podcast called Engineered Mind, and I'm working on a bunch of other stuff. For example, my thesis, which we'll extensively cover, let's see, in this podcast. Yeah, yeah.

Starting point is 00:00:50 You used a couple of cool libraries and stuff over there, which we'll feature here. All right. Well, the very first item, Brian. Oh. Let's talk about maybe doing a pip search a lot. Yeah, well, I kind of forgot pip search was a thing because when I'm looking for PyPI stuff, I go to pypi.org and just type it in.

Starting point is 00:01:13 Yeah, so do I. Yeah, exactly. It's really fast there. Yeah, but there's a feature in pip: you can do a pip search, which the documentation says is supposed to search for PyPI packages whose name or summary contains whatever query. So I can say pip search pytest, for instance, and it should show me, well, whether pytest is a package on PyPI. But right now, if you do that, it comes back with a big traceback, and it says fault, what, minus 32500?

Starting point is 00:01:48 Runtime error. PyPI's XML-RPC. Anyway, the API is broken. So this is on purpose. What happened is, actually, I don't know really what's happened, but the service is getting swamped. The search endpoint is getting hit extremely hard. I saw some message or some tweet that was to the effect of, is somebody out there running an insane number of searches against this endpoint?

Starting point is 00:02:16 Please don't. Yeah, I don't know what's going on. So there's some guesses. Maybe it's a rogue continuous integration server or something weird's going on. But in the meantime, right now, we're going to link to a Python infrastructure status page, which has an update on this. So if anybody wants to follow, you can check that out. It says that the search endpoint remains disabled due to ongoing request volume. And I think this really started becoming a problem mid-December.

Starting point is 00:02:51 And so I'm not sure what happened then. And then there's a related issue on GitHub for pip. So there's an issue open, remove the pip search command. So I think the end result is, and even the error message says, the search endpoint will be deprecated in the near future. So I think that this way to do pip search is just going to go away. And that's actually a little surprising, because usually a lot of these things are so backwards compatible. Yeah. And there's quite a discussion on the issue thread.
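Editor's note: since a misbehaving continuous integration job is one of the guesses for the traffic, here is a minimal sketch of how you might scan your own repository for leftover `pip search` calls. The function name `find_pip_search` and the list of file patterns are assumptions made for illustration, not something from the show or from pip itself.

```python
import re
from pathlib import Path

# Matches "pip search", "pip3 search", and "python -m pip search".
PIP_SEARCH = re.compile(r"\bpip3?\s+search\b")


def find_pip_search(root, patterns=("*.yml", "*.yaml", "*.sh", "*.cfg", "*.toml")):
    """Yield (path, line_number, stripped_line) for each line that runs pip search.

    The file patterns are a guess at where CI configuration usually lives;
    adjust them for your own setup.
    """
    root = Path(root)
    for pattern in patterns:
        for path in sorted(root.rglob(pattern)):
            if not path.is_file():
                continue
            text = path.read_text(encoding="utf-8", errors="ignore")
            for number, line in enumerate(text.splitlines(), start=1):
                if PIP_SEARCH.search(line):
                    yield path, number, line.strip()


if __name__ == "__main__":
    # Scan the current directory and report every hit.
    for path, number, line in find_pip_search("."):
        print(f"{path}:{number}: {line}")
```

Running it from the root of a repository prints one line per hit, which is probably enough to find and delete the offending step.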

Starting point is 00:03:29 But the gist of it is the current architecture was never designed to handle the volume it's getting right now. So there's a comment at the end of the thread that says if you've got an idea for how to do this algorithm better or a way to do it scaled, go ahead and discuss it. But there's a link to, we're not going to put that link in the show notes, but in the PyPI thread or the GitHub thread, there's a link if you want to comment on that. But basically, we're bringing this up. You may have figured it might be a fluke or whatever, but it's really going on. And a plea to look at your continuous integration scripts. And if you're doing a pip search in there, take those out. It ain't going to work anyway. It's got to be some kind of bot, some automatic thing like this, because it's already giving the error message. People would stop if it wasn't. Maybe somebody's trying

Starting point is 00:04:26 like constantly trying to scrape all of the PyPI data out, I don't know. Yeah, why do a search? That's just weird. Yeah, exactly. I don't know what's going on here. But I guess don't do it. Doctor, it hurts when I do this. Stop doing that. So the next one I want to talk about is QPython, not QtPython or Qt or anything like that, but QPython, which is a way to do Python on Android. So we've talked about a couple of interesting applications. We've talked about Carnets or Carnet. I think it's a French pronunciation, I've been told. And that's a really cool way to do like Jupyter on iPad. So local, all these are local, obviously not just running in the browser. There's Pythonista, which is really interesting. And QPython is also an interesting one for a couple reasons, because you get an SDK and a

Starting point is 00:05:17 REPL for your Android device, which is pretty interesting. But what the reason I'm covering it, I think it's interesting. Somebody, I think somebody sent this over. No, I ran across this myself. Anyway, it allows you just to integrate with the underlying Android APIs and features for automation. Cool, right? So you can do things like check the system. You can send out toast notifications. You can interact with applications. You can mess with the clipboard. You can do barcode scanning, speech recognition, send emails, like all those kinds of things around even, you know, screen brightness or checking your battery or whatever.

Starting point is 00:05:58 So if you want to get access and automate your Android things with Python, well, here's a cool little app to do it. Okay, wait a second. So I'm not an Android user that much. I've got like one Android tablet, but I didn't know it can make toast. Yeah, well, it really prefers sourdough, but it will go even as far as rye if you have to. No, what's toast?

Starting point is 00:06:23 Do you know what toast is? It's like a pop-up notification, I think. Oh, okay. Jousef, are you an Android person or an iPhone person? I have to confess I'm an iPhone person. I used to be completely against iPhone, but once you're in the ecosystem, you never get out. It's like The Godfather.

Starting point is 00:06:42 They just keep pulling you back in, man. Yeah, I just recently got a new iPhone as well, and I'm general about it. But because we have our mobile apps, the training for the courses, I've got an Android tablet and I've got an Android phone and so on. Oh, also got a comment here on YouTube. So, is it its own framework, or can you use it in Android, Kotlin, and Java? I believe it's more like an app that you run. And then within that, you can do little jobs and stuff.

Starting point is 00:07:11 It's not something you can bring in that I'm aware of, because you install it from Google Play, for example, to get started and so on. But maybe you can plug it in. They do talk about having an SDK, so possibly. But I got the sense that's more for writing code outside and then getting it on your device. But yeah, pretty, pretty cool. So if you're into Android and you want to do Python automation on it, this is pretty cool.

Starting point is 00:07:34 It's free, get it on the Android store. It apparently has ads, but it's also open source. So go with that. Do you know if there's a counterpart for iOS? I don't know about the automation side. There's a thing called Carnets, which is really cool. Let's see if I can find that. Carnets app. I believe that's how you spell it. Yes, that's Jupyter on the App Store.

Starting point is 00:07:55 And that thing, I don't really want to open the App Store, but apparently I have to. Well, so much for that. But Carnets, it's here. Oh, and it's also on Google Play. Is that the same thing? No, that's a totally different thing. But Carnets or Carnet is a very cool app that lets you do something similar. And there's also Pythonista. Those are the two I know for iOS. All right.

Starting point is 00:08:17 So moving along, Jousef, maybe tell us a little bit about your research and then some of the, you know, one of the libraries you've been working with here. Yeah, sure. So Open3D is one of the possibilities to visualize 3D projects. I had it out of order. Yeah, yeah. Let's talk about PyTorch3D first. That's fine.

Starting point is 00:08:35 So PyTorch3D is basically an option, let's say, if you work with meshes. Let's say a mesh consists of edges and points, for example, and these edges connect all the points, and what you get at the end is a mesh. So PyTorch3D, which is from Facebook, Facebook AI Research, and they created this framework, so to speak, to be able to work efficiently with 3D data. So unfortunately, I'm using point cloud data.
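Editor's note: the mesh versus point cloud distinction described here can be sketched in plain Python. This is only an illustration of the idea (points plus the edges that connect them); it is not PyTorch3D's data model, and the helper `neighbors` is a hypothetical name.

```python
# Four corner points of a unit square, stored as (x, y, z) coordinates.
points = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (1.0, 1.0, 0.0),
    (0.0, 1.0, 0.0),
]

# A mesh adds connectivity: each edge is a pair of indices into `points`.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]

# A point cloud is just the coordinates, with no connectivity at all.
point_cloud = list(points)


def neighbors(vertex, edges):
    """Return the set of point indices connected to `vertex` by an edge."""
    found = set()
    for a, b in edges:
        if a == vertex:
            found.add(b)
        elif b == vertex:
            found.add(a)
    return found


print(neighbors(0, edges))
```

With a mesh you can ask which points are connected to which; with a bare point cloud that question has no direct answer, which is part of why the two need different tooling.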

Starting point is 00:08:59 But the beautiful thing is that if you use a native PyTorch implementation, which you could use for your 3D geometry, it runs, I wouldn't say significantly, but roughly 10 times slower than this PyTorch3D, which is implemented especially for 3D problems. Wow. Okay. So what kind of problem do people solve? What problem are you solving when you're working with this? Yeah. So in the beginning, I was doing some kind of research. Unfortunately, there are papers coming out like every day, too many actually, in the field of deep learning, especially when it comes to point cloud or geometric data. And the goal, just to inform the audience a bit: my goal is basically to use deep learning to create some kind of assistant system for engineers and designers.

Starting point is 00:09:48 That means, let's say you're an engineer and we have these CAD models. So CAD stands for computer-aided design. So we'd create a model, for example, of a gear and then you would have that gear. But sometimes we have this differentiation between implicit knowledge and explicit knowledge. Explicit knowledge means this is existing knowledge, which we already know about. Let's say this knowledge can sit in a database and sometimes we are not making use of it. And then we have this implicit knowledge: let's say an engineer comes into a company, is completely new, and he brings knowledge with him to the company. Now, the problem I want to tackle is, because we have so much data and we're accumulating geometric data in a company, we have to make use of that. And my approach is, hopefully,

Starting point is 00:10:23 when I'm at the end of the thesis, which is like in roughly two months, is that I have a system or web application as a front end where the engineer or designer picks or starts a design or picks a point cloud or a design. And then it would suggest the engineer or designer with a probability of what they want to model. Let's say he picks a gear, or maybe you want to have like an arrangement of gears or any specific big component or let's say you take a wheel.

Starting point is 00:10:50 Okay, for example, what would your transmission or something? Exactly for transmission or they pick a wheel and it could be a Tesla or it could be any other car and then it would give you

Starting point is 00:10:58 a probability. Okay, this wheel is maybe from a Tesla and then it would suggest you Tesla with a, for example, 89% probability and then you would click you Tesla with a, for example, 89% probability. And then you would click on the web application. This is the idea.

Starting point is 00:11:08 And then it would pop the geometry into the web browser in the front end. Oh, that's pretty cool. So it basically, it's like image recognition, but instead of for pictures, it's image recognition for 3D CAD outlines. Exactly. It's so cool that you mentioned it

Starting point is 00:11:23 because there's a big difference between doing convolutional neural networks or deep learning for images, because images are 2D. It's like a 2D matrix. But if you have a point cloud, then you have a tensor of higher dimensionality. And then you are kind of forced to use, for example, NumPy and all these kinds of things. And if you're lucky, you could use something like PyTorch, PyTorch3D, which you can also use CUDA on to be way more efficient. Yeah. Yeah.

Starting point is 00:11:47 Wow. That's really cool. So it looks like a neat thing. This is, you know, I haven't done any 3D work for a while, but yeah, it looks pretty cool. I would love to see, I don't know, some pictures and stuff. Wouldn't that be neat? But yeah. They have a very good, like if someone is interested in seeing what PyTorch3D can do, Facebook has its own YouTube channel and they pitched PyTorch3D on that channel and they really do a nice

Starting point is 00:12:09 they show you what you can do with it, so it's really interesting. Yeah. Oh, awesome. Well, I guess I'd never really thought about applying, you know, AI/ML stuff to 3D meshes, but it makes perfect sense, and I can see it's totally different than images. Yeah. Yeah. Very cool. Brian, you don't do any CAD stuff with your devices, do you? Well, I mean, yeah, some people do. Not me, though. There's a lot of CAD that goes on in the ASIC design and stuff. Yeah, I can imagine. Yeah.

Starting point is 00:12:35 Cool. All right. Now, before we get to the next one, I want to get something sorted out, Brian. Okay. I want to talk about Datadog. So they're back to support the show. Thank you, Datadog. Yay!

Starting point is 00:12:52 And so they're really about helping you troubleshoot latency, CPU, memory bottlenecks in your apps. And if you don't know where it's coming from, Datadog will seamlessly correlate the logs and the traces at the level of individual requests, across systems, allowing you to quickly troubleshoot your Python app. And they have a continuous profiler that allows you to find the most resource-consuming parts of your app in production, just running all the time at any scale. And it has very little overhead. So that's pretty cool. Instead of trying to debug it and then deploy it and hope that kind of translates to production, just turn it on and watch. So yeah, that's cool. So be the hero that got that app back on track at your company, get started with a free trial and support the podcast at pythonbytes.fm slash datadog, or just click the link in your podcast player show notes. Now that that's sorted out, Brian. Yeah. Yeah. So sorting, sorting's a thing, and the default Python containers are not sorted. And there's reasons behind that.

Starting point is 00:13:47 But sometimes you need to sort stuff. So there's a Python library called, or a package called sorted containers. I like it. It's a very, I mean, I like the name at least. It's a very easy to remember sort of thing. But this is amazing. I looked into this. So this was recommended by Fanxin Bao recently for us to take a look at.

Starting point is 00:14:09 And it's a pure Python-based sorted collections library. And it's as fast as other packages that are built using C extensions. Wow. That's the impressive part. It's also fairly memory safe. But the documentation is pretty cool. There's a whole bunch of different benchmarks, so you can take a look at how it deals with large things. But it's really pretty zippy. It was pretty cool. Right on the front page there's an example,

Starting point is 00:14:43 and we're going to throw this in the show notes too. It handles a handful of different data types. It shows sorted lists, sorted dictionaries, and sorted sets. There's also a sorted key list, and it takes a function to use to create a key for sorting. Right, because the things in there might not have a natural sort, right? Like if you put a bunch of order objects in there, well, how do you sort those? Do you sort them by price? Do you sort them by date? Right.

Starting point is 00:15:24 So you select out that element, yeah? Yeah, you have to select it out. Or you can do something like they might be sortable by default, but you want it to be like a reverse sort or something like that. Right. And there's some caveats listed so that you have to make sure that the key that you pass in follows some conventions like two identical items should have the same key. Stuff like that.
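Editor's note: to make the key-function idea concrete, here is a minimal standard-library sketch of a list that stays sorted by a key. It is not the sortedcontainers API or its implementation; `KeySortedList` and `Order` are hypothetical names, and the plain `list.insert` used here is simpler (and slower on large inputs) than what a dedicated library would do.

```python
import bisect
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    order_id: str
    price: float


class KeySortedList:
    """A list that keeps its items ordered by key(item) at all times."""

    def __init__(self, key):
        self._key = key
        self._keys = []   # sorted keys, kept parallel to _items
        self._items = []

    def add(self, item):
        k = self._key(item)
        # Binary search for the insertion point; equal keys go after existing ones.
        index = bisect.bisect_right(self._keys, k)
        self._keys.insert(index, k)
        self._items.insert(index, item)

    def __getitem__(self, index):
        return self._items[index]

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        return iter(self._items)


# Sort orders by price; the same item always produces the same key.
by_price = KeySortedList(key=lambda order: order.price)
for order in [Order("a", 30.0), Order("b", 10.0), Order("c", 20.0)]:
    by_price.add(order)

print([order.order_id for order in by_price])
print(by_price[-1])  # the most expensive order is always last
```

A reverse sort, as mentioned in the conversation, would just use a key such as `lambda order: -order.price`.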

Starting point is 00:15:48 It's all reasonable things. But it's a fairly easy and complete package to just use. It acts just sort of like the normal thing. Containers like lists and dictionaries and sets. It just remains sorted all the time. And this is pretty incredible. Yeah, I can totally see bugs get into your code because you're like, well, we put stuff into this list

Starting point is 00:16:12 and oh, I want the latest one. So it's the last one, but maybe, you know, you forgot to sort it before you did that or the first one's the last because you reversed it or whatever. So one of the things that confused me when I first looked at this, I was scratching my head for a second because it looks like a fairly simple set of uh like examples with

Starting point is 00:16:29 just like a small set of elements in it. So like the first one is a list of like e, a, c, d, b, just, you know, a few characters. And, you know, it's a whole bunch of these examples with just small amounts. And it says underneath this, all of the demo listed above takes a gigabyte of memory. And I'm like, what the heck? Why is it taking so much memory? It's only five things. Come on. I mean, like, why?

Starting point is 00:16:58 It's cheap. Don't worry about it. Hidden in there, there's an example where that five-character sorted list gets multiplied by 10 million. So it's like 50 million characters in a list that got sorted. Right.

Starting point is 00:17:17 Yeah. And then all the operations, like count. So you can say, count all the c's in there. It'll tell you how many there are. And a lot of these operations like count and stuff with a sorted set take less than linear time.
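Editor's note: a rough sketch of why counting can be faster than linear once the data is already sorted. It uses the standard-library `bisect` module rather than sortedcontainers, and a much smaller list than the 50 million items discussed; `count_sorted` is a hypothetical helper name.

```python
import bisect


def count_sorted(sorted_items, value):
    """Count occurrences of value in an already sorted list with two binary searches."""
    left = bisect.bisect_left(sorted_items, value)    # first position of value
    right = bisect.bisect_right(sorted_items, value)  # one past the last position
    return right - left


# The five letters from the demo, repeated 1,000 times and sorted once up front.
letters = sorted(["e", "a", "c", "d", "b"] * 1000)

print(count_sorted(letters, "c"))
print(letters[-3:])
```

Because equal items sit next to each other in sorted data, the count is just the distance between two positions, and each position is found in logarithmic time instead of scanning every element.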

Starting point is 00:17:33 So yeah, so there's times you need sort. And this is a cool one to check out. Yeah, it's cool. It's nice that it's pure Python. Super easy to install, right? It's not going to have any like weirdness around that, like if you, say, got an M1 computer and the thing won't compile or whatever. No, this looks really cool. Jousef, what do you think? Yeah, this looks amazing. I'm also in touch with my brother

Starting point is 00:17:56 on the side, and he's also watching our podcast at the moment, and he's also saying, because of the one gigabyte of memory for sorting, it's incredible. It's crazy. Yeah, that's pretty awesome. I guess it's just showing like you can have a ton and it's all nice. So it's, I mean, it seems really straightforward, but having these things sorted. We just got dictionaries that would stay put. So having sorted dictionaries is also cool. Yeah.

Starting point is 00:18:22 Right. It used to be that they sort of, if they had the same keys and stuff, they wouldn't necessarily retain their order of the things you added, but now they do. Right. So if people are confused and think, well, aren't dictionaries already sorted? No, they just stay in the order that they were created. Exactly. Yeah. Similar, but not exactly the same thing. Yeah.
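Editor's note: a short illustration of the difference between "keeps insertion order" and "stays sorted" for the built-in `dict`. The `inventory` data is made up for the example.

```python
# A plain dict remembers the order in which keys were added (Python 3.7+).
inventory = {}
inventory["pear"] = 3
inventory["apple"] = 7
inventory["mango"] = 1

print(list(inventory))  # insertion order, not alphabetical

# You can build a sorted copy, but it is only sorted at that moment.
sorted_copy = dict(sorted(inventory.items()))
print(list(sorted_copy))

# A later insert simply goes to the end, so the copy is no longer sorted.
sorted_copy["banana"] = 5
print(list(sorted_copy))
```

That last step is the gap a sorted dictionary type is meant to fill: it would keep "banana" between "apple" and "mango" without re-sorting by hand.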

Starting point is 00:18:43 All right. So I want to, this next one, I want to riff a little bit on typing and I want to do that around a tweet, which I think I've got to put into a different type of, hold on. For some reason, Twitter has stopped showing me like the entire conversation of things. I don't know why, but I guess it doesn't really matter.

Starting point is 00:19:02 So Łukasz Langa responded to a tweet that went out there. You know, Łukasz is obviously a core developer. He's been doing really important stuff. But one of the main focuses that he's been working on is around type hints and typing with things like mypy. He was instrumental in bringing typing to Facebook and the Instagram codebases and things like that. So there's a tweet that says, Controversial take. Types in Python codebase are a net negative.

Starting point is 00:19:30 That's not Łukasz. He's about to have a whole long conversation about this that I'm going to talk about. But Brian, what do you think? You retweeted this. Them's fighting words. Them's fighting words. Yeah.

Starting point is 00:19:43 So what do you think? I think that they're good. Yeah, I do too. I think when I first saw them, I was a little concerned, like, oh my goodness, this is going to potentially, you know, turn Python into something like TypeScript. And while I appreciate what TypeScript does to make JavaScript much better, I almost always walk away from working with TypeScript with a feeling of like, ah, that kind of hurt and was painful. I wonder why it had to go that way, you know? Because the TypeScript requires, it's like C Sharp or C++. The types have to match and they have to be there.

Starting point is 00:20:18 And if they don't match at all, then it just won't work, right? It's super frustrating. Oh, this thing is not defined. And, you know, because there's libraries that might not have types. And then how do you work with them? And I find there's always some little edge case. It's like, oh, this is frustrating.

Starting point is 00:20:32 But I never feel that way with Python. And I really have come to love Python's type hints. And obviously Łukasz starts out his conversation saying, this is easily disproven. If you ever use PyCharm or VS Code, the code completion in there is based on type annotations. If you've ever seen your editor highlight a function and squiggly say this expects something else than what you're giving it, you know, besides the number of variables, but like you're giving it a string and it wants a number or something

Starting point is 00:20:59 like that, you're using type annotations and you can enhance your code by doing that, right? So I was actually talking to Jousef about this yesterday. My philosophy, or maybe my rule of thumb is, you don't have to always do it this way, but if you're working in your editor and you have to type more than say three characters to get some kind of symbol to come up, you're probably doing it wrong. So like if you have email service and you want to have email service, send account email, you should be able to say dot S A E, right? Send account email. And it should know the type that's been returned, what an email service is, that it has this property and just write it for you. Right? So to me, a lot of the typing stuff, I know, I mean, this comment is somewhat about bugs. Like I never found a good bug because of this. To me, that's almost like a side benefit. It's about quickly generating

Starting point is 00:21:50 code without stopping to go look at the code definition, without going over to the documentation to see what I could have typed over here. You know, it's, for example, AWS people, this is insanely frustrating to work with AWS because you get these like weird, create this service and you give it a name and then you get an S3 service back. But it has no idea that it's an S3 service. So you get zero help on what anything, even I think go to definition doesn't quite work because it's, you know, use some factory method to reach down some weird place and get the thing. So I think really driving the code generation experience without being in documentation, without jumping around and reading all the source, just go forward. I think it's super nice.
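Editor's note: a small sketch of the "email service" idea, assuming a made-up `EmailService` class and factory functions. It contrasts an untyped factory, where an editor has nothing to go on, with a thin typed wrapper whose return annotation is what completion and type checkers can use. None of these names come from a real library.

```python
from typing import get_type_hints


class EmailService:
    def send_account_email(self, address: str) -> bool:
        """Pretend to send an email; report whether the address looks usable."""
        return "@" in address


def create_service(name):
    """Untyped factory: tools cannot tell what comes back from this call."""
    registry = {"email": EmailService}
    return registry[name]()


def get_email_service() -> EmailService:
    """Typed wrapper: the annotation tells tools the result is an EmailService."""
    return create_service("email")


service = get_email_service()
print(service.send_account_email("someone@example.com"))

# The annotation is ordinary runtime data that tools can inspect.
print(get_type_hints(get_email_service))
print(get_type_hints(create_service))
```

The wrapper is one way to put the annotation at the boundary, so the rest of the code gets completion on `service.` without annotating every line.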

Starting point is 00:22:31 So to me, that is the biggest win of all of this stuff. So let me give you- The entire thread is very interesting. So let me touch on a couple of the points of the thread because I can't get it to come up in the screen share, but that's fine. I took notes luckily. So some of the things he pointed out, he's like, here's tweet one of 10. So number one, put enough annotations and then the tooling will connect the dots and make plenty

Starting point is 00:22:55 of errors evident as well as, like, heighten this code generation automagic, right? That's one. The most common type of error, though, that'll creep in is if None is being used where you expect a concrete type, and things like mypy will say, you're using a type that is an Optional of something, but you're not checking to see if it's None before you dereference it. You're probably going to end up at some point with an attribute error: NoneType does not contain attribute, whatever you tried to do, you know, upper or whatever, right? None is not subscriptable. Yes.
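Editor's note: a minimal sketch of the `Optional` situation described here. The function names and the nickname data are hypothetical; the point is that a checker such as mypy is expected to complain about the unchecked version, while the checked version handles `None` explicitly.

```python
from typing import Optional


def find_nickname(user_id: int) -> Optional[str]:
    """Return the nickname for a user, or None when there is none on record."""
    nicknames = {1: "mike", 2: "brian"}
    return nicknames.get(user_id)


def shout_unchecked(user_id: int) -> str:
    # A type checker would be expected to flag this line: the value may be None.
    return find_nickname(user_id).upper()


def shout(user_id: int) -> str:
    nickname = find_nickname(user_id)
    if nickname is None:  # handle the None case before using the value
        return "UNKNOWN"
    return nickname.upper()


print(shout(1))
print(shout(99))

try:
    shout_unchecked(99)
except AttributeError as error:
    print(error)  # 'NoneType' object has no attribute 'upper'
```

Without the annotation, the bug only shows up when a missing user happens to come through at runtime.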

Starting point is 00:23:29 Yeah. Something like that, right? Or callable or any of the things. Also, another common bug is the return case. So if you've got a function, you can, you know, maybe check something and return one value, check something else, return another value. But if you forget at the end and you fall through and you don't put some kind of concrete return, Python functions just return None.
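Editor's note: a small sketch of the fall-through case. `shipping_label` and its thresholds are made up; the behavior shown (a function that reaches its end without a `return` yields `None`) is standard Python.

```python
def shipping_label(weight_kg: float) -> str:
    if weight_kg < 1:
        return "small"
    if weight_kg < 10:
        return "medium"
    # Forgot the final return: anything 10 kg or heavier falls through,
    # and the function implicitly returns None despite the "-> str" hint.


def shipping_label_fixed(weight_kg: float) -> str:
    if weight_kg < 1:
        return "small"
    if weight_kg < 10:
        return "medium"
    return "large"


print(shipping_label(0.5))
print(shipping_label(50))        # None
print(shipping_label_fixed(50))
```

With the `-> str` annotation in place, a type checker has what it needs to point out the missing return before the `None` reaches a caller.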

Starting point is 00:23:53 Like this actually blew me away when I learned Python and I learned about functions, that they always, always, always return something. There's no such thing as a void function in Python. Yeah. As a C++ person, that probably surprised you too, right? Like with your C++ background. I did a bunch of Perl and stuff like that, so. But the return type is actually one of the greatest documentation features as well, because sometimes you can kind of figure out what the parameters are going to look like, you know, you can guess. Yeah. But what's the return type? Is it going to be a

Starting point is 00:24:26 list? Is it going to be a tuple? Is it going to be a single element? What if there's more than one element? Having type hints around the return types is a great feature. Absolutely. Let me touch on a couple more. I see some listener comments in the stream as well.

Starting point is 00:24:41 Squiggly lines in your editor, anyone? I just got this the other day. I thought I was supposed to pass an ObjectId, the primary key in Mongo, but we'd overridden it and it's actually a string, and it said you're passing an ObjectId when it expects a string. I'm like, oh yeah, I guess I am. All right, well, change that, right? That's really nice instead of that being a runtime error. And he talks about the work with TypeScript and Anders Hejlsberg and what he did to help build that. And TypeScript, like I said, is pretty neat. But he also points out that, you know, the same company, Microsoft, is developing powerful type checking and code completion for Python with VS Code. And they're, you know, they have one of the Python steering

Starting point is 00:25:20 council folks working on there. And maybe that's Brett. And also possibly the Python creator himself, Guido. So do you think those two people would be working on something that just provides the illusion of productivity? Probably not. So let's see a couple of comments. Chris May. Hey, Chris. Happy to see you out there.

Starting point is 00:25:38 He says code completion is such a confidence builder too. I think it's so awesome because for me, it's both amazing for beginners because they can type dot and go now what? And for experts, they can just blast out code so quickly because you just type dot a few things and you know, like you said, with confidence, you just keep going. Um, but a lot of these, oh, sorry. A lot of these features, um, you get them. If everybody around you writing the code that you're using is using type hints um you don't necessarily need to use type hints yourself but then you're being a bad citizen and not helping the people out that you're sharing code with so if you don't share code at all and you're only working on projects with yourself then you know go ahead don't use type yeah it's up to you right

Starting point is 00:26:20 Yeah, absolutely. Jousef, what do you think about this? Do you guys use type hints on your project? No, not really. Like it's not something that's in our conscious mind, I would say. I'm not sure if that's also something really because you're an engineer. I wouldn't want to generalize, but engineers are usually bothered with the problem itself rather than digging down on types, for example. It depends. It depends on what language we use. Yeah. It's a bit of a computer science-y topic, I can see. But yeah, like I said, I love how it generates the content so much easier. Maxson also commented, I love, for example, Pydantic,

Starting point is 00:26:56 but I agree with Ramalho, Luciano Ramalho, who was in this thread. Hopefully it won't be required in Python to help people get started. Yeah, so I think the typing stuff is really interesting. Like Pydantic, we've talked about a bunch. It's a super interesting example of really using typing to generate cool data ingestion and processing. Like if you say I've got a Pydantic model and one of its fields is a list of integers, but you give it a list and the things in the list happen to be strings that could be

Starting point is 00:27:22 integers, it'll automatically convert it and And stuff like that is really fantastic. Yeah, I think that's always going to be an add-on type of thing. Yeah, even though I'm a fan of type hints, I don't use them all the time. And I would be very opposed to having them be required. Yes, I would too. I would too. I don't think they need to be on the whole code base. I mean, it depends if your goal is to say,

Starting point is 00:27:43 I want to use them for mypy or mypyc and completely generate stuff. But if your goal really is to get a little bit of help with editors, just having it on the boundaries. Like, here's the data access layer. The things that come out of there return whatever, and you don't have to do anything else, and the editors will pick it up and run.

Starting point is 00:28:00 Yeah. Yep. Yep. Yep. All right, one quick question. What does a function return if there's no return? It returns None. So that's why, whether or not there's a return statement, it always returns something. All right, next up, I guess, we got the one I tried to open with there, Jousef, which is Open3D. That looks fun. Yeah, this is basically a library which you could use in Jupyter, which I tried to use, but somehow they at the moment have problems using Open3D. So what you can do is you call Open3D in your Jupyter notebook

Starting point is 00:28:29 and then have the point cloud visualized. However, there are some ways around it, but Open3D, I think if I would start all over again, I would probably use Open3D to visualize my point cloud, which I'm actually working with in my Jupyter notebook. I'm not sure if using a Jupyter notebook is also something you would recommend personally, maybe Brian and Michael,

Starting point is 00:28:46 if you're a fan of Jupyter Notebooks. I think it depends on the application, right? Yeah, I think it depends as well. And to me, it really depends on what I'm trying to do and like the kind of code. Am I trying to explore data and does it have a really strong visualization component or is it like a utility type thing?

Starting point is 00:29:04 So for example, one of the things that I wrote recently that I would never put into a Jupyter notebook, but I find really helpful is we've got literally thousands of video files, MP4s and whatnot for online courses. And in order to import them, one of the things I have to tell the database is how long in seconds is each file and where does it live and stuff. So I've got a little script and I just say, go to this directory and generate a little JSON output for all of the files and parse them and tell me how long they are.
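A command line utility of the kind described here might look roughly like the sketch below. It is an assumption-laden illustration, not the actual script: it assumes the `ffprobe` tool from FFmpeg is installed and on the path, and the function names `probe_duration` and `describe_videos` are hypothetical.

```python
import json
import subprocess
from pathlib import Path


def probe_duration(path):
    """Return a file's duration in seconds by asking ffprobe (assumed installed)."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-show_entries", "format=duration",
         "-of", "default=noprint_wrappers=1:nokey=1", str(path)],
        capture_output=True, text=True, check=True,
    )
    return float(result.stdout.strip())


def describe_videos(directory, probe=probe_duration):
    """Build a JSON string listing every .mp4 under `directory` with its length."""
    entries = []
    for path in sorted(Path(directory).rglob("*.mp4")):
        entries.append({"file": str(path), "seconds": probe(path)})
    return json.dumps(entries, indent=2)


if __name__ == "__main__":
    import sys
    # Usage (hypothetical): python durations.py /path/to/videos
    print(describe_videos(sys.argv[1] if len(sys.argv) > 1 else "."))
```

The `probe` parameter is there so the duration lookup can be swapped for another tool or for a fake in tests.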

Starting point is 00:29:36 Like that kind of app doesn't belong there, right? It's just that's a command line type of utility type of thing. But if I want to visualize something like this, I think it may well be really good for it, actually. So I think it varies. Yeah, there's a lot of application parts of my work that I think using a Jupyter notebook actually might be more beneficial. So I'm often taking big, huge trace datas

Starting point is 00:29:59 and stuff for like spectrum traces. And those could easily be driven from a Jupyter notebook, and the visualization stuff would be good. Yeah, cool. So this thing is a set of both C++ and Python libraries for basically working with

Starting point is 00:30:18 3D meshes, right? Mostly 3D data. For example, if you use a LiDAR, so when you work with a laser and... Right. This looks, for example, great. I never watched the video, but if you scan objects in your surroundings,

Starting point is 00:30:30 usually what you get is a point cloud, which you can then visualize using Open3D. And the big disadvantage with point clouds is that they're kind of unstructured. So you could have one matrix representing one point cloud, and you could have the same matrix with two points switched. It's still the same point cloud, but the matrix would be different. This is also a problem that a lot of papers try to tackle

Starting point is 00:30:50 and make sure that, you know, get around the bottleneck. Nice. The example video here is using Open3D for 3D object detection, which is pretty wild. Yeah. Nice. The things people do these days. I know.
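The point about point clouds being unstructured can be shown with a small NumPy sketch. The coordinates are made up, and the `canonical` helper is a hypothetical illustration of one simple workaround (sorting rows before comparing), not the approach taken in the papers mentioned.

```python
import numpy as np

# A tiny point cloud: each row is one (x, y, z) point. Values are made up.
cloud = np.array([[0.0, 0.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 2.0, 0.0]])

# Swap the first two points: same set of points, different row order.
swapped = cloud[[1, 0, 2]]

# Compared as matrices, the two arrays differ.
same_matrix = np.array_equal(cloud, swapped)


def canonical(points):
    """Sort rows lexicographically (by x, then y, then z) so order stops mattering."""
    order = np.lexsort(points.T[::-1])  # lexsort treats the last key as primary
    return points[order]


# After sorting both into the same canonical order, they compare equal.
same_points = np.array_equal(canonical(cloud), canonical(swapped))

print(same_matrix, same_points)
```

Sorting only helps for exact comparisons like this one; it does not address noise or differing point counts between scans.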

Starting point is 00:31:05 I think it's really interesting, all this image processing and analysis stuff. Good question, Brian, by the way. This is what I ask myself when I listen to Python Bytes. As an engineer, what are you guys doing? It's great. Absolutely. Cool. All right.

Starting point is 00:31:20 Well, that's it for all of our main items. Brian, you got anything extra that you want to throw out there? Um, yeah, I just wanted a couple things. One, 2021 has been exhausting so far. Yeah, I don't know if anybody else has got the same experience, but wow. And I also, I've got a lot of extra projects, side projects that I'm working on right now. Python Bytes is one of them, but there's other stuff going on as well, trying to do more writing. And because of that, Test & Code has shifted to an every-other-week cadence. So it's not going away. I know a lot of, oddly enough, I've had a lot of feedback in the last couple of months of people saying, thank you for the podcast. I've learned so much. So I do not,

Starting point is 00:32:02 I don't want to shut it down. I want to keep it going and there's no plans on shutting it down. It's just slowing down so that I have room in my life for other, other projects as well. So just wanted to let people know that. Yeah. Well, yeah, I try to, for Talk Python, batch it up and do a whole bunch. Just to say this week, I'm just going to get nothing done, but I'll do a ton of recording and then just roll them out. I had three months of stuff done in like a week and a half. I was, I really needed a break after that, but then I was good. Cool. Yeah, cool.

Starting point is 00:32:28 Well, thanks for the update. Yousef, anything you want to share on the way out, end of the show? I just want to say thank you for letting me or being able to participate in this quick and brief podcast. Keep doing what you do, guys. I follow you both on Twitter, and what you share and what you do is really amazing. So it's really inspiring

Starting point is 00:32:46 for an engineer who wants to delve into the field of Python and all fancy kind of things to listen to your podcast, taking your courses or following you on social media. It's really great. You learn a lot.

Starting point is 00:32:55 And I actually have to learn more, to be honest with myself. You know, I have to learn more. Yeah. It never stops. It never stops. Yeah, that's right. Yeah, but that's awesome. Thank you so much. I really appreciate that.

Starting point is 00:33:06 How about you, Michael? I have one quick thing. Driven by yours, a comment you had last time. So Francisco Silva pointed out, we had talked about some of the NumPy-ic, Pythonic, the idiomatic NumPy stuff that you might do and how instead of looping over stuff, you can just

Starting point is 00:33:25 like add, say, two NumPy arrays and it'll add them, or you can, you know, dot product them and whatnot. Right. So one of the things you can do, I guess we also talked about like ones and zeros to generate a pre-built list of those. So one of the things he talked about is the allclose method. So if you've got floating point numbers, one of the things that's really frustrating is like, are these equal? Well, what does it mean for floating point numbers to be equal, right? Like they could be so nearly the same, but not the same, right? They could be within an insane amount of closeness, right? Like 10 decimal places and then a one, right? So allclose is like, well, if they're within, you know, one one-thousandth of each other, consider them the same. Well, allclose takes a bunch of parameters, so you can specify the tolerance, though.
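A short sketch of the comparison described above, using NumPy's `np.allclose`. Its documented defaults are `rtol=1e-05` and `atol=1e-08`, which is tighter than the "one thousandth" mentioned in passing; the arrays and the explicit tolerances below are made up for illustration.

```python
import numpy as np

a = np.array([0.1 + 0.2, 1.0])
b = np.array([0.3, 1.0])

# Exact comparison fails because 0.1 + 0.2 is not exactly 0.3 in floating point.
exact = np.array_equal(a, b)

# allclose checks |a - b| <= atol + rtol * |b| element-wise, with default tolerances.
close = np.allclose(a, b)

# The tolerance can be set explicitly, here an absolute tolerance only.
loose = np.allclose([1.0, 2.0], [1.0005, 2.0], rtol=0.0, atol=1e-3)
strict = np.allclose([1.0, 2.0], [1.0005, 2.0], rtol=0.0, atol=1e-6)

print(exact, close, loose, strict)
```

Choosing between a relative and an absolute tolerance depends on the scale of the data, so the defaults are worth checking rather than assuming.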

Starting point is 00:34:09 Yeah. Anyway, I thought that was cool. Yeah. Hey, while we're on the topic, I may as well throw out I've got it. So I tried to use this method of using NumPy and I ran into a problem. So I'm hoping some data science people can help me figure out how to solve it. So my problem is just the simple thing. If I've got two arrays, I want to see if all of the elements are element-wise less than or equal to the other element in the other array. Okay, I can do that with NumPy. But that assumes that all of the elements are the same data type, like comparable. If there are strings thrown in there, it doesn't work. So obviously, I

Starting point is 00:34:47 don't know if it's obvious, but so I had to do some cleanup ahead of time, but I don't know what the best way is. So reach out to me if you've got an answer. Awesome. Yeah, I don't have an answer, but I'm sure people do. And quick comment here, this is the one on my show. Magnus Carlson: tip, I found out about Copier, an alternative to Cookiecutter that can be run later as well to update the project to a newer template. That's pretty cool. I hadn't

Starting point is 00:35:14 heard of that. And also TOML's spec has reached 1.0. Parser might be added to the standard lib. Also haven't covered that, but that's cool news. Thanks for sharing, you guys. And I guess thanks for being here. Yousef, thanks for joining us.
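Returning to the element-wise comparison question raised a moment earlier (checking that every element of one array is less than or equal to the matching element of another when strings may be mixed in), one possible cleanup step is sketched below. It is only one assumption about what "cleanup" could mean: the helper names are hypothetical, both inputs are assumed to have the same length, and positions that cannot be read as numbers are simply skipped.

```python
import numpy as np


def to_float_or_nan(values):
    """Convert each item to float; items that cannot be converted become NaN."""
    out = np.empty(len(values), dtype=float)
    for i, value in enumerate(values):
        try:
            out[i] = float(value)
        except (TypeError, ValueError):
            out[i] = np.nan
    return out


def all_less_equal(left, right):
    """True if left <= right at every position where both sides are numeric."""
    a = to_float_or_nan(left)
    b = to_float_or_nan(right)
    valid = ~(np.isnan(a) | np.isnan(b))  # ignore positions with bad data
    return bool(np.all(a[valid] <= b[valid]))


print(all_less_equal([1, 2, "x"], [1, 3, 5]))
print(all_less_equal([1, 4, "x"], [1, 3, 5]))
```

Note that `float("3")` succeeds, so numeric strings are treated as numbers here; whether to skip, reject, or convert such values is a design choice that depends on the data.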

Starting point is 00:35:28 And Brian, thank you as always, man. Thank you. Thank you so much, guys. Bye everyone. Bye.
