Python Bytes - #284 Spicy git for Engineers

Starting point is 00:00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds. This is episode 284, recorded May 17th, 2022. I'm Michael Kennedy. And I am Brian Okken. And I'm Daniel Mulkey. Daniel, great to have you here. Thank you. It's an honor. Yeah, it's an honor to have you.

Starting point is 00:00:17 Now, before we get into our first topic that Brian's going to tell us about, just give us a bit of your background. Sure. I am an optical engineer in Southern California, but I have a significant amount of my time spent using Python for data analysis, instrument control, and other things. So I've been doing it for the better part of the last five years, and I've had a back-and-forth relationship with MATLAB

Starting point is 00:00:40 and am finally married to Python, so to speak. Fantastic. You've finally been able to get out of your dysfunctional relationship with MATLAB. Yes, exactly. It sounds a little bit like you might live in a parallel universe to Brian. Yeah, it sounds like it. We should definitely get you on Test & Code and we can BS about that. Sure, yeah, I'd love to.

Starting point is 00:01:00 Brian, I would love to hear about our first topic you want to talk about. It sounds very distinct, you know? distinctipy. Yes, very distinct. So I ran across this. I can't remember how I ran across it. I guess it doesn't matter.

Starting point is 00:01:17 But one of the things I like, it's a Python package called distinctipy. And it's very simple. It's a lightweight Python package providing functions to generate colors that are visually distinct from one another. So I was thinking, you know, you've got a chart, maybe you're taking user data or something, and you don't know how many lines you're going to plot, but you're going to plot a whole bunch of lines. How do you pick the colors for the lines? So this is kind of a neat thing to just pick visually distinct colors. Pretty focused, but it's pretty cool. And all you do is give it the number of colors you want and it gives you back the colors. It has display capabilities too, though you have to install extra stuff to make that happen.

Starting point is 00:02:05 But you can display color swatches too with it. And I was looking at some of the different colors that are available. Like one of the ones was 15 different colors. I think it's 15 colors for normal vision versus some colorblindness. So if you have colorblind people, you can pick based on some of that stuff. There's a whole bunch of examples in the repo too that are kind of fun to look at. One of them was the normal versus colorblind one. Oh, was that it? No, that wasn't it. But there are some really cool examples with different colors. So if you just give it

Starting point is 00:02:42 a few, it just grabs a few, of course, but there's a whole bunch of neat ones, clusters and things. So anyway, cool little library. It's great. Yeah, I noticed when I was looking through it, they have a function for generating a color palette. And so you can generate a colorblind-friendly palette. So hypothetically, that works well for viewers with colorblindness, and also if it's in print and you're doing black and white.

Starting point is 00:03:06 So that was the most interesting thing to me. Oh, black and white? That's interesting. Well, at least I think if you take a colorblind palette and you make it black and white, typically it's still a decent contrast. Oh, yeah. You have to worry about printing things out. Oh, that's cool. Yeah, that's great. And one of its functions is to take the color map that it generates and turn that into a Matplotlib colormap.
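
To make the idea of "visually distinct" colors concrete, here is a minimal standard-library sketch. It is not distinctipy's implementation (the package's README shows a call of the form `distinctipy.get_colors(n)`); the function names, the candidate count, and the use of plain RGB distance below are all assumptions made for illustration. Each new color is chosen as the random candidate that is farthest from everything already picked.

```python
import random


def color_distance(a, b):
    """Euclidean distance between two (r, g, b) tuples with components in [0, 1]."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def pick_distinct_colors(n_colors, n_candidates=500, seed=0):
    """Greedy sketch: each new color is the random candidate whose nearest
    already-chosen color is as far away as possible.

    Hypothetical helper for illustration only; plain RGB distance is a crude
    stand-in for a perceptual color metric.
    """
    rng = random.Random(seed)
    # Assumption: steer away from black and white, which are common backgrounds.
    avoid = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
    picked = []
    for _ in range(n_colors):
        candidates = [
            (rng.random(), rng.random(), rng.random())
            for _ in range(n_candidates)
        ]
        best = max(
            candidates,
            key=lambda c: min(color_distance(c, other) for other in avoid + picked),
        )
        picked.append(best)
    return picked


colors = pick_distinct_colors(5)
```

Each returned tuple can be passed directly as a `color=` argument to most plotting libraries that accept RGB floats.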

Starting point is 00:03:25 Oh, yeah, yeah, yeah. Which is cool. Oh, that's what I was looking for. Yeah. Oh, wow. Yeah, go ahead, Daniel.

Starting point is 00:03:35 No, just kidding. And there's somebody in the audience who just found out they're colorblind. They're like, is there a difference? What is this? Yeah, one of my kids found out, like, in high school that they were colorblind. So, interesting. Yeah, how would you know? Yeah, for a long time you're just like, people tell me that's a color, I guess I'm not great at picking out that color.

Starting point is 00:03:57 An art teacher said, I really love how you use both blues and greens in the sky. And she was like, I intended to just use blue, but thanks. I have a friend who went to art school and that was essentially his story, that he always had really vivid color choices because he didn't see the same as everybody else. It was great. It was awesome. That's pretty cool. Yeah, cool. All right, Brian, we ready for the next one? Definitely. Okay. So let's talk about SQL Soda or Soda SQL. So this is an open source CLI tool that if you're doing like ETL, like extract, transform, load type of stuff,

Starting point is 00:04:38 doing other sorts of analysis or exploration of SQL data, it allows you to connect to your data source, like your database, and then define tests for what invalid data looks like, right? Does this have to be a number? Does it just have to be not null? You know, what is it? So for an example here, they're talking about, here's the YAML file for a data warehouse reporting type thing for Postgres. So you just set up your connection and your host and all that kind of stuff, and then off it goes. So pretty neat. And then you can scan your data set to run tests against your data. Isn't that cool? That's right, it's soda cool. It is soda cool.

Starting point is 00:05:36 Yeah, so you just say soda scan and you give it the YAML file for the connection information and then a YAML file for the types of things you want to test. So they've got this example of how you're talking to one of the data warehouses and it's going and pulling in these config files. And basically this example, it's testing 54 different conditions. Three tests were executed. Everything's good to go. So, you know, if you're getting kind of data dropped on you or you're scanning, you know, scraping data from other places on some kind of background job and you want to bring it in, you know, if it's all automated, how do you know when it goes wrong? Right.

Starting point is 00:06:03 So here's a nice, simple way to express that. Yeah, that's neat. Yeah, and Brandon out in the audience says, I think we're looking at Great Expectations for this same thing. And yeah, I guess my first impression is this is a less-code way of doing what Great Expectations does, right? So you can just put together some YAML files that define what you want to test for, right? So for example, in this YAML file, I can say the metrics are row count, missing count, and missing percentage. And then I can test that the row count is greater than zero, right? And then another one is for the column, for the ID, that it's a UUID, and I'm allowing 0% of the UUID format to be invalid, right? You know, that's got like a

Starting point is 00:06:46 certain structure to it, right? It's either a straight UUID or a string that can be parsed into one, I'm guessing, something like that. So pretty cool. I think that's probably the biggest difference. So if you just want to define, kind of declaratively, here are the conditions I want it to test, and then you want to just set it up to continuously scan it, it looks good. The invalid percentage looks interesting because it's an interesting addition of like, you know,

Starting point is 00:07:12 there can be some bad rows, but we don't want more than like 20% bad rows or something like that. Right. Right. Maybe you can't have zero errors, right? Like you just, sometimes the data is just not there. But if it's 100% not there, then something's gotten terribly wrong or the data formats change and it's not called that anymore or whatever.

Starting point is 00:07:32 JSON, who knows? Daniel, what do you think? My data is always in CSV files. So I guess there are pros and cons to never having touched SQL, as I've heard from someone. Much, much easier to version control. Just put the CSV in version control.
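
The declarative checks described above (row count greater than zero, a cap on the percentage of invalid UUIDs) can be sketched in plain Python over CSV rows. This is only an illustration of the idea, not Soda SQL's API or file format; the function names, the sample data, and the shape of the returned dictionary are assumptions.

```python
import csv
import io
import uuid

# Made-up sample data: one valid UUID, one malformed, one missing, one valid.
SAMPLE_CSV = (
    "id,name\n"
    "123e4567-e89b-12d3-a456-426614174000,alpha\n"
    "not-a-uuid,beta\n"
    ",gamma\n"
    "00000000-0000-0000-0000-000000000000,delta\n"
)


def is_valid_uuid(text):
    """True if the standard library can parse the text as a UUID."""
    try:
        uuid.UUID(text)
    except ValueError:
        return False
    return True


def scan(rows, column, max_invalid_percentage=0.0):
    """Compute a few metrics for one column and evaluate two tests on them."""
    row_count = len(rows)
    missing_count = sum(1 for row in rows if not row.get(column))
    invalid_count = sum(
        1 for row in rows if row.get(column) and not is_valid_uuid(row[column])
    )
    invalid_percentage = 100.0 * invalid_count / row_count if row_count else 0.0
    return {
        "row_count": row_count,
        "missing_count": missing_count,
        "invalid_percentage": invalid_percentage,
        "passed": row_count > 0 and invalid_percentage <= max_invalid_percentage,
    }


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
strict = scan(rows, "id")                                # allow 0% invalid
lenient = scan(rows, "id", max_invalid_percentage=30.0)  # tolerate some bad rows
```

With the sample above, the strict scan fails and the lenient one passes, which mirrors the point about tolerating some bad rows but not too many.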

Starting point is 00:07:48 Yeah, anyway, I think this one's pretty neat. People can check it out if they're doing relational data stuff, especially if you're doing a lot of on-demand, not like you ask for it, but it's just on-demand processing. You're given a database and you want to check it out to see

Starting point is 00:08:04 how it's doing. I won't go on anymore on that because I've got a ton of other extras, so I'll kick it over to you, Daniel. Cool. So let's see. There was a review article back in 2020 published in the research journal Nature. For anyone not in the research articles world, Nature is one of the top-level ones. For reference, in grad school, we had some fancy work we did with quantum entanglement, and we got rejected by a subjournal of Nature. So to get anything into Nature is highly non-trivial. It's like the JAMA, the Journal of the American Medical Association, of science, basically. It's absolutely one of the top ones.

Starting point is 00:08:46 Yeah. I will say it's a review article. So it's typically easier to get a review article in than to say, hey, this is bleeding edge research that's going to change the world. But still, the big news is two things. One, that there is an article by Travis Oliphant and others on array programming with NumPy in Nature. It's a big enough deal that they chose to publish this.

Starting point is 00:09:06 And it got through. And it's, I think, very significant that that software was something that was good enough to publish. The other, and they go through and they talk about kind of the fundamentals of it all. There's one diagram I really like that sort of shows how the whole ecosystem stacks up. You've got NumPy as the base. That's a cool visualization.

Starting point is 00:09:27 Yeah, and then you've got SciPy and Matplotlib and the other plotting libraries. So there's the foundation. Yeah. I was just going to say, for people who are listening, it's like the tree of life for scientific libraries. Sorry, go on, Daniel. That's absolutely right. So from that foundation as far as algorithms and plots,

Starting point is 00:09:43 you go up to a specific method you're using. Are you doing image processing? Are you doing machine learning or something else? And off to domain specifics like Astropy. And I think you've had those guys on Talk Python, so you've gotten to talk to them. And then down to very application specific. So NumPy serving almost everybody who does anything numerical, down to QuTiP, which is used by people working on quantum computers. Very large breadth being discussed here. QuTiP, that's so cute. I like it.

Starting point is 00:10:14 It's notable that Python got into Nature. If you go search for Python, there are a lot of other articles. It's also interesting to see that they're willing to publish software. You guys have talked in the past about how you can't always publish a software package in any research journal.

Starting point is 00:10:29 So how do you get credit for that if you're in academia? But this is an interesting take to see that Nature chose to publish it. Yeah, this is super interesting. I think it's very valuable to just raise awareness, right? This is the water that we swim in,

Starting point is 00:10:42 but not everyone is immersed in the Python data science tooling, right? Yeah. There's a lot of authors on here. Yeah. I was trying to understand. I'm guessing those are the maintainers of the packages that were included, but I mean, you don't have 20 people write one paper, so I don't know how. It's kind of like the LIGO papers, the gravitational wave interferometer ones, with this crazy list. It's like the first page of the article is almost all authors just because there's so many people that worked on this for so long.

Starting point is 00:11:11 So I'm guessing that's what the story is. And you can access it. Some articles, some journals, you can't actually read unless you have a subscription, but this one's available. Indeed. Yeah. A very cool pick. Before we move on, maybe you know, Daniel: Alvaro in the audience asks, have any of you come across a way to validate pandas

Starting point is 00:11:33 DataFrames against a schema, much like Soda SQL? That's out of my scope. I feel like we have, but I don't remember. Yeah, I don't remember either. Sorry. Maybe something we should seek out for the next one. And I think we might get some answers in the audience. So we'll let them inform us as we move on. So Brian, what's next? Well, this isn't Python specific, but I think a lot of Python people are using GitHub Actions.

Starting point is 00:12:05 So GitHub announced, I guess recently, supercharging GitHub Actions with job summaries. It's an article that we'll link to. And basically, it's pretty cool. I can't wait to try this. I'm using GitHub Actions. And the gist is you can now have Markdown go directly into your GitHub job summary through this environment variable called GITHUB_STEP_SUMMARY. Whatever you write to it is rendered as Markdown. And I'm like, well, what can you do with this, though?

Starting point is 00:12:40 But Simon Willison was tweeting about it, and then Ned Batchelder said, hey, I'm using it too. So Ned has a little example on coverage.py that shows, what does it show? You get this nice total coverage percentage. If you want to put that in the summary for your repo, you can do that. Interesting that coverage.py is not 100% covered. I don't know why I find that funny. The irony, I love it.

Starting point is 00:13:19 And then, so Simon also has an example on Datasette. He's adding some extra stuff. What is he adding? Changed files? Oh, he's got a tool that looks for how many files have changed recently. And he actually just wrote a write-up for that. So we'll link to that as well. So GitHub Actions job summaries. And he shows how it works. You can pop out stuff.

Starting point is 00:13:51 And I love Markdown. Even little code fences and all sorts of stuff. That's very cool if you want to structure something real nice like that. Yeah, it even has... So supposedly it's got a whole bunch of stuff. It's got like... You can do tables even. So that's neat.

Starting point is 00:14:06 And emojis. Why not? Oh, yeah. Pretty cool. Put a little fire emoji in there. Yes, do it. Does anybody get images? Like, if you create an image during the action, can you reference it?

Starting point is 00:14:16 I don't know. It doesn't mention images, but... Maybe you could base64 encode it and embed it as a data URL. Oh, wow. It even does Mermaid, which is a way to do diagrams within it. That's pretty nice. Very nice. Like flow charts. Yeah, fantastic. This is a good one. I need to learn to do more with GitHub Actions. I don't do very much with them. I love them. I used to use Travis back in the day, and I think these are way easier.
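
As a rough sketch of the job summary mechanism discussed above: inside a GitHub Actions step, the `GITHUB_STEP_SUMMARY` environment variable holds the path of a file, and Markdown appended to that file shows up on the job's summary page. The helper names and the coverage numbers below are made up for illustration; this is not Ned Batchelder's or Simon Willison's actual script.

```python
import os


def coverage_summary_markdown(total_percent, per_file):
    """Build a small Markdown report with a headline figure and a table."""
    lines = [
        "## Coverage",
        "",
        f"**Total: {total_percent:.1f}%**",
        "",
        "| File | Covered |",
        "| --- | ---: |",
    ]
    for name, percent in sorted(per_file.items()):
        lines.append(f"| `{name}` | {percent:.1f}% |")
    return "\n".join(lines) + "\n"


def append_to_job_summary(markdown):
    """Append Markdown to the job summary file.

    Returns False when GITHUB_STEP_SUMMARY is not set, for example when the
    script runs outside GitHub Actions.
    """
    path = os.environ.get("GITHUB_STEP_SUMMARY")
    if path is None:
        return False
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(markdown)
    return True


# Hypothetical numbers, purely for illustration.
report = coverage_summary_markdown(93.5, {"app.py": 97.0, "cli.py": 90.0})
append_to_job_summary(report)
```

Appending rather than overwriting lets several commands in one step each contribute a section to the same summary.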

Starting point is 00:14:47 Daniel, do you do any of those sorts of things? Any CI automation type stuff? At a previous company, we used Azure DevOps and set up some stuff to build packages and build applications, but not at the moment. There don't happen to be any code bases I have that need that. Yeah, very cool. All right.

Starting point is 00:15:06 Well, I've got an interesting one here I want to dive into with you guys. So this one, let me give some attribution here. This one was sent over by Itamar from Meta. And then this is a write-up by Alex Waygood. And what it is, is it's basically the notes for all of us who were not there for the 2022 Python Language Summit. So that's pretty cool. There were around 30 core developers, triagers,

Starting point is 00:15:34 and special guests gathered the day before PyCon. And so they had a bunch of different talks and ideas they discussed. Quick summary: really, so much of this is about performance and parallelism right now. And then there's a lot of maintainability, back channels, back flows here. All right. So coming to these first, Sam Gross made a huge splash last year when he talked and he introduced the nogil work that they had done for, I thought, 3.8, I believe.

Starting point is 00:16:08 I can't remember, 3.8, 3.9. No, it was 3.9 for them. Cinder was 3.8. So for 3.9, and there's a lot of interesting optimizations and whatnot in that talk. So the idea is, could we live without a global interpreter lock? Larry Hastings tried the Gilectomy and sort of said, you know, it's too much of a penalty to try to live without it. But this nogil work that Sam Gross did actually had very small overhead in terms of what it added, but potentially removed some of the GIL things. So there's a lot of analysis of that. People were excited, but they,

Starting point is 00:16:47 how is it written? It says there was robust questioning. I guess one of the biggest parts that they discussed was maybe this should be a fork of CPython, there should be a nogil version of Python. But Sam is like, I really don't want to have just another separate version of Python. I really want this to just help everyone. So, pretty interesting. I think originally it was maybe going to be a runtime flag you could pass to Python, but it's looking like it more likely is going to turn out to be a compiler flag, so you'd have to have a nogil build even though it's from the same source code. So yeah, a bunch of interesting things,

Starting point is 00:17:29 concerns about how it's going to work with C libraries and so on. But all these are pretty interesting write-ups. So Eric Snow did a presentation on his per-interpreter GIL, which is interesting in how it approaches a slightly different problem than, say, Sam Gross. So Sam is trying to get it out of Python. Eric is saying, well, if we could just have a sub-interpreter, like a little mini in-process interpreter

Starting point is 00:17:58 that runs per thread, then they can all GIL to their heart's content. It doesn't matter because it's all single-threaded, right? But what's interesting is if you go look at this one in here, it says something like way back in 1997, this idea of multiple sub-interpreters was added by Guido, but nothing has really been done with it.

Starting point is 00:18:23 And when somebody tries to do stuff with it, there were thousands of global variables. And if you're going to have per-interpreter state, you have to somehow have those not shared, because then you're going to have the GIL back on them, you have that locking. So due partly to the deprecation of some of the old libraries and stuff, it's gotten a little simpler, but no, that was for the next write-up. But anyway, they've reduced this to about 1,200 remaining globals. So needless to say, it is not totally solved here, right?

Starting point is 00:18:57 So again, one of the possible worries of all this stuff is, well, how are the C extensions going to deal with this? They don't know about multiple sub-interpreters. Yeah. So anyway, that's another one of the main threads going on there. Let's see. Then this is probably the biggest deal. This is a faster CPython, 3.12 and beyond, by Mark Shannon and Guido van Rossum. So stepping back a release, Python 3.11, if you haven't heard, is fast. It's supposed to be 1.25 times faster than 3.10. How about that? Yikes. This blows me away. In one year they were able to make Python 1.25x faster, and it's been out for 30 years. It's not like, oh well, we released it last

Starting point is 00:19:41 year, now we've learned some things. It's really, really, really solidified in the way that it is. And then still there's a lot of work. And this apparently is just the beginning. This is like a five-year plan to add all sorts of optimizing JIT compilers and all sorts of things. How did they quantify that

Starting point is 00:20:00 or what subset of the language was that tested on? That's the tricky thing to say. Python is 25% faster. Doesn't matter what you do. Even if you're just waiting on a database, it's still 25% faster. Does it just overclock your computer in the background? It liquid cools it. I believe that number comes from the unit tests, like all the tests for CPython.

Starting point is 00:20:22 I'm not 100% sure, but I believe that was the conversation. And so one of the big things coming is possibly a JIT, an optimizing JIT compiler. So right now they've found a way to optimize individual bytecode instructions to make the runtime smarter and go, oh, I see what you're trying to do. We could have a specialized version of that. But that's on a per line basis. Like how about

Starting point is 00:20:46 inlining this method? Because I only see it called in two places or something like that, right? So you need something that can look more broadly at the code. So that's this idea of the JIT compiler and so on. So yeah, this is really good. But all three of these things I've talked about, they might help each other, but they also might inhibit each other, right? So the nogil work might interfere with some of the optimizations that they're doing over here, and with the multiple sub-interpreters there also might be some interplay that has to be worked out.

Starting point is 00:21:16 So I'll just summarize the rest. WebAssembly, and so we've talked about PyScript last time and Pyodide. This is WebAssembly as an official build target for CPython itself. So this is really interesting. That is sort of more from the core devs rather than somebody coercing CPython into a different build on their own. So that's pretty neat.

Starting point is 00:21:41 F-strings. Apparently the f-string parser is kind of this weird side parser thing that's not actually part of the Python code parser, but now we have the PEG parser. It can support more of this and sort of unify that. So yeah, there's something like 1,400 lines of customized C code for parsing f-strings. Well, the people who wrote it knew. They did a lot of work. There's like 600 of the global variables right there. Exactly.

Starting point is 00:22:13 The most important 1,400 lines in all of Python right now. The f-string functionality. Then two of the big optimizations from Cinder, that's the Python 3.8 specialization from Meta. One is, this is a presentation by Itamar Ostricher. So this is the person who sent this in, actually. This is looking at async methods. And if you can be sure it's not actually going to await,

Starting point is 00:22:43 treat it like a regular method. So if you have an async method, you might say, do this, do this, do this. If I already have the value in the cache, return it, else await the database call, right? If you already have it in the cache, why do you need to create a coroutine, schedule it on the loop, wait for the loop to get to it, and then return? Just call it like a regular method. Just give us the answer. That's the idea. There are some interesting ideas that it might change runtime ordering, although I don't know that there were any promises of runtime ordering. But yeah, so that one's interesting. Also, the issue and PR backlog. Now that we've moved to GitHub, apparently there are issues that are 20 years old that are still open.

Starting point is 00:23:30 And traditionally, the core devs and the triagers and so on have approached these things like, well, should we close this or probably we need to keep it open because it's important for historical reasons. And they're starting to talk about like, this is not helpful for anyone. Maybe our first question is like, why should we keep this open? And if the answer is not clear, just close it. There's a lot of talk about, well, this historical stuff, and maybe someone wants to pick it up. But if it were me, if I got to pick, and obviously I don't, so it doesn't really matter. I would just go, if it's older than two years, just close it. Like there's a script that just says over two years, select all, close. Now let's go through and figure it out. Because at some point, you know, if you've got 20 years of, you should make this change.

Starting point is 00:24:15 Maybe these things aren't even relevant anymore, you know, or things have moved beyond it or it doesn't make sense in 2022. I don't know. But mostly what I got out of that article is I'm thankful that I don't have to deal with 20 years of issues and PRs. But also they don't go away if you close them. They're still there if people really want to see them. You can. So I think they should be.

Starting point is 00:24:35 Maybe two years might be a little extreme, but at the very least five or three or something. There should be a number where that's true. That number should be less than 30. And it's a smaller number than 20, right? Yeah. All right. This is a long section. Last thing.

Starting point is 00:24:51 I'll close it out with this. Immortal objects. The path forward for immortal objects. So let me ask you guys this. Can you change None or True or False? No, right? Do you think it's ever going to go away? Like, are we done using True and

Starting point is 00:25:06 then it's just going to get garbage collected or reference counted out of memory? Nope. But you know what? Every time you interact with True and False, it's still incrementing its ref count. And None. Is that because it's an object? Right. Oh yeah. And so this discussion is like, aren't there some that just shouldn't be participating in reference counting because they're just fundamental? Yeah, you know, like the idea of a class, the structure of a thing that defines what a class is, True, False, the numbers, like the low numbers. There should be some that are not consuming that memory because they don't need to keep track of that section and so on. So anyway, this was the proposal.

Starting point is 00:25:50 Again, it's complicated, is the story. But yeah, I do something a little bit like this on Talk Python, the training site. So I've done a lot to tweak the garbage collection around there and really change the defaults of what the triggers are for garbage collection, so if I've got this many allocations and so on. And one of the things you can do is you can tell it, from here on, what has existed up until now, freeze that, and don't look at it when you have to look for cycles. So in my app startup,

Starting point is 00:26:27 when it's kind of imported the things and it's about to start, it just says, okay, everything that you've done to come to life, just don't track that anymore. Anything else I make from here on out, please clean that up. And it's kind of a super cheapo version, but you still get reference counting, right?

Starting point is 00:26:41 Yeah, that's definitely an optimization. It's not garbage collection. I think it's worth it for some of these immortal objects. Why not? Yeah, I mean, we shouldn't be reference counting on None. That's kind of weird.
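
The startup trick described above maps onto two functions in Python's standard `gc` module: `gc.set_threshold` changes how many allocations trigger a collection, and `gc.freeze` moves everything currently tracked into a permanent generation that later collections skip. The sketch below is an assumption about how such a startup hook might look, not the actual Talk Python code, and the threshold value is an arbitrary illustration.

```python
import gc


def configure_gc_after_startup(gen0_threshold=50_000):
    """Call once, after imports and app setup, right before serving requests.

    Hypothetical helper: the default threshold here is illustrative only.
    """
    # Clean up any startup garbage first so it does not get frozen in place.
    gc.collect()
    # Everything alive now is treated as permanent and ignored by future
    # cycle detection. Reference counting still applies to these objects.
    gc.freeze()
    # Keep the existing older-generation thresholds, raise only the first one
    # so collections are triggered less often.
    _, gen1, gen2 = gc.get_threshold()
    gc.set_threshold(gen0_threshold, gen1, gen2)


configure_gc_after_startup()
frozen_objects = gc.get_freeze_count()
```

`gc.unfreeze()` reverses the freeze if the long-lived objects ever need to be considered for collection again.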

Starting point is 00:26:49 Unless it slows things down by having like some... It does. That's the thing that's crazy. So over here, they're like, all right, here's the deal. We shouldn't have to worry about this. And so, where was it? Except it adds an if statement to everything, right? Yeah, it says the naive implementation of this makes it six percent slower, not faster. Like, oh no.

Starting point is 00:27:14 It makes sense. Yeah, and we think we can make it only two percent slower. It's gonna be slower though. Yeah, well, the thing is, normally you would just reference count it. You just go None plus equals one, right? Or plus plus, minus minus. But here you have to have a test, like if it's an immortal object, do this, else do that. And just that bit in the hot loop of the runtime is apparently overhead, you know? Yeah, for everything. So everything you reference has to check to see whether or not it's an immortal object before it does the reference counting. So yeah, maybe it has a no-op method on it, I don't know. I think it probably works straight on the field, though. All right. Much like Highlander, Alvaro says, there can only be one None.
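
The extra check being described can be shown with a toy model in Python. This is only an illustration of the trade-off, not how CPython stores or tests immortality; the class and function names are invented for the example.

```python
class ToyObject:
    """Stand-in for a runtime object that carries its own reference count."""

    def __init__(self, name, immortal=False):
        self.name = name
        self.refcount = 1
        self.immortal = immortal


def incref(obj):
    # The proposal adds this branch to every reference operation, which is
    # where the measured slowdown comes from.
    if obj.immortal:
        return
    obj.refcount += 1


def decref(obj):
    if obj.immortal:
        return
    obj.refcount -= 1


toy_none = ToyObject("None", immortal=True)
toy_list = ToyObject("some list")

for _ in range(1000):
    incref(toy_none)  # count never changes, but the check still runs
    incref(toy_list)  # ordinary object: count goes up every time
```

After the loop, the immortal object's count is untouched while the ordinary object's count has grown by 1000, and both paths paid for the `if`.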

Starting point is 00:28:01 All right. Well, trade-off. Yeah, yeah. This is definitely an interesting trade-off. All right. Well, I think that's more than enough for the Language Summit. But it was really cool that Alex wrote that up and Itamar sent it in, because that's a good insight into what's next. Cool.

Starting point is 00:28:19 So it's my turn, right, given that it's still on? Oh, sorry. Yes. Go, Daniel. Cool. So, people in the software community are blessed with many options for doing source control. You've got Git, SVN, Mercurial, and other historical ones that maybe aren't as well used. But optical engineers, mechanical engineers, electrical engineers, everybody else doesn't

Starting point is 00:28:41 have it nearly as good as the software community. So anytime I see an option for that, it definitely sticks out of my mind. So I don't remember how I found this, but I came upon Allspice fairly recently, which is Git for people who are doing circuits. This is cool. And so it looks exactly like GitHub. You've got version control. You've got all the things you expect to have. It's compatible with some of the common electrical design programs. But it really just gives you the ability to do all these sorts of things that you take for granted if you're in a workflow like software, but that you wish dearly you had for any other discipline.

Starting point is 00:29:18 So when you put something in source control and you diff it, what do you get? Are you diffing graphics? Are you diffing some sort of definition file that defines the circuit? One of the first things they have is a diff tool, because they know that that's kind of one of the big questions, right? How do you compare the schematics?

Starting point is 00:29:38 Show it. So they have a way to do it visually and you can look at all the changes and it looks like they're highlighting each commit to whatever change was made on the schematic. Oh, that's cool. Yeah. Oh, that's very cool. Yeah. Yeah. So one potential question would be, well, great, you know, it's nice that you can do that on the

Starting point is 00:29:55 internet. But I work at a commercial company that doesn't want to do that. But they do have both: they have self-hosting, and they have a government cloud version if you're subject to things like ITAR and EAR. So in the same sense that GitHub has an enterprise option, Allspice also has one. Right, like an on-prem, self-hosted version. Yeah, so you don't have to give away any secrets. Yes. I have no personal experience with it, but it's very promising and exciting to see somebody trying to come up with better ways to do engineering work besides just software.

Starting point is 00:30:31 You can even configure it to integrate with TortoiseGit, like the Windows Explorer, right-click type of Git. Yeah. So exciting stuff. Hopefully somebody helps out the mechies and the optical engineers as well one day.

Starting point is 00:30:46 Yeah, I mean, there's always large file support, but the diff is terrible. Right. So usually, yeah, you're looking at binary files or stuff that's... Yeah. Humans are so good at processing images that if you have a visual comparison, that's orders of magnitude better than trying to look at lines, even if it is a plain text file that you can read through. Yeah, definitely. Yeah, here's your XML with its namespaces. Good luck. What does this mean? Yeah. Well, cool. I like it. All right. I do too. Brian, you got any extras for us? Yeah, actually.

Starting point is 00:31:27 So I've been busy. I've had kind of a backlog of Test and Code episodes. So the most recent one that I put out was with Will McGugan. We're talking about Rich and Textual and Textualize. It's a really fun one. But actually, since we talked last Tuesday, I've got four extra episodes that have come out. So we've got teaching, including testing with the web front-end stuff, which was kind of an interesting story about, like,

Starting point is 00:31:55 basically, if you have college-level students, but they're new to coding, when do you include testing? And Carl says right away. So also a developer and productivity episode. I think that's... oh yeah, and a Python, Django, Rich, and testing episode. So lots of goodness over on Test and Code. They have a Django package, apparently. Yeah, that was just for, like, the CLI, the Django CLI stuff, including Rich with that, which was great. But they've incorporated that into the test runner. So the Django test runner can do Rich tracebacks, which is pretty cool.

Starting point is 00:32:43 Daniel, you got anything else you want to give a quick shout out? Sure. Adafruit's a well-known company for doing maker electronics. I don't have the links up, sorry. Adafruit's well-known, and they do a good job of focusing on

Starting point is 00:33:00 the first five-minute experience of getting you up to speed with something on electronics. But there are other companies that do the same thing as well. So just going to shout out SparkFun, Seeed Studio, and then other companies like OpenMV, who has a focus on machine vision. And they're geared less toward people at the entry level, so maybe if you're a little more comfortable with certain things, you know, you can explore those based on your own things. Right, more specialized, maybe, for people who are trying to actually... Or, yeah, if you go to Adafruit and what you want is out of stock, you can check some other places too, which unfortunately happens a lot these

Starting point is 00:33:33 days. Yeah, those things come and go. A lot of demand. Awesome. All right. Yeah, that's right, I do have some extra ones, but I kind of got a lot. So all right, let's see. I'll go last. All right. The first one is I always love a good documentary on tech stuff. And sometimes these are super cheesy, but there's a documentary called Power On, the story of Xbox, which is a four-hour video that you can watch on YouTube, and I'll link directly to the YouTube video. And it's really good. It's really interesting. Whether you love or hate the Xbox, I honestly don't care that much one way or the other, but it's just an interesting sort of view of the last 20 years of technology from the gaming side. So if people are looking for something to watch and they want to spend four hours doing it or spread it out, you can watch that one. This one I took. So recently I released my Git course, sort of a pragmatic introduction to Git. And I decided I wanted to share one part of it with

Starting point is 00:34:33 a broader world. So I released a video called the four reasons to branch with Git. And I put that on YouTube and people can check that out. So it's like an hour long video I posted this week. And then this one comes to us from Jason Percore saying, how cool is it to see Python showing up like right on the front page of various places? So there's this place called EasyPost, easypost.com, which allows you to like do labels and track your labels and stuff.

Starting point is 00:34:58 But if you just scroll down just a little bit, it says, you know what? Why don't you just either buy labels, or you can just use this Python API right here. And it doesn't even sort of say, if you're a developer, click to reveal the secret. You know, it's just like, no, here's your Python code for our company. So just kind of a cool little thing for that. Let's see. Brian Skinn pointed out that the Stack Overflow 2022 developer survey is open for accepting comments, which is cool. And I'm going to put this up here on the screen first. So,

Starting point is 00:35:35 Brian, do you see this? It has all of this stuff. I can't, if I click it, it'll just go away. And this is an image, right? Right here? Yeah. What if I wanted that as text? What if I wanted to somehow grab that? So I've got this app, which I'm going to tell you all about next, called TextSniper. Watch this. So you can't quite see, but I just drag over that, just like you would a screenshot. And then let's see. I need somewhere I can paste this, anywhere there. So what I got out of that is, check this out. Oh, wow. Isn't that cool? Yeah.

Starting point is 00:36:07 I just Control-C'd from, like, the picture on my screen. And it can do PDFs. It can do screenshots. So for example, if you're watching a video presentation and you see a slide, you're like, oh, I want to capture those bullet points, grab it. You got it. So that is called TextSniper, which is super neat. All it does, it's just like the select region for a screenshot. That's great. And then boom, it doesn't matter what's under it. If it's text, it OCRs it, and then you got it. Yeah. So often, like, a small restaurant will put their address or their phone number in an image. Like, come on, I gotta click on that sucker.

Starting point is 00:36:51 I want to just drop this, paste it into maps or something. That's right. So I don't know. I think for doing research, if you're watching videos and you want to get something out of something that's on the screen, like a slide or whatever, this is pretty awesome. And it costs something like $11 once. So, you know, if it's useful to you, it'll be worth it. If not, then, you know, it's got to be worth eleven dollars or zero to you. That looks like a good OCR app. Yeah, yeah. And it's just the ease of use, right? Not take a screenshot and go find your app. It's just like slap, drop. Okay, so last one of my extras. Sam Lau and Philip sent over... Sam Lau, sorry. They sent over that... I had them on to talk about Pandas Tutor,

Starting point is 00:37:36 and they were talking about the challenges of running Pandas Tutor on the server side and letting people run code, but it's pretty limited because you don't want them to hack the various things. You want to keep it pretty limited so they don't take advantage of, like, your compute resources. So now they just posted a message saying Pandas Tutor... If you go over here and say, visualize your code,

Starting point is 00:37:56 it'll go and do all these cool visualizations. I know we've spoken about this before, but notice what it says here. I can scroll a little. It says initializing Pyodide on Wasm, download pandas, running. Boom. And so all of this is running in client-side Python, which is just...

Starting point is 00:38:16 Wow. Yeah. So we talked about that being one of the topics of the language summit, the Wasm support. And here you have it in action. So I said on the show, like, hey, have you guys considered this? Like, ah, maybe we should. And then like, this turns out to be a great idea. That's pretty cool.

Starting point is 00:38:32 Like the message is run code on the server. That's slower. We just recommend you run it here. Nice. All right. That's pretty neat. Well done, you guys. And that's it for my extras as well.

Starting point is 00:38:43 That's a lot more real than I thought. I guess I thought Pyodide and WebAssembly were a little bit further off, but that's like, hey, there's an application right now doing that today. Yeah, yeah. Brian, the antigravity PyScript thing

Starting point is 00:38:55 you showed last week was so cool. Yeah, I didn't even know it was doing that before we showed it, but it's pretty neat. Yeah, yeah. A lot of the interactions are super... they're starting to be really... We're getting there, Daniel. We're getting there. All right, how about a joke to wrap it up? Definitely. So we've all been in, well, maybe

Starting point is 00:39:16 we haven't all been, but we can all imagine being in awkward situations, maybe on a weird date. So I don't go on dates really, being married for a long time. But imagine that you had. Here's a graphic of a woman who's on a date, like maybe just woke up in the morning after the first time together sort of thing. And the guy who's like sculpted, right? He's like clearly like a super fake, probably a good looking guy, whatever. But he's in the shower and she's like flipping through his phone.

Starting point is 00:39:49 It says when she looks through your phone, but all she can find is fork a child and kill it. Google search for a kill child and fork parent, kill parent with fork, kill parent without killing child. Kill child without killing child. And she's got this face of like

Starting point is 00:40:05 "Oh, what's that?" Sorry. Those are great. Yeah, she's got this look like, "I thought it was going so well,

Starting point is 00:40:12 and he's a murderer. I can't believe it." No, he's just trying to figure out Linux. Don't...

Starting point is 00:40:17 Don't hold it... Kill child without killing grandchild. It's so bad. Can you do that? Well, I don't know. I haven't searched it, but I don't have to explain that search if I did search it in stealth. That's what incognito mode... This is totally benign, but if somebody sees it out of context, maybe they won't feel that way about it.

Starting point is 00:40:49 There's what will get you on the FBI list and then software engineers. Yeah, there's like a Venn diagram of that. Yeah, there's probably a small intersection there. It's probably pretty big, actually. Yeah, it's probably pretty big. Anyway, well, thanks, everybody, for a fun show again. Yeah, you bet. Thanks, Brian and Daniel. It's great to have you here. Thanks. Thanks for coming. Bye. Bye.


Topics covered in this episode: distinctipy Soda SQL Python in Nature Supercharging GitHub Actions with Job Summaries Language Summit is write up out AllSpice is Git for EEs Extras Joke See the ...full show notes for this episode on the website at pythonbytes.fm/284
