Python Bytes - #81 Making your C library callable from Python by wrapping it with Cython

Episode Date: June 5, 2018

Topics covered in this episode:
* Learning about Machine Learning
* Making your C library callable from Python by wrapping it with Cython
* Taming Irreversibility with Feature Flags (in Python)
* preten...d: a stubbing library
* The official Flask tutorial
* An introduction to Python bytecode
* Extras
* Joke

See the full show notes for this episode on the website at pythonbytes.fm/81

Transcript
Starting point is 00:00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds. This is episode 81, recorded May 25th, 2018. I'm Michael Kennedy. And I'm Brian Okken. And we have a bunch of great stuff for you, as always. Super excited to talk about it. But before we do, Brian, let's say thanks to DigitalOcean. Thanks, DigitalOcean. Yeah, thank you, DigitalOcean. They're supporting the show by sponsoring this episode and a number of others.
Starting point is 00:00:25 And they're giving you all something awesome as well: $100 free credit if you're a new user at pythonbytes.fm slash digitalocean. More about that later. I'd like to learn some stuff right now, Brian. Learn about learning. We hear a lot about machine learning, and I came across a couple of ways to learn it. So I've got a couple of topics here. The first one is a single-page site called Hello TensorFlow, and it's a kind of fun little demo with one machine learning example of a bad guess at the coefficients for
Starting point is 00:01:01 a polynomial and having the machine learning back end, I think it's TensorFlow, figuring out what the real answer is. And it's got like a graph on there where you can watch it zoom or narrow in on it. So it's fun. That is cool. Yeah. So it has a secret formula and it says, here's some equation which you don't know. Machine learning system, go learn how to predict where on that line it's going to be given a few points. And it's like this nice little animated thing that rolls in. I love the real-timeness of it. That's great. And it doesn't quite get it right away, and if you run it again, it's even better.
Starting point is 00:01:41 So it improves its guesses. It's kind of neat. Oh, nice. The other topic I wanted to bring up is that there's an article about this that we'll link to, and there is also a Google-provided machine learning crash course that actually looks pretty slick. It's got like 15 hours of course material and a bunch of lessons and some exercises. So Google has put together a free course on getting started with machine learning. That's really cool.
Starting point is 00:02:09 It's kind of fun. Yeah. So one kind of gets you interested in the idea of it. The other is like, let's actually teach it to you. Yeah. Neat. Yeah. So remember last week when we talked about pysystemd with Dan Bader?
Starting point is 00:02:19 Yeah. Yeah. So he pointed out that it was interesting that pysystemd was implemented in Cython, even though it had no real performance issues. I mean, it just asked like, hey, is this service started? And it just delegated to the C API. So I came across an article called Making Your C Library Callable from Python by Wrapping It with Cython by Stav Shamir, which is a really nice short article. Yeah, this is my guess why they use Cython for pysystemd. So Cython is known for its ability to increase performance of Python code, of
Starting point is 00:02:51 course, but it's also a really interesting way to sort of bring Python syntax into the realm of C directly, right? And because of that, it makes calling C code from what looks like Python, and then exposing that part that looks like Python directly to Python, really easy. So let me give you a quick example. Suppose I have a function. It's called hello. It takes a character pointer called name, right? This is C. And then you do printf, hello, percent s, name, right? So super simple. It's a function that takes a character pointer and returns void. So if I wanted to call that function from Python, I could write just a couple of lines of code in Cython. So I would say cdef extern, right, define, here's an external function,
Starting point is 00:03:38 void, hello, const char, right, just the signature of that C function. And then I'd write a function like py underscore hello. I'd say name colon bytes goes to None and just say hello name. That's it. Not all these crazy extensions or PyObjects or whatever. It's basically Python code with type hints or type annotations applied, right? I mean, that's what Cython more or less is. This is a really nice small example of how to get some... I thought I had to go through some, like,
Starting point is 00:04:08 figure out how to link to the DLL and stuff like that. Yeah. Neat. Yeah, and so the final step, you have to actually build, like, you have to compile your Cython code so that it can be imported from regular Python code. And so they provide a setup.py, which does that. And you just run python setup.py build, right? And it goes and does its magic.
Starting point is 00:04:30 And then you have it. You can work with it. And by the way, it happens to be fast, even if that's not the point. Okay, I'm going to have to go play with this. It's neat. Yeah, it's cool, isn't it? It's cool and it's easy.
Starting point is 00:04:38 Yeah. Yeah, it's way better than like flipping into C land and doing all sorts of stuff. I think if your point is only to consume this code, maybe even just to write regular Cython code. There's a lot of good things going on there. Yeah. Awesome.
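Editor's sketch of the Cython wrapping steps described above. To keep everything in one runnable Python script, the three pieces (the C function, the Cython wrapper, and the build script) are held as strings and written to disk by a small helper. The file names, the module name `pyhello`, and the helper `write_project` are hypothetical choices for this sketch, not names taken from the article; building the result assumes Cython, setuptools, and a C compiler are installed.

```python
from pathlib import Path

# C header and source: the hello() function described in the episode.
HELLO_H = "void hello(const char *name);\n"

HELLO_C = r"""#include <stdio.h>
#include "hello.h"

void hello(const char *name) {
    printf("hello %s\n", name);
}
"""

# Cython wrapper: declare the external C function, then expose a
# Python-callable function that forwards a bytes object to it.
PYHELLO_PYX = """cdef extern from "hello.h":
    void hello(const char *name)

def py_hello(name: bytes) -> None:
    hello(name)
"""

# Build script: compile the Cython wrapper together with the C source.
SETUP_PY = """from setuptools import Extension, setup
from Cython.Build import cythonize

ext = Extension("pyhello", sources=["pyhello.pyx", "hello.c"])
setup(name="pyhello", ext_modules=cythonize([ext]))
"""

FILES = {
    "hello.h": HELLO_H,
    "hello.c": HELLO_C,
    "pyhello.pyx": PYHELLO_PYX,
    "setup.py": SETUP_PY,
}


def write_project(target_dir):
    """Write the four sketch files into target_dir and return their paths."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for name, content in FILES.items():
        path = target / name
        path.write_text(content)
        written.append(path)
    return written
```

After calling `write_project("pyhello_demo")`, one common way to build is `python setup.py build_ext --inplace` inside that directory (the episode mentions `python setup.py build`; the `--inplace` variant is an assumption here because it leaves the compiled module next to the sources). If the build succeeds, `import pyhello; pyhello.py_hello(b"world")` should call through to the C function.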
Starting point is 00:04:49 Cool. So what's this next one you got? Feature flags? Yeah, feature flags. And I can't remember who it was. Now we're going back in time about a year or so, but was it Instagram that talked about... Yeah, it was Instagram that moved from Python 2 to Python 3 on their Django deployment, which is, I think, the largest Django deployment in the world,
Starting point is 00:05:10 on the same branch and without anything, right? Just by using feature flags. I don't remember if they used feature flags in that or not, but I know a lot of teams that have a model of merging to master frequently, or just working off master for everybody, often use feature flags. And I know how to do that in C++, but I wasn't quite sure. I could have hacked together something for Python, but this is a nice article about really how to do feature flags in Python well, and why you would do it and how to do it. A few of the benefits they talked about include improving a team's response time to bugs, because if you add a feature with a feature flag,
Starting point is 00:05:53 you can just turn it off by flipping the flag if you need to without having to redeploy everything. I really like this idea. It's great to be able to just keep everything in the same code base and also keep the database schemas in sync, which is nice as well. So a lot of cool stuff going on here. work. And you can migrate user groups with A-B testing or group splitting or however you want to migrate the feature in. But then it went on to talk about some of the ways to implement it nicely so that it's a maintainable flag system, how to measure your success with different analytics, and then using third-party tools to make your flag support clean and not reinventing the wheel. And other people have figured this out.
Starting point is 00:06:49 And then also just at the end a comment to say, you know, once you've really decided that a feature is in place, you have to go through and do feature flag cleanup. So make sure that you remove the flags and have the features be permanent when you're ready to have them permanent and clean up your code base. So it was just a nice write-up for this. Yeah, it's really nice. I like it. And they have maybe one of the best visualizations of flat is better than nested with like some kind of Mortal Kombat type character. It's a crazy nested if statement.
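Editor's sketch of the feature-flag idea discussed above: the new code path ships alongside the old one and is switched on or off by configuration rather than by a redeploy. This is a minimal illustration, not the approach from the article; the `FEATURE_` environment-variable convention and the names `flag_enabled`, `checkout_total`, and `new_pricing` are all hypothetical.

```python
import os

# Values treated as "on"; anything else counts as "off".
TRUTHY = {"1", "true", "yes", "on"}


def flag_enabled(name, default=False, environ=None):
    """Return True if the flag FEATURE_<NAME> is switched on.

    `environ` defaults to os.environ; passing a plain dict makes the
    function easy to exercise in tests.
    """
    env = os.environ if environ is None else environ
    raw = env.get(f"FEATURE_{name.upper()}")
    if raw is None:
        return default
    return raw.strip().lower() in TRUTHY


def checkout_total(prices, environ=None):
    """Keep the flag check flat: one branch for new, one for old."""
    if flag_enabled("new_pricing", environ=environ):
        # New behaviour, guarded by the flag: apply a 10% discount.
        return round(sum(prices) * 0.9, 2)
    # Existing behaviour stays the default until the flag is flipped.
    return sum(prices)
```

With no flag set, `checkout_total([10, 20])` follows the old path; setting `FEATURE_NEW_PRICING=on` in the environment switches to the new one. Once the feature is permanent, the cleanup step mentioned in the episode would mean deleting the `if` and the old branch.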
Starting point is 00:07:20 That's the cleanup conversation, right? Like don't do this. Yeah, and the example of how not to do feature flags. Yeah, definitely. Exactly. Don't do this. Yeah, it's quite cool. Nice. All right. So speaking of quite cool, it's quite cool that DigitalOcean is sponsoring this podcast. And I want to just tell you guys quickly about them. So they're a hosting company. They've got data centers throughout the world. And I think one of the cleanest, nicest ways to create a set of virtual servers and get them up and running and configure them. So if you want to create one that's already
Starting point is 00:07:51 pre-configured for some infrastructure like Disqus, you can just click a button and say, create me a Disqus virtual machine based on Ubuntu, whatever version you're looking for, or create a fresh one instead of PowerView-like. So they provide a lot of the infrastructure for us. We actually pay for it, but they are the people we pay for getting you this podcast, which is pretty awesome. They've been really good. We're happy customers.
Starting point is 00:08:14 And so if you want to be a new happy customer, you can get $100 to try them out at pythonbytes.fm slash digitalocean. And that's for new customers. Check it out. And hopefully you create something cool and run it there. Nice. Hey, Brian, I got one that I think you're going to like.
Starting point is 00:08:31 Okay. It's about testing. I like testing. Recently, I had a TalkPython episode on the release of PyPI and the inside story of how that got revised. And finally, PyPI.org is the official thing, not like a weird, scary place that also matches to the same database. So that's really awesome.
Starting point is 00:08:47 And I can't remember who said it on the show. Sorry, because there were three folks. But one of the libraries brought up was Pretend, which is a stubbing library. Neat. Yeah. So stubbing is like mocking, but it's different. How's that, Brian? You know more about that than I do.
Starting point is 00:09:01 Oftentimes in mocking, you want to check the behavior of the code. If you're interacting with some system that's not really there, you want to make sure that you've called it in certain ways or you called it a certain number of times or the order of calls, things like that. And stubbing is really like, I just want to have my code be able to call something and have the return value be like some pre-canned data. So it's more about pre-canning than about the behavior. I see. So with mocking, maybe I'm going to say, I'm going to call this login API and I want to
Starting point is 00:09:34 make sure that it checks that my password is correct or something like that. And that would be a mocking thing. Whereas for stubbing, I just need it to give back a password so it doesn't crash, because it's like a, you know, NoneType AttributeError type of crash if nothing comes back. So we've got to give it something. So let's create a stub to do that. Stubbing also is a great way to do things like error conditions. When you're connecting to a third party, if the server crashes or something, you'll get an error code. So how do you simulate that? You can't go out and crash the server. It's a really great way to pretend bad things happen. Yeah. So let me give people a sense of the API. It's real simple. So from pretend import stub,
Starting point is 00:10:18 and then you just say stub and say like, here's a function name equals some lambda. And now you can just pass that around. If somebody calls that function, it returns the value the lambda returns. Done. It's like, I don't think that it could really be simpler, to be honest, right? Well, that's one of the reasons why it's pretty cool is that it's simpler than using mock. Mock can do this too, but this is simpler. Yeah, really nice.
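Editor's sketch of the stub-versus-mock distinction described above. To stay runnable with only the standard library, it does not import pretend: `types.SimpleNamespace` stands in for the kind of object the episode describes building with `stub(name=lambda ...)`, and `unittest.mock.Mock` shows the behavior-checking side. The function `greeting` and the `get_user` service are hypothetical.

```python
from types import SimpleNamespace
from unittest.mock import Mock


def greeting(user_service, user_id):
    """Code under test: depends on some external user service."""
    user = user_service.get_user(user_id)
    return f"Hello, {user['name']}!"


# Stub: only supplies pre-canned data so the code under test can run.
stub_service = SimpleNamespace(get_user=lambda user_id: {"name": "Brian"})
stub_result = greeting(stub_service, 42)


# Stub for an error condition: simulate the remote side failing
# without actually crashing a server.
def raise_timeout(user_id):
    raise TimeoutError("simulated server failure")


failing_service = SimpleNamespace(get_user=raise_timeout)

# Mock: also returns canned data, but the point is to verify behavior,
# i.e. how the dependency was called.
mock_service = Mock()
mock_service.get_user.return_value = {"name": "Michael"}
mock_result = greeting(mock_service, 7)
mock_service.get_user.assert_called_once_with(7)
```

The stub objects carry no call-tracking machinery at all, which is the simplicity the hosts are pointing at; the mock is the heavier tool to reach for when the call pattern itself is what the test asserts.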
Starting point is 00:10:40 Yeah. You could probably use that with Flask, couldn't you? Yeah. You could probably use it with Flask. Yep. And my next topic is a surprise to me. I forgot that I put this down. But one of the things I got out of PyCon was I got to sit at dinner next to one of the people that works on the Flask project. And he reminded me that they just went through and rewrote a bunch of the Flask tutorial. So I went through and took a look and yeah, the official Flask tutorial has got a lot of updates.
Starting point is 00:11:11 The code that goes along with it has been updated and just everything's been simplified, updated. It's a little cleaner. It includes, I don't know if it had this before, but it includes a section on testing, which highlights pytest, of course, and coverage, which is good. And one of the things I learned also with that discussion was Flask is a part of Pallets. I'm not sure what their entity is really, but Pallets is a collection of people that work on a collection of projects. And it's some pretty important stuff. We've got Flask, Click.
Starting point is 00:11:47 ItsDangerous. I'm not sure what that is. It's a request validation foundation of Flask. Oh, okay. ItsDangerous. I already said that, but Jinja and Jinja 2. Is there a Jinja? Is it just Jinja 2?
Starting point is 00:11:59 I've not seen any Jinja in the wild lately, but there probably was at some point. Okay, and then MarkupSafe, which is an HTML markup-safe string for Python library, and then Werkzeug. I don't know how to pronounce that. Werkzeug with a V. Everybody relies on some of these things, even if you don't use Flask. And the Pallets project has a donate page now. So if you want to donate, you can donate through their donation page.
Starting point is 00:12:31 It's pretty neat. It's really nice. So we've had a lot of news around Flask lately. Flask went 1.0 a while ago. I talked to one of the guys, I'm sorry, I'm so forgetting his name, because I met so many people, but who is basically responsible for that whole progression.
Starting point is 00:12:45 David, I believe. Anyway, he's like, I know people think because Flask went 1.0 the week after you guys did the ZeroVer episode. Actually, that was in the works for like a year. I would love for you to be doing that, but it was actually just a coincidence. So anyway, glad to see you go 1.0. But yeah, this looks like Flask is getting some renewed love, which is good. Yeah, and I didn't know Click was by the same people. I use Click all the time, so that's pretty neat.
Starting point is 00:13:12 Yeah, for creating CLIs, very nice. All right, so let's round it out with a little bit of an internals look here as well. I feel like a lot of stuff I'm covering this week is like deep in the guts with the Cython. And if you're not doing Cython, you're doing regular Python, then you're operating in the bytecode space. So do you think people would be surprised if you told them that Python is compiled? Yes. A lot of them, I think they would. But it's not compiled to machine instructions or even JIT compiled unless using PyPy. But it's compiled to bytecode. So those PYC files, those are like the instructions
Starting point is 00:13:45 to the Python virtual machine, not instructions to your processor, but they're still compiled and there's still this bytecode. And so understanding it's pretty interesting to just know like how the internals of Python is working. So there's this nice article called an introduction to Python bytecode.
Starting point is 00:14:01 So if this bytecode concept is kind of new to you or you just want to play around with it a little, check it out. It's pretty accessible. So I feel like there's a lot of hello world examples in my topics. So back to hello. So they have a function, def hello,
Starting point is 00:14:16 and it just prints the static string hello world, right? It's just, okay, well, what does this actually mean? And then they show you all the bytecodes. It's okay, we're going to load the global, which is the print function. We're going to load the constant, which is hello world. And we're going to call the function that is on the stack. So CPython uses a stack-based virtual machine. And so they load these things onto the stack. If you have like a function that takes two arguments, they might load two arguments on the stack and then call the function and things like that. So that's how your Python
Starting point is 00:14:46 source code gets into executable form. And then these steps are actually sent down to the CPython runtime and this giant while loop with a switch statement that's like literally 3000 lines of C code that says, what's the bytecode? Let's go do that, you know? So it's pretty interesting. And they talk about how you can take your Python code and look at this. You just import dis, as in disassembly, and say dis.dis and you pass it a callable or something like that. It'll show you the bytecode instructions that make it up. So it's pretty nice. New phone who dis.
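Editor's sketch of the `dis` usage described above, using the same hello-world function. The exact opcode names and the number of instructions differ between CPython versions, so no expected listing is shown; the part that should stay stable is that the name `print` and the constant string both appear as instruction arguments before the call.

```python
import dis


def hello():
    print("Hello, World!")


# Print a human-readable disassembly of the function's bytecode.
dis.dis(hello)

# The same information is available programmatically: each Instruction
# has an opname (e.g. a LOAD_GLOBAL or LOAD_CONST variant) and an argval.
instructions = list(dis.get_instructions(hello))
loaded_values = [ins.argval for ins in instructions]

for ins in instructions:
    print(ins.opname, ins.argval)
```

Reading the output top to bottom mirrors the stack-machine description in the episode: push the global `print`, push the constant, call, then return.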
Starting point is 00:15:17 You can't actually just look at the PYC files though, right? No, they're like bytes. That's why you need this dis thing. I mean, I guess if you can parse the bytes, you could, but it's not strings, I don't think. Okay. Yeah, I mean, the PYC files are basically the compiled steps from your hello world text into the bytecode instructions in terms of bytes. And that's why those cache things are laying there. The next time you hit that app, right, it's just going to go and say, well, let's just load up that PYC so we don't have to reparse and validate it. Yeah, this is kind of neat.
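Editor's sketch of what is inside a .pyc file, following the exchange above. It compiles a throwaway module with `py_compile`, skips the file header, and unmarshals the rest back into a code object that `dis` can display. The 16-byte header size is an assumption that holds for CPython 3.7 and later; the file name `hello_mod.py` and the helper `load_code_from_pyc` are hypothetical.

```python
import dis
import marshal
import py_compile
import tempfile
from pathlib import Path

# Assumption: CPython 3.7+ writes a 16-byte header (magic number, flags,
# then either mtime and source size, or a source hash).
PYC_HEADER_SIZE = 16


def load_code_from_pyc(pyc_path):
    """Return the module code object stored in a .pyc file."""
    data = Path(pyc_path).read_bytes()
    return marshal.loads(data[PYC_HEADER_SIZE:])


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "hello_mod.py"
    source.write_text('print("Hello, World!")\n')

    # Compile to an explicit .pyc path instead of __pycache__.
    pyc_path = py_compile.compile(str(source), cfile=str(Path(tmp) / "hello_mod.pyc"))

    # The raw file is binary, which is why opening it as text is unhelpful.
    raw_prefix = Path(pyc_path).read_bytes()[:PYC_HEADER_SIZE]
    code = load_code_from_pyc(pyc_path)

# The unmarshalled code object can be disassembled like any function.
dis.dis(code)
```

This is only for exploration: the .pyc layout is a CPython implementation detail tied to the interpreter version, so a file written by one version should be read back with the same one.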
Starting point is 00:15:50 I'm definitely going to go check this out because I just want to know more about this. It seems like something I should know about, even though I probably don't need to on a daily basis. Yeah, I mean, I'm not sure how helpful it is, but it's helpful in your conceptualization of how stuff works, I think. Yeah, definitely. Very cool. Yeah, a little bit deeper down into the... Is that the red pill or the blue pill that takes you farther down? It's the red pill, right? I don't remember. Awesome. Well, anyway, it's definitely worth checking out if you haven't played with bytecode before. It's a really nice, simple way to get introduced to it. Brian, you got anything, any other
Starting point is 00:16:21 news you want to share with everyone this week? I don't think so. Do you? No, I'm all out of news this week other than the ones I found. So I just want to say thank you. Thank you for being part of this episode and sharing everything with everyone. Thank you. Bye. Yeah, you bet.
Starting point is 00:16:33 Bye. Thank you for listening to Python Bytes. Follow the show on Twitter via at Python Bytes. That's Python Bytes as in B-Y-T-E-S. And get the full show notes at PythonBytes.fm. If you have a news item you want featured, just visit PythonBytes.fm and send it our way. We're always on the lookout for sharing something cool.
Starting point is 00:16:53 On behalf of myself and Brian Okken, this is Michael Kennedy. Thank you for listening and sharing this podcast with your friends and colleagues.
