Python Bytes - #120 AWS, MongoDB, and the Economic Realities of Open Source and more

Episode Date: March 5, 2019

Topics covered in this episode:
0:53 The Ultimate Guide To Memorable Tech Talks
3:56 Running Flask on Kubernetes
10:51 Python server setup for macOS 🍎
12:52 Learn Enough Python to be Useful: argparse
14:56 AWS, MongoDB, and the Economic Realities of Open Source
Extras
Joke
See the full show notes for this episode on the website at pythonbytes.fm/120

Transcript
Starting point is 00:00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds. This is episode 120, recorded February 28th, 2019. I'm Michael Kennedy. And I'm Brian Okken. And this episode is sponsored by DigitalOcean. Check them out at pythonbytes.fm slash DigitalOcean. Lots of good stuff with that URL, but more about that later. Brian, welcome back to the show, man.
Starting point is 00:00:22 Oh, thanks. Nice episode last week. Thanks, yeah. We went to Seattle and we all had fun and sadly you were knocked out, but understandably. Man, I sent it to everybody I knew, my daughter and a whole bunch of people and said, at least listen to the first bit of it because this is why I enjoy working with Michael. It was so nice of you to do a shout out to me. That was nice. Thanks. Yeah, that's really great. Yeah, you're welcome. It's the least I could do. Wish you were there, but that's okay. Speaking of people who were almost there, at least was at the conference who I got to spend a lot of time with, but wasn't
Starting point is 00:00:51 actually at that live recording, Nina Zakharenko. How about that? That's our first, the source of our first item, right? She is. Yes. How could I resist? She put together this wonderful resource for people. So what I'm talking about is The Ultimate Guide to Memorable Tech Talks. And it is, she said it was going to be a blog post and it turned into what she calls a book, but it's seven articles on Medium. And it's a seven-part series where she covers things like choosing a topic, writing a talk proposal, tools that she uses, planning, writing it, writing your talk out, practicing it, and delivering the talk. And it's a phenomenal resource. I'm not done with it, actually. I've
Starting point is 00:01:30 read up through the tools section, and I'm looking forward to the rest of it. But one of the things, I mean, it speaks right to me because I got into podcasting partly to get better about this public speaking thing, and it's difficult. And especially for, I think, tool nerds and introverts and stuff like that, jumping into the tools and in the little details, you can lose sight of stuff. So one of the things she noticed, a quote from the article is, "I noticed I'd procrastinate on making the slides look good instead of focusing my time on making quality content." It's easy to get sucked into that, yeah. Yeah, totally.
Starting point is 00:02:07 When I should have been, like, I've given two talks and many, many talks informally at work and stuff, but two conferences. And I spent so much time just tweaking the different slides instead of practicing it and making sure I understand it well enough to even throw the slides away and just do the talk. And so I think focusing on planning and leaving all that time enough to get all that.
Starting point is 00:02:31 This is actually the series that I wish I had when I started getting into trying to do talks. Because out there, there's either stuff that doesn't have enough information, it's just a quick gloss over, or it focuses on one aspect. Or on the other end, you've got courses and books. This is just a series of blog posts that you can, you know, staple together with one staple. So yeah, it's good. Not the big, heavy, super-duper binder staple, but like a regular one, right? Like the red one from Office Space. Yeah, yeah, just one of those. Yeah, now this is a great series, and nice work, Nina.
Starting point is 00:03:06 Good pick, Brian. I think it'll help a lot of people. I like that it's writing, you know, she covers writing the talk proposal because it's great to get into public speaking, but you have to get accepted to get into public speaking a lot of times. Although, although, at the bigger conferences,
Starting point is 00:03:20 say PyCon, you can always do an open space, and that's kind of like a dip your toe in the water because you don't have to lead the whole conversation but you kind of are sort of leading it in spirit. Yeah, and then you can go through all of these steps for just to get ready and you only have to be up there for, I don't know how long the lightning talks
Starting point is 00:03:36 are, but they're not long. Yeah, for sure. Yeah, lightning talks are five minutes. Open spaces I think are 25, but you don't do the talking the whole time. We're definitely going to, as you spoke about the live recording, we're definitely going to do more live recordings at the main PyCon as some open spaces, if nothing else. Yeah, definitely.
Starting point is 00:03:51 I wanted to interview a bunch of people for stuff. Bringing the mics. Okay, super. So the next one is something I've been digging into lately. Have you done anything with Kubernetes? No. Let me take a step back. Have you done anything with Docker?
Starting point is 00:04:02 No. So these two things are sort of stacked on top of each other, and they often get complicated. So Docker is a little bit of a complicated thing if you haven't done a lot of deployment and DevOps-type stuff because it's all about configuring Linux-isolated machines, if you will. And then, in order to actually use them properly, really, with scale-out and failover and clusters and connections between them
Starting point is 00:04:24 and all that, like a Docker container for your web app, one for the database, you really need some kind of orchestration, which is Kubernetes. So there's a ton to learn about all this stuff. Luckily, Michael Herman from testdriven.io, formerly of Real Python, did a really nice write-up here. He has a course on testdriven.io. And this is like an extracted, long example tutorial to get started with Flask on Kubernetes. Wow, that actually sounds neat. Yeah, it's actually pretty approachable. I mean, I said it's a complicated topic. He really lays out the core ideas.
Starting point is 00:04:59 And if you ask Pocket, it says reading time is 16 minutes. So it's pretty hefty, but it's not book-sized. It's not super, super long. Of course, the little steps that it says take more than just one line or whatever. You've got to let it configure a cluster or something. So there's little wait periods, but really nice. It talks about how to basically get a Vue.js front-end, Flask back-end app that talks to a database, Postgres in particular, up and running on a Kubernetes cluster. So kind of microservice style, which is pretty cool.
Starting point is 00:05:28 And I'll just run off some of the goals so you guys kind of know what you'll get out of it. So explain what a container and container orchestration is, pros and cons of using Kubernetes, over, say, things like Docker Swarm, all the primitive concepts of Kubernetes, node, pod, service, deployment, etc. Spinning up a Python app, which is nice. Locally with Docker Compose, there's a nice little utility called Minikube,
Starting point is 00:05:50 which lets you basically in just a couple of command lines create a Kubernetes cluster pre-configured on your machine, which is a super pain. So that's really nice that you can just install that thing and have it go. Stuff like that is really quite nice.
Starting point is 00:06:05 Oh, I'm definitely going to check this out. Plus, like in 16 minutes, I can now put like 75 new keywords on my resume. Oh, yeah. This is dense in buzzword bingo hits. Yeah. Nice. You and I both picked something that's a little bit on the controversial side of the world and open source and the community, you go first. Okay. So this is just, I think it's still playing out, but
Starting point is 00:06:30 in the end of January, maybe they announced it earlier, but on the Travis CI blog, they announced that Travis CI is to join Idera, Idera, I don't know. Basically, they got bought by a company. And frankly, I don't really care who owns what. But then in February, I started seeing the hashtag TravisAlums on Twitter. And it looks like, I mean, from the outside looking in, it looks like they're laying off a bunch of engineers. I don't really know what's going on because they're not really talking about it. There hasn't been anything else on the GitHub or the Travis CI blog to say what's going on. So I've used Travis for running tests on Python projects. So I wanted to, I thought, you know, I've wanted to do this anyway, so now's a good time to start looking at if I've got source code
Starting point is 00:07:20 on GitHub, I want to use the, and I want to use testing somewhere, and it isn't Travis, where would I do it? And so right now I'm trying to use GitLab and Azure Pipelines with the help of some other people to try to get those running on a couple of projects. But there's a lot more. I mean, GitHub lists 17 different options for continuous integration that you can hook up. Yeah, that's a big deal, right? Like Travis has been kind of the go-to, plug it into your open source public GitHub repo
Starting point is 00:07:49 and just let it churn away and do all the checks on PRs and things like that, right? They claim that our experience is not going to be different. However, with less engineers, it seems like it might stagnate and stuff. Yeah, something that you also talked about here is Azure Pipelines, Azure build pipelines, and that definitely seems like it's getting a lot of traction. I've heard of some major open source projects moving over to that, either partially or entirely.
Starting point is 00:08:18 I'm dropping a few links in here. One of the links is a couple things from Anthony Shaw. Of course, it wouldn't be a show without Anthony. One of them is an article that he writes about Azure Pipelines with Python, by example. And also he's got a pytest plugin to help with testing on Azure. And then one of the people that's been helping me out is, I'm going to get his name wrong, but Anthony Sottile, Sottile? Sorry, Anthony. But he's got a whole bunch of different
Starting point is 00:08:47 Azure templates on his GitHub repo that look neat. Yeah, cool. And as far as I understand it, Azure Pipelines are free for public repos and things like that. So it's interesting. All the things I'm going to try are things that have reasonable free
Starting point is 00:09:04 levels for open source projects. That's kind of an interesting trend that's been going on. There's a lot of things that are free for public open source repos, but paid for private stuff. And that seemed like a pretty good balance to me. Having GitHub go to allow private repos is interesting. And I'm not sure what drove the Travis change, but yeah, anyway. Yeah, I'm pretty sure GitHub was pressured by Bitbucket,
Starting point is 00:09:29 but I don't know anything about the roots of the Travis one. I do have more good news around DigitalOcean for you and all the listeners. So they've traditionally had these high compute instances called droplets. They're virtual machines. And they've had memory-heavy ones, but the memory-heavy ones didn't have dedicated CPUs, like, they shared with the VM host, right? So it
Starting point is 00:09:53 couldn't be guaranteed of, like, consistent workloads on that thing. So now they're announcing a general purpose droplet that is a blend of dedicated CPUs and a wide range of memory configurations. So basically you can get quite a bit of RAM. I think the lowest one is 8 gigs and it goes up to either 64 or 128 gigs of RAM and a whole bunch of dedicated CPUs. So it's pretty cool, and they talk about some good examples of using it would be like web applications,
Starting point is 00:10:22 hosting like an e-commerce site where latency matters, or a medium-sized relational Postgres or NoSQL, like MongoDB, database. Because you want your database to be fast, right? Like, that's the heart of your app, even if you scale the web front end and other sorts of analytics and so on. So you can check that out. It's under limited availability, but you can go and, like, click a button, say, I want to try this. So check them out at pythonbytes.fm slash digitalocean. It really supports the show. It keeps us going each week.
Starting point is 00:10:49 Yes. Very cool. Yeah, thanks. So speaking of infrastructure, it's cool to be able to set up your code and your websites and all of your infrastructure to run on Linux. And you might even do that on Docker
Starting point is 00:11:01 and you might even put those Docker containers on Kubernetes. But a lot of us are not using Linux as our dev machine, right? A lot of us are using Macs, at least the ones you'll see walking around the conferences. So this next item is called a Python server setup for macOS. That's a cool little Apple emoji. Nice. Yeah. So if you want to run like your Nginx and your uWSGI or your Gunicorn or like
Starting point is 00:11:23 your production software stack, but you want to run it locally on macOS, here's a little guide on how to do that. So it basically takes you through setting up Nginx, having Nginx serve your static assets. They're using Gunicorn and Flask, but you could pretty easily swap that out for uWSGI and whatever else you want to use.
Starting point is 00:11:42 And then having Nginx be the front-end proxy for a scaled-out Gunicorn backend. So it's pretty cool. Now, I went through this recently, and not all the commands seem to be working, at least on my system. I don't know why, what they're going through. Maybe I missed a step.
Starting point is 00:11:57 But even if the commands don't all exactly work, I think it still is a pretty good guide, and it's on GitHub, so if you find mistakes, open an issue. It's not super popular, but I think a lot of people will find it helpful if they're trying to develop in something closer to their actual configuration. Okay. Is this more for a development environment sort of thing? Yeah, exactly. Because normally what you do is you get something like Waitress or you get some other built-in wimpy, non-production, full Python-based web server.
Starting point is 00:12:31 And then all the requests go through your Flask or Pyramid or whatever. But in production, it doesn't work that way. You have Nginx doing SSL and doing static assets, and then it only passes the Python requests over. There's quite a bit of difference. So if you want to have sort of a QA local thing on your Mac, here you go. Oh, that's nice. Okay, well, I'll check this out.
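Editor's sketch, not from the guide discussed above: the split Michael describes (Nginx handles SSL and static assets, then passes only the Python requests along) means the Python side is just a WSGI callable. The minimal example below uses only the standard library; the module name hello.py, the routes, and the response text are all hypothetical. Assuming the file were saved as hello.py, a server such as Gunicorn could load it with something like `gunicorn --workers 2 hello:app`, with Nginx proxying to it.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """Minimal WSGI application: the part an app server would run behind a proxy."""
    path = environ.get("PATH_INFO", "/")
    if path == "/":
        status, body = "200 OK", b"Hello from the Python side\n"
    else:
        # Static assets would normally be answered by the front-end proxy,
        # so anything else that reaches the app is treated as unknown here.
        status, body = "404 Not Found", b"Not found\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]


def call_app(path):
    """Drive the app directly, without any server, and return (status, body)."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the remaining required WSGI keys
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body
```

The `call_app` helper is only there to show that the callable can be exercised without Nginx or Gunicorn in the picture, which is roughly why a local copy of the production stack is a separate setup step.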
Starting point is 00:12:50 What do you got for the next one? When I started using command line interface stuff, I used Click. So I did a quick search for different ways to build a command line interface. Meaning, like if I, not an interactive thing, although I know that those are a thing, but just if I want to, if I've got a script that I'm writing and I want to take values in,
Starting point is 00:13:11 it's parsing the arguments passed into the script or whatever. Click is great. I love it, but it's something extra. It's not built into Python yet. It's an extra dependency. That's fine for a lot of times, but sometimes you just want, like, a couple different parameters that somebody could pass in. And so the built-in argparse is part of the standard library, and that would be good to use. Yeah, it's cool. And it's better than just going to, you know, sys.argv and going after an array of whatever, right? You get a little more help than that. Right. I actually didn't realize that argparse was so easy. So there's, um,
Starting point is 00:13:44 Jeff Hale put together an article called Learn Enough Python to Be Useful: argparse. That's a kind of long title, but it's a tutorial on argparse. And it's that, "I couldn't find a good intro guide for argparse when I needed one. So I wrote this article." And I got to say, it's a very nice, quick introduction. So if you want to throw some arguments on an application, try this. It definitely seems like a good idea. There's certainly times where I'm like, you know, I just don't feel like going to all the formality. I just need to know, basically, did they pass some simple thing here?
Starting point is 00:14:16 Yes or no. Right. And this is cool, especially for my little proof of concept ideas. Yeah. Yeah. So it comes built in, which is, yeah, really nice. Nothing to install. So you don't want to necessarily take on click
Starting point is 00:14:28 and make people go through virtual environments and pip and all that. If literally this is the only, that would be the only thing, right? So here's another good chance to not do that. Oh, yeah, that's a great idea. As if you don't already have any other dependencies than, yeah, using this.
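Editor's sketch, not taken from Jeff Hale's article: a small illustration of the "couple different parameters" case Brian describes, using only the standard-library argparse module. The option names (--name, --count, --shout) and the greeting text are made up for the example.

```python
import argparse


def build_parser():
    """Describe the script's options; argparse generates --help from this."""
    parser = argparse.ArgumentParser(description="Print a greeting.")
    parser.add_argument("--name", default="world", help="who to greet")
    parser.add_argument("--count", type=int, default=1, help="how many times to greet")
    parser.add_argument("--shout", action="store_true", help="upper-case the greeting")
    return parser


def greet(args):
    """Turn the parsed arguments into the lines the script would print."""
    message = f"Hello, {args.name}!"
    if args.shout:
        message = message.upper()
    return [message] * args.count


if __name__ == "__main__":
    # parse_args() with no arguments reads the options from sys.argv
    for line in greet(build_parser().parse_args()):
        print(line)
```

Compared with indexing into sys.argv by hand, this gets type conversion, defaults, and a generated --help message with nothing extra to install. Passing an explicit list, as in `build_parser().parse_args(["--name", "Brian", "--shout"])`, is a convenient way to try it out without a shell.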
Starting point is 00:14:42 Exactly, because that'll definitely increase the challenge of using your code. All right, so you got to go first with Travis CI. I have another one that's a little bit of a sticky situation here, but I think it's really interesting to dig into. Okay. This is actually from January 14th, actually. Let me look at the AWS blog and see when they published it. January 9th. So this is something I've been wanting to talk about for a little while, but it's involved. There's a lot of moving parts and a lot of layers to this. So I didn't want to just give it like a super peripheral sort of skim the surface type of thing. So here's the deal. I'm talking about an article by a guy named
Starting point is 00:15:20 Ben Thompson who runs Stratechery, which is a really great resource for understanding the business side of software. So super, super interesting work that he's doing there. And this one's called AWS, MongoDB, and the Economic Realities of Open Source. Yeah. And he also runs a podcast called Exponent. And Exponent is sort of the audio side.
Starting point is 00:15:42 He does this with another guy named James. And I'm linking to an hour-long episode he did on this called Inverted Pyramids, which talks about demand and building platforms and things like that. So really interesting interplay between open source, cloud computing, and so on. But it focused on this disagreement at a business level between MongoDB and AWS. Okay, so MongoDB, they make databases, right? They make a document database, and they have kind of a unique API, and they are definitely the most popular
Starting point is 00:16:11 of the document databases. Regardless of whether you like MongoDB or not, certainly they are well-known and well-used in that space, right? In the DocumentDB space. If you're not doing that, you're probably doing Postgres in the Python world. But AWS wanted to have a MongoDB as a service type of offering, like they have RDS and other managed database options, right? They want to have basically a MongoDB compatible one. But MongoDB, the company, their business model is basically three things. They sell an enterprise version of the on-premise software.
Starting point is 00:16:48 That's kind of unrelated to this. They have a free community version. And then they sell this thing called Atlas DB, which is MongoDB as a service. And what you do is you go to MongoDB Atlas and you say, I'd like to run this on AWS, or I'd like to run it on Azure or on Google's cloud. So that's MongoDB's business model. AWS just wants just to have a service. So MongoDB, who owns the IP to like the licensing of MongoDB, the server, changed it to something called AGPL. Have you heard of AGPL? Well, I think, but I don't remember what it is.
Starting point is 00:17:22 You've heard of GPL, right? Basically, if you use this software in yours, then you must make yours open source. That's true when you directly depend on it. But what if it's hosted and managed by Amazon and you just talk to it over the network? You're not interacting with it. You're not using it, right? Well, AGPL is like GPL plus network. So if you access this software over the network, it's like the GPL applies to you.
Starting point is 00:17:47 Oh, weird. Isn't that weird? So basically what that means is AWS cannot have a hosted MongoDB because if they do, it triggers this AGPL and like AWS becomes open source or something crazy like this, right? Like the consequences are too high
Starting point is 00:18:02 and like Apple's banned it from the app store. Google has banned it from all the AG, any AGPL software basically is not allowed within Google. It's really not something these cloud companies are interested in having interactions with, right? Because it takes the GPL bit and doesn't talk about shipping stuff, but accessing over the network. So that brings us to the interesting part. So what does AWS do? Not just go, okay, great. We keep running Atlas. No, AWS says today, they said this on January 9th. Today, we are launching Amazon DocumentDB with MongoDB compatibility, a fast, scalable, and highly available database that's designed to be compatible with your existing MongoDB applications and tools. It's purpose-built.
Starting point is 00:18:47 It has all this replication, et cetera, et cetera. And it's all for your production scale MongoDB workloads. So they took the exact API of MongoDB and rebuilt a brand new server and service from it, from scratch. Okay, interesting. They didn't do it on the latest one because the latest one has the AGPL. So they went back to 3.6. The latest one is like 4.05 or something like that.
Starting point is 00:19:05 So they went back to the latest one, the newest one that doesn't have this restriction. They mimicked that, which should be good enough actually for most people, I would guess. Yeah. Weird. Is that interesting? You change your license so we can't have it, so we literally copy your API byte for byte on the wire identical and rebuild a new service from scratch on it. Okay. Both sides are kind of being sneaky and whatever.
Starting point is 00:19:30 Yes. I feel a little more kinship towards the MongoDB Inc. side because they developed this whole thing from scratch and they built it up and they got it popular. But yeah, it's definitely an interesting tit and tat. But what is the AGPL to all of Mongo then? What if I build my website? Yeah, the community edition is separate for like your self-hosting somehow.
Starting point is 00:19:51 But there's like clauses about accessing it over the network. Basically, it's, you know, very much like the GDPR is like mostly built to fight Google and Facebook. Right? This, my understanding is like basically this is to fight the cloud providers. Not meant to interact with regular people just running the community version. Okay, but regular people are running on clouds too. I know, it's quite interesting.
Starting point is 00:20:12 I just would not want to pick a fight with one of these big players, but whatever. Yeah, so I encourage people to both, maybe first listen to the podcast and then read the article if you want to go all in. But you can skim the article. You can't really skim the podcast.
Starting point is 00:20:29 I know some people do like 2.5x, but those people are crazy. It melts my brain. I can't do it. I do 1.3. Yeah, that's a pretty good balance. There's layer upon layer. Maybe I'll come to the conclusion real quick. So it says,
Starting point is 00:20:39 thus we have arrived at a conundrum for open source companies. MongoDB leveraged open source to gain mindshare, right? Started as open source. MongoDB Inc. built a successful company selling additional tools for enterprises to run MongoDB. But more and more enterprises
Starting point is 00:20:54 don't want to run software, easier or not. They want to hire AWS or Microsoft or Google or some other cloud provider to run it for them because they value performance and scalability and all that. So it leaves MongoDB pretty much like a little bit outside the value chain, which is interesting. And there's just a lot of interesting trade-offs
Starting point is 00:21:13 for open source VC-funded companies and traditional monetization strategies are looking harder and harder in the face of cloud computing companies. They just go, that's a cool API, we'll do that. I told you, lots of layers. I don't know, it takes a lot of pondering. I really don't know where I am.
Starting point is 00:21:30 But I think it's a big deal, honestly. It is a big deal. For the rest of us, the topic is important. Document databases are kind of amazing, and I think that we need to push those forward. And having legal structures trying to get in the way of making this just better for everybody.
Starting point is 00:21:48 I know that everybody that started it should be able to make money, but also it needs to move forward. So yeah, there's lots of sides to this. There are a lot of angles and a lot of sides, and Ben does a super job covering it. So anyway, link to that if that sounds interesting. Check it out. It's not exactly the newest of news, but I think it's big news still in the open source world. Okay.
Starting point is 00:22:09 All right. You got some extra stuff for us this week? Just that I'm so terrible with names. But the people that do the Teaching Python podcast, I just released an episode of Test & Code where I interviewed them. Yeah, I saw that come out. That's great. Yeah. So that was cool.
Starting point is 00:22:24 Do you have any extras? I have a couple of extras, just mostly follow-ups and some news. So the folks running PyTexas in Austin, which sounds like a really fun place to be if you couldn't come to PyCascades in Seattle, PyTexas 2019 is going to be April 13th and 14th. The registration is open
Starting point is 00:22:42 and I'm linking to their page and stuff. It looks pretty cool. That's one. The other is the article we covered from Anthony Shaw last week. Apparently it was like two years old and somebody had sent it to us. So it felt like, oh yeah, here's a new article from Anthony. So we covered it as if, without any caveats. So Anthony Shaw, sorry, we dug up your old, your very far past predictions of the future and then presented them from today, but they're still good. I thought it was quite a good article and it stands the test of time pretty well. And finally, remember when I talked about RustPython and I said, the reason this makes me super, super happy, Brian, is that it enables this WebAssembly future of Python,
Starting point is 00:23:20 because Rust has a very strong WebAssembly story. Well, on that episode page, one of the core developers posted a comment, said, thanks for covering this. By the way, rustpython.github.io slash demo. Yeah, that's it running in the web. The CPython built on Rust. Well, Rust Python running the web under WebAssembly. So if you open that bad boy up, and boom, it's up and running.
Starting point is 00:23:48 You can go and type stuff. There's not much of a standard library. That's something we already covered previously, so it's kind of mostly a language level thing. But here it is. Two megabytes WebAssembly binaries running in your browser. It's pretty cool. That is really pretty cool. Yeah.
Starting point is 00:24:03 All right, so that's all of it for my extra stuff as well. Okay, you got a joke for me? People are enjoying this joke segment. I'm enjoying the joke segment. I do not have one. Okay, I got two. I'll see what you think of these. Okay. Okay. Why was the developer unhappy at their job? I don't know, why? They wanted arrays. A-R-R-A-Y-S, arrays. That's pretty bad. That's lists for all the Python people. Yes. Where did the parallel function wash its hands? I'm not sure where.
Starting point is 00:24:30 In async. A-S. Was there a line or did it have to wait? That's pretty good. Yeah, these are bad. These are pretty bad, but here they are. No, it's good. I have kids, so I'll retell these. They won't understand it, but that's all right, you know? But they'll be in, like, the style in which they should appreciate. The jokes just won't make sense. Yeah. Anyway, awesome. All right, well, welcome back and thanks for doing the
Starting point is 00:24:59 show with me today. Yeah, thank you. You bet. Thank you for listening to Python Bytes. Follow the show on Twitter via at Python Bytes. That's Python Bytes as in B-Y-T-E-S. And get the full show notes at PythonBytes.fm. If you have a news item you want featured, just visit PythonBytes.fm and send it our way. We're always on the lookout for sharing something cool. On behalf of myself and Brian Okken, this is Michael Kennedy.
Starting point is 00:25:21 Thank you for listening and sharing this podcast with your friends and colleagues.
