Python Bytes - #120 AWS, MongoDB, and the Economic Realities of Open Source and more
Episode Date: March 5, 2019
Topics covered in this episode:
[0:53] The Ultimate Guide To Memorable Tech Talks
[3:56] Running Flask on Kubernetes
[10:51] Python server setup for macOS 🍎
[12:52] Learn Eno...ugh Python to be Useful: argparse
[14:56] AWS, MongoDB, and the Economic Realities of Open Source
Extras
Joke
See the full show notes for this episode on the website at pythonbytes.fm/120
Transcript
Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.
This is episode 120, recorded February 28th, 2019.
I'm Michael Kennedy.
And I'm Brian Okken.
And this episode is sponsored by DigitalOcean.
Check them out at pythonbytes.fm slash DigitalOcean.
Lots of good stuff with that URL, but more about that later.
Brian, welcome back to the show, man.
Oh, thanks. Nice episode last week.
Thanks. Yeah,
we went to Seattle and we all had fun and sadly you were knocked out, but understandably. Man,
I sent it to everybody I knew, my daughter and a whole bunch of people and said, at least listen
to the first bit of it because this is why I enjoy working with Michael. It was so nice of you to do
a shout out to me. That was nice. Thanks. Yeah, that's really great. Yeah, you're welcome. It's
the least I could do. Wish you were there, but that's okay. Speaking of people who were
almost there, someone who was at the conference, who I got to spend a lot of time with, but who wasn't
actually at that live recording: Nina Zakharenko. How about that? That's the source of
our first item, right? She is. Yes. How could I resist? She put together this wonderful resource
for people. So what I'm talking about is The Ultimate Guide to
Memorable Tech Talks. She said it was going to be a blog post and it turned into what
she calls a book, but it's seven articles on Medium. And it's a seven-part series where she
covers things like choosing a topic, writing a talk proposal, tools that she uses, planning,
writing your talk out, practicing it,
and delivering the talk. And it's a phenomenal resource. I'm not done with it, actually. I've
read up through the tools section, and I'm looking forward to the rest of it. But one of the things,
I mean, it speaks right to me because I got into podcasting partly to get better about this public
speaking thing, and it's difficult. And especially for, I think,
tool nerds and introverts and stuff like that, jumping into the tools and in the little details,
you can lose sight of stuff. So one of the things she noticed, and this is a quote from the article, is,
I noticed I'd procrastinate on making the slides look good instead of focusing my time on making
quality content. It's easy to get sucked into that, yeah.
Yeah, totally.
When I should have been, like, I've given two talks
and many, many talks informally at work and stuff,
but two conferences.
And I spent so much time just tweaking the different slides
instead of practicing it
and making sure I understand it well enough
to even throw the slides away and just do the talk.
And so I think focusing on planning and leaving all that time enough to get all that.
This is actually the series that I wish I had when I started getting into trying to do talks.
Because with what's out there, there's either stuff that doesn't have enough information,
it's just a quick gloss-over,
or it focuses on one aspect. Or,
on the other end, you've got courses and books. This is just a series of blog posts that you can,
you know, staple together with one staple. So yeah, it's good. Not the big, heavy, super-duper, like,
binder staple, but like a regular one, right? Like the red one from Office Space. Yeah, yeah, just one
of those. Yeah. Now, this is a great series, and nice work, Nina.
Good pick, Brian.
I think it'll help a lot of people.
I like that it's writing,
you know, she covers writing the talk proposal
because it's great to get into public speaking,
but you have to get accepted
to get into public speaking a lot of times.
Although, although, at the bigger conferences,
say PyCon, you can always do an open space,
and that's kind of like a dip your toe in the water
because you don't have to lead the whole conversation
but you kind of are sort of leading it
in spirit. Yeah, and then you can go through
all of these steps for
just to get ready and you only have to
be up there for, I don't know how long the lightning talks
are, but they're not long. Yeah, for sure.
Yeah, lightning talks are five minutes. Open
spaces I think are 25, but you don't
do the talking the whole time. We're definitely
going to, you know, as you spoke about the live recording,
we're definitely going to do more live recordings at the main PyCon
as some open spaces, if nothing else.
Yeah, definitely.
I wanted to interview a bunch of people for stuff.
Bringing the mics.
Okay, super.
So the next one is something I've been digging into lately.
Have you done anything with Kubernetes?
No.
Let me take a step back.
Have you done anything with Docker?
No.
So these two things are sort of stacked on top of each other,
and they often get complicated.
So Docker is a little bit of a complicated thing
if you haven't done a lot of deployment and DevOps-type stuff
because it's all about configuring Linux-isolated machines, if you will.
And then, in order to actually use them properly, really,
with scale-out and failover and clusters and connections between them
and all that, like a
Docker container for your web app, one for the database, you really need some kind of orchestration,
which is Kubernetes. So there's a ton to learn about all this stuff. Luckily, Michael Herman
from testdriven.io, formerly of Real Python, did a really nice write-up here. He has a course on testdriven.io, and this is like an extracted, long example tutorial to get started with Flask on Kubernetes.
Wow, that actually sounds neat.
Yeah, it's actually pretty approachable.
I mean, I said it's a complicated topic.
He really lays out the core ideas.
And if you ask Pocket, it says reading time is 16 minutes.
So it's pretty hefty, but it's not book-sized.
It's not super, super long.
Of course, the little steps that it says take more than just one line or whatever.
You've got to let it configure a cluster or something.
So there's little wait periods, but really nice.
It talks about how to basically get a Vue.js front-end, Flask back-end app that talks to a database, Postgres in particular, up and running on a Kubernetes cluster.
So kind of microservice style, which is pretty cool.
And I'll just run off some of the goals so you guys kind of know what you'll get out of it.
So explain what a container and container orchestration is,
pros and cons of using Kubernetes,
over, say, things like Docker Swarm,
all the primitive concepts of Kubernetes, node, pod, service, deployment, etc.
Spinning up a Python app, which is nice.
Locally with Docker Compose,
there's a nice little utility called Minikube,
which lets you
basically in just a couple of command
lines create a
Kubernetes cluster pre-configured on
your machine, which is a super pain.
So that's really nice that you can just
install that thing and have it go, and stuff like that.
It's really quite nice.
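For readers following along with the primitives listed above (pod, service, deployment), here is a minimal sketch of what a Deployment and a Service look like as data. It is not taken from the tutorial: the app name, image reference, and port numbers are hypothetical placeholders, and the manifest is built in Python and printed as JSON only because kubectl accepts JSON as well as YAML.

```python
import json

# Hypothetical names for illustration only; none of these come from the tutorial.
APP_NAME = "flask-demo"
IMAGE = "example.invalid/flask-demo:latest"
CONTAINER_PORT = 5000  # assumed port the Flask container listens on


def deployment(replicas=2):
    """Describe a Deployment: keep `replicas` pods of this container running."""
    labels = {"app": APP_NAME}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": APP_NAME},
        "spec": {
            "replicas": replicas,
            # The selector must match the labels on the pod template below.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": APP_NAME,
                            "image": IMAGE,
                            "ports": [{"containerPort": CONTAINER_PORT}],
                        }
                    ]
                },
            },
        },
    }


def service():
    """Describe a Service: one stable address in front of the matching pods."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": APP_NAME},
        "spec": {
            "selector": {"app": APP_NAME},
            "ports": [{"port": 80, "targetPort": CONTAINER_PORT}],
        },
    }


if __name__ == "__main__":
    manifest = {"apiVersion": "v1", "kind": "List", "items": [deployment(), service()]}
    print(json.dumps(manifest, indent=2))
```

Assuming a local cluster such as the Minikube one mentioned above is running, the printed JSON could be piped into `kubectl apply -f -`. The point of the sketch is only that a Deployment says how many copies of a container to run, and a Service gives those copies one address.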
Oh, I'm definitely going to check this out.
Plus, like in 16 minutes, I can now put like 75 new keywords on my resume.
Oh, yeah.
This is dense in buzzword bingo hits.
Yeah.
Nice.
You and I both picked something that's a little bit on the controversial side of the world of open source and the community. You go first. Okay. So this is just, I think it's still playing out, but
at the end of January, maybe they announced it earlier, but on the Travis CI blog, they announced
that Travis CI is to join Idera. Idera? I don't know. Basically, they got bought by a company.
And frankly, I don't really care who owns what.
But then in February, I started seeing the hashtag TravisAlums on Twitter.
And, I mean, from the outside looking in, it looks like they're laying off a bunch of engineers.
I don't really know what's going on because they're not really talking about it. There hasn't been anything else on the GitHub or the Travis CI blog to say what's going on.
So I've used Travis for running tests on Python projects. So I thought, you know,
I've wanted to do this anyway, so now's a good time to start looking: if I've got source code
on GitHub, and I want to run tests somewhere, and it isn't Travis, where would I do it?
And so right now I'm trying to use GitLab and Azure pipelines with the help of some
other people to try to get those running on a couple of projects.
But there's a lot more.
I mean, GitHub lists 17 different options for continuous integration that you can hook
up.
Yeah, that's a big deal, right? Like Travis has been kind of the go-to,
plug it into your open source public GitHub repo
and just let it churn away and do all the checks on PRs
and things like that, right?
They claim that our experience is not going to be different.
However, with fewer engineers,
it seems like it might stagnate and stuff.
Yeah, something that you also talked about here is Azure pipelines, Azure build pipelines,
and that definitely seems like it's getting a lot of traction.
I've heard of some major open source projects moving over to that, either partially or entirely.
I'm dropping a few links in here.
One of the links is a couple things from Anthony Shaw.
Of course, it wouldn't be a show without Anthony. One of them is an article that he wrote about Azure Pipelines with Python, by example.
And also he's got a pytest plugin to help with testing on Azure.
And then one of the people that's been helping me out is, I'm going to get his name wrong,
but Anthony Sottile. Sottile?
Sorry, Anthony.
But he's got a whole bunch of different
Azure templates on his GitHub
repo that look neat.
Yeah, cool. And as far as
I understand it, Azure Pipelines are free
for public repos and things like that.
So it's interesting. All the things
I'm going to try are things that have
reasonable free
levels for open source projects.
That's kind of an interesting trend that's been going on.
There's a lot of things that are free for public open source repos, but paid for private
stuff.
And that seemed like a pretty good balance to me.
Having GitHub go and allow private repos is interesting.
And I'm not sure what drove the Travis change, but yeah, anyway.
Yeah, I'm pretty sure GitHub was pressured by Bitbucket,
but I don't know anything about the roots of the Travis one.
I do have more good news around DigitalOcean for you
and all the listeners.
So they've traditionally had these high compute instances
called droplets.
They're virtual machines.
And they've had memory-heavy ones, but the memory-heavy
ones didn't have dedicated CPUs; they were shared with the VM host, right? So you
couldn't be guaranteed consistent workloads on that thing. So now they're announcing a general
purpose droplet that is a blend of dedicated CPUs and a wide range of memory configurations.
So basically you can get quite a bit of RAM.
I think the lowest one is 8 gigs and it goes up to either 64 or 128 gigs of RAM
and a whole bunch of dedicated CPUs.
So it's pretty cool and they talk about
some good examples of using it would be
like web applications,
hosting, like, an e-commerce site where latency matters,
or a medium-sized
relational Postgres or NoSQL, like MongoDB, database, because you want your database to be fast, right?
Like, that's the heart of your app, even if you scale the web front end and other sorts of analytics
and so on. So you can check that out. It's under limited availability, but you can go and, like,
click a button and say, I want to try this. So check them out at pythonbytes.fm slash digitalocean.
It really supports the show.
It keeps us going each week.
Yes.
Very cool.
Yeah, thanks.
So speaking of infrastructure,
it's cool to be able to set up your code
and your websites and all of your infrastructure
to run on Linux.
And you might even do that on Docker
and you might even put those Docker containers on Kubernetes.
But a lot of us are not using Linux as our dev machine, right?
A lot of us are using Macs, at least the ones you'll see walking around the conferences.
So this next item is called a Python server setup for macOS.
That's a cool little Apple emoji.
Nice.
Yeah.
So if you want to run, like, your Nginx and your uWSGI or your Gunicorn, or like
your production software stack,
but you want to run it locally on macOS,
here's a little guide on how to do that.
So it basically takes you through setting up Nginx,
having Nginx serve your static assets.
They're using Gunicorn and Flask,
but you could pretty easily swap that out for uWSGI
and whatever else you want to use.
And then having Nginx be the front-end proxy
for a scaled-out Gunicorn backend.
So it's pretty cool.
Now, I went through this recently,
and not all the commands seem to be working,
at least on my system.
I don't know why, what they're going through.
Maybe I missed a step.
But even if the commands don't all exactly work,
I think it still is a pretty good guide,
and it's on GitHub,
so if you find mistakes, open an issue. It's not super popular, but I think a lot of people will find it helpful if
they're trying to develop in something closer to their actual configuration.
Okay. Is this more for a development environment sort of thing?
Yeah, exactly. Because normally what you do is you get something like Waitress, or you get some other built-in, wimpy, non-production, fully Python-based web server.
And then all the requests go through your Flask or Pyramid or whatever.
But in production, it doesn't work that way.
You have Nginx doing SSL and doing static assets, and then it only passes the Python requests over.
There's quite a bit of difference.
So if you want to have sort of a QA local thing on your Mac,
here you go.
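As a rough illustration of the split described above, here is a minimal sketch of the part that the Python side handles: a WSGI callable, served here by the standard library's development server. The function name `app`, the greeting text, and the port are hypothetical choices, not anything from the guide. In the production-like setup, a server such as Gunicorn would import this same callable, and Nginx would sit in front of it for SSL and static assets.

```python
from wsgiref.simple_server import make_server


def app(environ, start_response):
    """A tiny WSGI application: only the dynamic, Python-handled requests."""
    path = environ.get("PATH_INFO", "/")
    body = f"Hello from Python, you asked for {path}\n".encode("utf-8")
    start_response(
        "200 OK",
        [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]


if __name__ == "__main__":
    # Development only: a pure-Python server with no SSL and no static-file
    # handling, which is exactly the gap a front-end proxy fills in production.
    with make_server("127.0.0.1", 8000, app) as httpd:
        httpd.serve_forever()
```

Assuming the file were saved as `hello.py`, something like `gunicorn hello:app` would serve the same callable, and the Nginx configuration from the guide would then proxy to it.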
Oh, that's nice.
Okay, well, I'll check this out.
What do you got for the next one?
When I started using command line interface stuff,
I used Click.
So I did a quick search for different ways
to build a command line interface.
Meaning, like if I, not an interactive thing,
although I know that those are a thing, but just if I want to,
if I've got a script that I'm writing and I want to take values in,
it's parsing the arguments passed into the script or whatever.
Click is great. I love it, but it's something extra.
It's not built into Python yet. It's an extra dependency.
That's fine for a lot of times,
but sometimes you just want, like, a couple of different parameters that somebody could pass in. And so the built-in
argparse is part of the standard library, and that would be good to use. Yeah, it's cool, and it's better
than just going to, you know, sys.argv and going after an array of whatever, right? You get a little
more help than that. Right. I actually didn't realize that argparse was so easy. So Jeff Hale
put together an article called Learn Enough Python to Be Useful: argparse. That's a kind of long title,
but it's a tutorial on argparse. And he says, I couldn't find a good intro guide for argparse
when I needed one, so I wrote this article. And I got to say, it's a very nice, quick introduction.
So if you want to throw some arguments on an application, try this.
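For a sense of how little code that takes, here is a minimal sketch using only the standard library's argparse. The script's purpose and its option names (`name`, `--times`, `--shout`) are made up for illustration and are not from the article.

```python
import argparse


def build_parser():
    """Declare the arguments once; argparse generates --help and error messages."""
    parser = argparse.ArgumentParser(description="Greet someone from the command line.")
    parser.add_argument("name", help="who to greet")
    parser.add_argument("-n", "--times", type=int, default=1,
                        help="how many times to greet (default: 1)")
    parser.add_argument("--shout", action="store_true",
                        help="upper-case the greeting")
    return parser


def greet(argv=None):
    """Parse argv (sys.argv[1:] when None) and return the lines to print."""
    args = build_parser().parse_args(argv)
    line = f"Hello, {args.name}!"
    if args.shout:
        line = line.upper()
    return [line] * args.times


if __name__ == "__main__":
    for line in greet():
        print(line)
```

Compared with indexing into sys.argv by hand, the parser handles type conversion, defaults, and a usage message, with nothing extra to install.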
It definitely seems like a good idea.
There's certainly times where I'm like, you know, I just don't feel like going to all the formality.
I just need to know, basically, did they pass some simple thing here?
Yes or no.
Right.
And this is cool, especially for my little proof of concept ideas.
Yeah.
Yeah.
So it comes built in, which is, yeah, really nice.
Nothing to install.
So you don't want to necessarily take on click
and make people go through virtual environments
and pip and all that.
If literally this is the only,
that would be the only thing, right?
So here's another good chance to not do that.
Oh, yeah, that's a great idea.
If you don't already have any other dependencies,
then yeah, use this.
Exactly, because that'll definitely increase
the challenge of using your code. All right, so you got to go first with Travis CI. I have another
one that's a little bit of a sticky situation here, but I think it's really interesting to
dig into. Okay. This is actually from January 14th, actually. Let me look at the AWS blog and
see when they published it. January 9th. So this is something I've been
wanting to talk about for a little while, but it's involved. There's a lot of moving parts
and a lot of layers to this. So I didn't want to just give it like a super peripheral sort of
skim the surface type of thing. So here's the deal. I'm talking about an article by a guy named
Ben Thompson who runs Stratechery, which is a really great resource for understanding the
business side of software.
So super, super interesting work that he's doing there.
And this one's called AWS, MongoDB,
and the Economic Realities of Open Source.
Yeah.
And he also runs a podcast called Exponent.
And Exponent is sort of the audio side.
He does this with another guy named James.
And I'm linking to an hour-long episode he did on this called Inverted Pyramids, which talks about demand and building platforms
and things like that. So really interesting interplay between open source, cloud computing,
and so on. But it focused on this disagreement at a business level between MongoDB and AWS.
Okay, so MongoDB, they make databases, right?
They make a document database,
and they have kind of a unique API,
and they are definitely the most popular
of the document databases.
Regardless of whether you like MongoDB or not,
certainly they are well-known and well-used in that space, right?
In the document DB space.
If you're not doing that,
you're probably doing Postgres in the Python world. But AWS wanted to have a MongoDB as a service type of offering,
like they have RDS and other managed database options, right? They want to have basically a
MongoDB compatible one. But MongoDB, the company, their business model is basically three things. They sell an enterprise version of the on-premise software.
That's kind of unrelated to this.
They have a free community version.
And then they sell this thing called Atlas, which is MongoDB as a service.
And what you do is you go to MongoDB Atlas and you say, I'd like to run this on AWS,
or I'd like to run it on Azure or on Google's cloud.
So that's MongoDB's business model. AWS just wants to have a service. So MongoDB,
who owns the IP to like the licensing of MongoDB, the server, changed it to something called
AGPL. Have you heard of AGPL? Well, I think, but I don't remember what it is.
You've heard of GPL, right? Basically, if you use this software in yours, then you must make yours open source.
That's true when you directly depend on it.
But what if it's hosted and managed by Amazon and you just talk to it over the network?
You're not interacting with it.
You're not using it, right?
Well, AGPL is like GPL plus network.
So if you access this software over the network,
it's like the GPL applies to you.
Oh, weird.
Isn't that weird?
So basically what that means is
AWS cannot have a hosted MongoDB
because if they do, it triggers this AGPL
and like AWS becomes open source
or something crazy like this, right?
Like the consequences are too high
and like Apple's banned it from the app store.
Google has banned it; any AGPL software basically is not allowed within Google. It's really not something these cloud companies are interested
in having interactions with, right? Because it takes the GPL bit and doesn't talk about shipping
stuff, but accessing over the network. So that brings us to the interesting part. So what does AWS do? Not
just go, okay, great. We keep running Atlas. No, AWS says today, they said this on January 9th.
Today, we are launching Amazon DocumentDB with MongoDB compatibility, a fast, scalable,
and highly available database that's designed to be compatible with your existing MongoDB
applications and tools. It's purpose-built.
It has all this replication, et cetera, et cetera.
And it's all for your production scale MongoDB workloads.
So they took the exact API of MongoDB and rebuilt a brand new server and service from it, from scratch.
Okay, interesting.
They didn't do it on the latest one
because the latest one has the AGPL.
So they went back to 3.6.
The latest one is like 4.0.5 or something like that.
So they went back to the latest one, the newest one that doesn't have this restriction.
They mimicked that, which should be good enough actually for most people, I would guess.
Yeah.
Weird.
Is that interesting?
You change your license so we can't have it, so we literally copy your API byte for byte on the wire identical and rebuild a new service from scratch on it.
Okay.
Both sides are kind of being sneaky and whatever.
Yes.
I feel a little more kinship towards the MongoDB Inc. side
because they developed this whole thing from scratch
and they built it up and they got it popular.
But yeah, it's definitely an interesting tit for tat.
But what is the AGPL to all of Mongo then?
What if I build my website?
Yeah, the community edition is separate for like your self-hosting somehow.
But there's like clauses about accessing it over the network.
Basically, it's, you know, very much like the GDPR is like mostly built to fight Google and Facebook.
Right? My understanding is, basically, this is to fight the cloud providers,
not meant to interact with regular people just running the community version.
Okay, but regular people are running on clouds too.
I know, it's quite interesting.
I just would not want to pick a fight
with one of these big players, but whatever.
Yeah, so I encourage people to both,
maybe first listen to the podcast
and then read the article if you want to go all in,
but you can skim the article. You can't really skim the podcast.
I know some people do like 2.5x,
but those people are crazy.
It melts my brain.
I can't do it.
I do 1.3.
Yeah, that's a pretty good balance.
There's layer upon layer.
Maybe I'll come to the conclusion real quick. So it says,
thus we have arrived at a conundrum
for open source companies.
MongoDB leveraged open source
to gain mindshare, right?
Started as open source.
MongoDB Inc. built a successful company
selling additional tools for enterprises to run MongoDB.
But more and more enterprises
don't want to run software, easier or not.
They want to hire AWS or Microsoft or Google
or some other cloud provider to run it for them
because they value performance and scalability and all that.
So it leaves MongoDB pretty much like
a little bit outside the value chain,
which is interesting.
And there's just a lot of interesting trade-offs
for open source VC-funded companies
and traditional monetization strategies
are looking harder and harder
in the face of cloud computing companies.
They just go, that's a cool API, we'll do that.
I told you, lots of layers.
I don't know, it takes a lot of pondering.
I really don't know where I am.
But I think it's a big deal, honestly.
It is a big deal.
For the rest of us, the topic is important.
Document databases are kind of amazing,
and I think that we need to push those forward.
And having legal structures trying to get in the way
of making this just
better for everybody.
I know that everybody that started it should be able to make money, but also it needs to
move forward.
So yeah, there's lots of sides to this.
There are a lot of angles and a lot of sides, and Ben does a super job covering it.
So anyway, link to that if that sounds interesting.
Check it out.
It's not exactly the newest of news, but I think it's big news still in the open source world.
Okay.
All right.
You got some extra stuff for us this week?
Just that I'm so terrible with names.
But the people that do the Teaching Python podcast, I just released an episode of Test & Code where I interviewed them.
Yeah, I saw that come out.
That's great.
Yeah.
So that was cool.
Do you have any extras?
I have a couple of extras,
just mostly follow-ups and some news.
So the folks running PyTexas in Austin,
which sounds like a really fun place to be
if you couldn't come to PyCascades in Seattle,
PyTexas 2019 is going to be April 13th and 14th.
The registration is open
and I'm linking to their page and stuff. It
looks pretty cool. That's one. The other is the article we covered from Anthony Shaw last week; apparently it was like two years old and somebody had sent it to us. So it felt like, oh yeah,
here's a new article from Anthony. So we covered it as if, without any caveats. So Anthony Shaw,
sorry, we dug up your old, your very far past predictions of the future and then presented them from today,
but they're still good. I thought it was quite a good article and it stands the test of time
pretty well. And finally, remember when I talked about RustPython and I said, the reason this
makes me super, super happy, Brian, is that it enables this WebAssembly future of Python,
because Rust has a very strong WebAssembly story. Well, on that episode page, one of the core developers
posted a comment, said, thanks for covering this.
By the way, rustpython.github.io slash demo.
Yeah, that's it running in the web.
The CPython built on Rust.
Well, RustPython running on the web under WebAssembly.
So if you open that bad boy up,
and boom, it's up and running.
You can go and type stuff. There's not much of a
standard library. That's something we already covered
previously, so it's kind of mostly
a language level thing. But here
it is. Two megabytes
WebAssembly binaries running in your browser.
It's pretty cool. That is really
pretty cool. Yeah.
Alright, so that's all of it for
my extra stuff as well. Okay, you got a joke for me? People are enjoying this joke segment. I'm enjoying
the joke segment. I do not have one. Okay, I've got two. I'll see what you think of these. Okay. Okay, why was
the developer unhappy at their job? I don't know, why? They wanted arrays. A-R-R-A-Y-S, arrays. That's
pretty bad. That's lists for all the Python people.
Yes.
Where did the parallel function wash its hands?
I'm not sure where.
In async. A-S.
Was there a line or did it have to wait?
That's pretty good.
Yeah, these are bad. These are pretty bad, but here they are.
No, it's good.
I have kids, so I'll retell these. They won't understand them, but that's all right. You
know, they'll be in, like, the style which they should appreciate. The jokes just
won't make sense. Yeah. Anyway, awesome. All right,
well, welcome back, and thanks for doing the
show with me today. Yeah, thank you. You bet. Thank you for listening to Python Bytes. Follow the show
on Twitter via at Python Bytes.
That's Python Bytes as in B-Y-T-E-S.
And get the full show notes at PythonBytes.fm.
If you have a news item you want featured,
just visit PythonBytes.fm and send it our way.
We're always on the lookout for sharing something cool. On behalf of myself and Brian Okken,
this is Michael Kennedy.
Thank you for listening and sharing this podcast
with your friends and colleagues.