The Changelog: Software Development, Open Source - Twisted and Evented Programming in Python (Interview)
Episode Date: May 3, 2011
Kenneth and Wynn caught up with Glyph Lefkowitz from Twisted to talk about the project and evented programming in Python....
Transcript
Welcome to the Changelog episode 0.5.8. I'm Adam Stacoviak.
And I'm Wynn Netherland. This is the Changelog. We cover what's fresh and new in the open source world.
If you found us on iTunes, we're also on the web at thechangelog.com.
We're also on GitHub.
Head to github.com/explore.
You'll find some trending repos, some featured repos from this blog, as well as our audio podcasts.
If you're on Twitter, follow @changelogshow.
And me, @adamstac.
And I'm @pengwynn, P-E-N-G-W-Y-N-N.
Fun episode this week.
Talked to Glyph over at Twisted, the granddaddy of all evented, non-blocking,
I don't want to call it a web framework,
but a toolkit, as it were.
Big in the Python community,
so chock full of white space.
So we like Haml and Sass and white space, right?
Yeah.
Get a little of that.
Cheers.
Every time that we say Haml and Sass from now on,
Adam's going to mix in the clinking sound,
so bottoms up.
There you go.
So no jobs to read this week.
We do have some developments
around the sponsorship front
that we're just so excited about
that we will share
in coming weeks.
Yeah, it's a really exciting thing
actually for somebody else
as well as us.
As well as us.
Kenneth joined last minute
and helped us out with this Twisted episode.
I think you're going to enjoy it. Talked about non-blocking frameworks in general, but also
kind of the world of Python and really a rich history of this Twisted project, 10 years old.
Anybody using Tornado might also listen in as well. Yeah, I went into that and some of the, I guess,
distinctions between Twisted and Tornado.
I think it's wild how it contains a web server,
chat clients, and all this other fun stuff to do all this.
It's pretty wild.
Did you see the success pages?
It powers HipChat, which we use every day,
and also powers TweetDeck, which we're big fans of,
some other high-profile sites. But he probably gets the most play out of the fact that Lucasfilm is using it.
There you go.
Fun episode. Should we get to it?
Let's do it.
Chatting today with Glyph Lefkowitz from Twisted, Twisted Matrix Labs.
So, Glyph, why don't you give a quick introduction to who you are and a bit about Twisted.
Well, I'm the original founder of the Twisted project.
I write lots of code in Python, pretty much all of it open source.
And Twisted is an event-driven framework in Python.
It's a networking engine and has tons of utilities
for doing event-driven programming of various kinds,
lots and lots of protocol implementations.
I could do the whole list, but then we'd be here all day.
So, yeah, not too much to tell.
If you know what it is, you probably have used it
because it's one of the few things of its kind.
A little bit like EventMachine or Node, I guess.
Recently, there are more popular hip versions of this thing.
But Twisted's been around for 10 years.
So we were not the first, but a pretty early one.
What was the impetus for starting Twisted?
Oh, I love that question.
I actually gave a talk about that question recently.
The impetus for starting Twisted was
I was making a video game.
The story actually starts when I was eight,
but the first ten years or so are not very interesting.
Eventually, so I was writing this video game in Java,
and I eventually sort of reached the limits of the Java virtual machine,
especially because this was around 2000,
when there wasn't such a thing as NIO yet,
so I had thread per connection.
In fact, four threads per connection,
reader, writer, exceptions, and logic.
And as you might imagine, that got to be a big mess pretty fast.
So I rewrote it in Python, still a thread per connection thing.
And then I discovered the select module.
I was just kind of going through learning each module in the standard library,
and I had no idea what select meant.
So I read the documentation, and it didn't make any sense to me.
So I did a little prototype and then thought,
wow, this is like clearly the best way to do IO.
So it's so much less confusing.
Everything just happens in order.
And as I was implementing a game,
I really wanted things to happen in that kind of way because I wanted to do a simulation loop
that had discrete phases and discrete ticks and every
action that a user took was a discrete object. So I was like, great, I'm going to do this. I'm
going to switch everything over to using select. Now where are all of the network protocol
implementations? Because there's got to be a bunch of them, right? And I found, you know, Python has all these great libraries,
urllib and httplib and mimelib and imaplib,
but none of them work in an asynchronous context.
There was asyncore at the time,
but asyncore didn't actually do anything.
It just let you write your own stuff.
So I wrote my own event-driven core,
and I decided that it should be the one true way to write network protocols in an asynchronous way,
so that there'd be a common API that people could get a bunch of different protocol implementations
and have them all in the same process. Because I wanted to make my game accessible to web browsers,
which were at the time this new hip thing.
But all the other network clients that were popular then, Telnet clients, and I wanted
it to be able to deliver you email.
I wanted you to be able to check your email on the game server.
And now we actually have all that stuff in Twisted.
There's an IMAP implementation, POP, SMTP, DNS, HTTP, pretty much all the stuff that
I originally wanted to do
in that original game.
And the game is nowhere to be seen, though.
That project has become increasingly researchy,
and it's currently called Divmod Imaginary,
and if you Google around enough,
you can probably find the code for that.
But it's definitely not as mature or interesting
as Twisted itself.
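A minimal sketch of the select-based style described in this answer, using only Python's standard library. The socket pairs, the handler name, and the uppercase reply are illustrative assumptions, not code from Twisted or from the game. The point is that one thread asks select() which connections have data and then handles them one after another, in order.

```python
import select
import socket

# Two connected socket pairs stand in for two network clients.
client_a, server_a = socket.socketpair()
client_b, server_b = socket.socketpair()

events = []  # records what each handler saw, in the order handlers ran


def on_readable(conn):
    """Handle one readable socket to completion before returning to the loop."""
    data = conn.recv(1024)
    events.append(data)
    conn.sendall(data.upper())


def run_once(connections, timeout=1.0):
    """One tick of the loop: ask select() which sockets have data, then
    call the handler for each of them, one at a time, in a single thread."""
    readable, _, _ = select.select(connections, [], [], timeout)
    for conn in readable:
        on_readable(conn)
    return len(readable)


client_a.sendall(b"hello")
client_b.sendall(b"world")
handled = run_once([server_a, server_b])

reply_a = client_a.recv(1024)
reply_b = client_b.recv(1024)

for sock in (client_a, server_a, client_b, server_b):
    sock.close()
```

A real server would call run_once in a loop and also track writable sockets; this sketch only shows a single tick.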
Speaking of Google, you own Twisted on Google.
I was impressed.
Yeah, there's a couple of people that are close.
There's, I think, a humor site that has Twisted in its name somewhere.
But, yeah, we've been relentlessly and shamelessly self-promoting
for a really long time, and so Google likes us.
So you were mentioning all the different asynchronous libraries
that weren't available at the time for HTTP.
Do you have any thoughts on gevent and eventlet
and how those are, you know, their relationship with Twisted?
Sure. Well, so first of all, eventlet is great.
I love it when people bring up eventlet
because they so frequently... I hear it from some programmer
who used to use Twisted and is now using Eventlet
and they sheepishly admit the betrayal.
But Eventlet actually, the default hub for Eventlet
uses Twisted for network IO.
And that's pretty much exactly the situation we want to be in.
It's just the default choice for network IO.
And then Eventlet presents this API
that's different than what Twisted would natively present,
but you can still use all of the Twisted protocols,
presuming that you use that hub
and you don't switch to one of the other Eventlet hubs,
which I don't really understand the point
of some of the other Eventlet hubs
because one of the big things
that they tell you in their documentation
about which hub you should use is,
well, you could use the Twisted hub
or you could use the epoll hub
because it's more scalable.
But actually, you can just use Twisted's epoll support
and it's equally scalable.
So I think that there might be a communication issue there
that we might need to talk to their developers more often.
And gevent is like eventlet,
except it's got its own network I/O
and is totally incompatible with Twisted,
so it's not very interesting to me.
It kind of seems like just a step down
from what eventlet offers.
I realize that it's a little bit simpler, smaller.
But things like gevent and Eventlet
present this API,
which is sort of semi-synchronous.
The code that you write in Eventlet or gevent
is more or less indistinguishable
from the code you would write
if you were just writing a multi-threaded server.
You just write a protocol implementation that blocks,
and then transparently
in the background, it's made asynchronous, but you have to do all the same things that you would do.
You have to write synchronization logic. You have to make sure that you don't accidentally
context switch in the wrong place. So for certain types of applications, and to my mind for most
applications, but obviously my taste is a little bit biased here,
I think that for most applications,
something like Twisted is actually simpler because you don't have to kind of unravel the threads in your head
and go and inspect and make sure that nothing you're calling
eventually calls a socket function
because that'll cause a context switch that you might not be expecting.
With Twisted, it's all very straightforward.
You don't context switch until you return.
And so, it's very easy to figure out
when you're returning.
I've been wanting to get into Twisted for a long time. I just
haven't found the excuse to.
So,
I just wanted to point that out.
Nothing there, sorry.
Deep thoughts by Kenneth.
So, is HTTP
the, I guess, the primary protocol that people are using when they're using Twisted?
Oh, well, of course.
But, I mean, that's just because HTTP is the primary protocol that people are using when they're using the Internet.
That's twisted.web, right?
Yes, that's twisted.web.
And people who use Twisted
do tend to use HTTP and then something else.
I mean, there's obviously a lot of users
who will just use HTTP and write a web app,
especially in these heady days of Comet and WebSockets
where HTTP is an increasingly expanding thing
that actually is event-driven and two-way.
But HTTP in combination with DNS or with an email protocol
is a very common sort of thing people will do with Twisted.
Was that the primary protocol you had out of the box,
and how soon did the other protocols trail?
Actually, HTTP was not first at all.
I think it might have been third or fourth.
I can't remember if NNTP beat it out.
The idea was originally Twisted's main protocol
was really just a custom remote object access protocol
because there was a sort of desktop client for the game.
That protocol eventually became what is now Perspective Broker, which is in the twisted.spread
package.
And it's twisted.spread.pb.
And the idea was you wanted to just publish your objects for access over a network.
So PB was the native protocol of Twisted. And then all the other things were these kind of degenerate things like, oh,
well, okay, maybe you want to use a web browser, but that's not as good. Obviously, the PB
applications marketplace has not taken off to quite the degree that we expected, so HTTP has
become a much bigger part of Twisted's life. But that same idiom kind of pervades still, which is that every protocol
is just about publishing your objects in the network somehow. So HTTP is a little more popular,
but it's not really, it doesn't occupy a special position in Twisted's hierarchy.
And especially because given that Twisted is not a web framework, people often come to it
expecting something like Ruby on Rails,
but it's really nothing like that at all.
It's a lower level thing that's designed
that you would build something like a web framework on top of.
It's because it's not a web framework,
people who come to it and expect a web framework
are often disappointed and leave.
People who come to it expecting a toolkit
to do these kind of multi-protocol things
are very happy,
and that's what our community is largely made up of.
So you mentioned it's 10 years old,
and I know that we haven't had Mac Intel that long,
and you're on a Mac now,
so I'm assuming you weren't on a Mac
when this project started.
So you come from a Linux or a Windows Python background?
I'm definitely a polyglot.
I like to use every platform.
I think probably my history is primarily Linux.
A lot of Twisted's development was on Linux.
But actually, I believe at the time that I started the project,
I was using a public beta of Mac OS X server.
That's kind of what I remember having on my workstation at the time.
But yeah, I have lots of Linux servers.
I have several Macs.
I still play video games, so I run Windows on occasion.
And that's actually part of Twisted's philosophy as well.
It's partially
just a random comment on my own desire to
be portable
and be able to work anywhere, but also
it's...
We want you to be
able to write asynchronous code for Twisted.
The original vision was to be able to
write an asynchronous protocol
implementation and say, there, I've done it.
It's an asynchronous HTTP client.
You can use it from any Twisted program.
And part of the appeal there is you want to be able to write that on the server and the client, potentially.
So we have reactors for GUI toolkits of various kinds.
So it runs native in the Mac GUI.
It can run native in a Windows GUI. It can run in GTK on
Linux or in Qt, which I think works cross-platform. Yes. Well, obviously Qt does. I'm trying to
remember how long it actually does. But yeah, so you can use Twisted pretty much wherever.
And we try, I think there's a definite bias,
especially after many years of trying to fight
with the network stack on Windows,
that we definitely have a Unix-y bias
and do not like Windows very much.
But we try not to let that show through too much.
We do have support for IO completion ports,
which is a Windows-specific asynchronous networking API.
So are there any uses for Twisted other than network-driven programming at all?
Well, all programming these days is network programming, really.
The network is the computer.
Yeah.
It's sad that even though those guys were right,
they didn't really get to benefit too much from being right.
So Twisted has a lot of utilities like, for example, deferreds.
A deferred is a result that doesn't exist yet.
That is very useful in a network context
where you're going to make a call across
the network and the result might come back or might not. It's also useful in a GUI context,
though, because it lets you pop up a dialog and continue processing. You can have a function which
asks a user a question. And of course, if you're writing a web app, that's what you're doing half
the time anyway, right? You're just asking the user a question and waiting for their response to come back, and you just make it deferred, and then you can pop up five dialog boxes.
The user can answer them in any order,
and you can continue processing in the background
while they're doing that.
So there's that.
There's also some timing utilities.
One of my favorites is twisted.internet.task.LoopingCall,
which was originally designed for a voice over IP application. So a
networking application, in order to do real time, sending out an audio sample
every 10 milliseconds. So you can tell people about that the next time you hear that Python's not fast
enough for something. It can do real-time network audio processing.
And so that was originally what it was for,
but then we realized that we really wanted to have it be able to compensate for falling behind.
So the idea is it obviously can't be hard real-time
because, forget about Python,
even just being a user space process, not a kernel
process, on a Unix operating system, you can't really do hard real-time. So since it's soft real-time,
we want to make it so you get called every 10 milliseconds really reliably. But then
if some other thing is processing and delaying the main loop from executing your call again,
you get notified: okay, this is an exact multiple of 10 milliseconds that
you're being invoked at, but you've skipped six calls. So you've dropped six frames. And that's
useful for animation. You can use it in a Pygame game, which I've seen done. These are obviously
less popular uses for Twisted because it is a pretty big piece of code that contains lots and lots of network protocol implementation.
So most people who just discover it in the first place come to it because of the networking stuff that's in it.
But lots of people who learn to work with the event abstractions in Twisted and write other kinds of programs do use it for other stuff.
So Twisted is extremely performant compared to a lot of other options out there.
Can you get a little bit into the whole controversy behind when Tornado came out and how they built a whole – from my understanding is they built a whole framework that was unnecessary because twisted.web was already there, right?
But it didn't have a web framework built on top of it?
I suppose it depends who you ask as to whether it was necessary or not.
I think the most definitive answer to this was a couple of – so I did a big angry blog post when Tornado came out.
Largely because – not because they wrote it because people write stuff all the time.
It was more that the way they announced it was very strange and –
Revolutionary.
Well, it included a comment about Twisted, which was misleading and wrong and kind of weirdly passive-aggressive.
It said something about Twisted not meeting their requirements.
And I had never heard from them about their requirements.
I had no idea what their requirements were.
They didn't say what their requirements were.
They kind of implied that it wasn't fast enough,
but they didn't give any performance numbers or anything.
So I was a little miffed that they were bad-mouthing Twisted
in this way that made it impossible to respond and say,
no, it's not slow, no, it's not whatever.
Or even just to constructively respond and say,
oh, wow, it is slow. We should really fix that.
So Twisted, and I don't like to harp on the performance thing too much
because performance is an extremely complex question,
and especially when you get into a system like Twisted,
which allows you to integrate lots of different protocol implementations,
lots of different event sources all firing in the same main loop,
all sharing resources.
What you're doing depends very, very heavily
on how it's going to perform.
Or, well, I'm sorry.
How it's going to perform depends very, very heavily
on what exactly you're doing.
And one of the things that we have to counsel people
over and over again in various
support forums for Twisted is
write your code, run your code,
run a profiler,
see what the hotspots are.
Because people get very excited
about micro-benchmarks.
And then
they focus
to the exclusion of actually useful stuff,
especially like performance under scalability.
Like if you have a framework that can do 20 connections really, really fast
and process lots of responses and requests over those 20 connections,
but then when you scale it up to 1,000 it falls over,
is that better or worse than a framework that doesn't do those 20 connections terribly fast but maintains that performance on a totally consistent ramp
up to however many connections you want?
Personally, I tend to go for things that are the latter,
but regardless of what type of performance you're looking for,
your application will always use 10 times as much CPU as Twisted.
So when you write a Twisted app,
you will typically spend your time
optimizing things outside of Twisted.
And I know this because whenever people
start talking about performance,
I really, really want people to contribute
performance patches to help us make Twisted faster.
You can check out Twisted's performance
on speed.twistedmatrix.com.
But it's fast enough for so many things
that we get very few contributions in the performance area
because people come to it,
they spend a month complaining about performance
and doing these tiny little benchmarks
and trying to figure out if Twisted's going to be good enough.
And then they decide to use it.
And then it turns out that actually that was a huge waste of time
and all those benchmarks they were doing
are not actually measuring their app at all.
And when they go to optimize, it turns out,
oh, well, Postgres is 99% of the performance bottleneck.
We don't even notice Twisted.
It doesn't show up in any of our profiles.
So that's a typical performance story.
And of course, there are stories where, for example,
if you're doing voice over IP,
and you're trying to multiplex a thousand real-time audio streams,
then you start to notice the low-level networking stuff cropping up.
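The "write your code, run your code, run a profiler" advice can be tried with Python's built-in cProfile. This is only a sketch: fake_database_query and frame_bytes are hypothetical stand-ins for slow application work and cheap networking work, and the sleep time is arbitrary.

```python
import cProfile
import io
import pstats
import time


def fake_database_query():
    """Stand-in for slow application work (hypothetical)."""
    time.sleep(0.05)


def frame_bytes(payload):
    """Stand-in for cheap networking work (hypothetical)."""
    return len(payload).to_bytes(4, "big") + payload


def handle_request():
    fake_database_query()
    return frame_bytes(b"response")


profiler = cProfile.Profile()
profiler.enable()
for _ in range(5):
    handle_request()
profiler.disable()

# Sort by cumulative time so the real hotspots appear at the top.
report = io.StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.sort_stats("cumulative").print_stats(10)
print(report.getvalue())
```

In the printed report the simulated query accounts for nearly all of the time, which is the kind of result the Postgres anecdote describes.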
As far as Tornado specifically,
it seems to perform kind of to within an epsilon of Twisted.
There are a couple things it doesn't do.
It's a little bit faster,
so it's not really clear that
there's a huge win on one side or the other
but the most definitive argument
in the whole Tornado thing is after I wrote that
big angry blog post
a Twisted user came along
and just
wrote a patch, I think it was a fork
on GitHub if I recall correctly
that's where they're hosted
and he just took out all of the networking stuff from Tornado and replaced it with Twisted.
And the web framework API remained exactly identical.
And it was a patch that deleted like 8,000 lines or something.
And Tornado was functionally equivalent on top of Twisted, unless you were writing a
hook into their IO loop.
So quite the success stories
on the Twisted website,
TweetDeck, Justin.tv, HipChat,
which I use every day. Any of these
are you more proud of that
you're able to
enable someone else's success?
Well,
I'm proud of
all of them. I'm happy whenever anyone...
Well, it's got to be a geek's dream to power Lucas in some way, right?
Oh, yeah.
No, I guess if I had to pick one, it would have to be Lucasfilm.
I got a Christmas card from them once, and I was like, I have arrived.
Because, yeah, and they were really great and super gracious. The folks at Lucasfilm who actually did that and got us the success story,
obviously they're a big company and it's difficult to get something like that
out past the corporate communications people.
And I really appreciate that Dave Peticolas,
who's the guy listed there in the success story,
really worked to get us that success story and to get it on our website.
So if I had to choose one, that would be it.
But there are so many projects that have used it in some way.
And proud isn't even the right emotion in a way.
It's honored.
I'm honored that they chose the technology
that me and the Twisted team worked on so hard.
It's just a great validation of our efforts.
And there's some on there that if you look on,
we got another wiki page,
Projects Using Twisted.
Success stories are just the ones
where people could actually put together a little narrative
about why they're using it and what was good about it.
But another one that I'm really quite happy about was OpenStack, which was the collaboration
between NASA and Rackspace.
This is the open source web framework, or I guess cloud framework?
Yeah, it's a cloud computing thing.
And to be honest, given that it has the word cloud in the name,
I'm not even really sure what it does.
It's a set of APIs they build on top of the different virtualization layers.
So you can control a VMware stack as well as all the other different ones, I believe.
And it includes some other things and specs on top of that as well.
Storage, compute, the whole nine yards? Yeah, all of that.
It's really the some other
things where I get a little bit fuzzy, but
regardless of what it is, it's used by
NASA to control thousands of computers,
so that is cool. You must be a
hit at Christmas and Thanksgiving when you go home
and, you know, look, Mom, I'm doing
Lucas and NASA.
You know what?
My family is very diverse and eclectic.
My sister is an acoustic physicist.
Wow.
My other sister is a rock star.
It's hard to impress anybody in that family.
In fact, my father, I don't know if you've ever heard him speak,
but he was a keynote speaker at
PyCon and OSCON. So even if you just
restrict my family's achievements to
open source, I'm still kind of
not necessarily
at the top of the heap.
So yeah, but it's great to be
in a family like that.
So I was going to ask, who's your programming hero?
Is your dad in the list?
It might sound a little corny, but yeah. I always thought of my dad as my programming hero,
but I didn't actually know that much about what he did. He worked a lot on systems in the finance
sector. And I was a little kid, so I didn't really understand what he did. But I actually worked at a startup a couple of years ago with one of his coworkers.
And that experience was really interesting because apparently I write code very much like my father.
There are similarities between Twisted and some systems that he worked on.
And the more that I've learned about what his career was like and the kind of stuff that he did, the more that he's my programming hero.
The force is strong with this one.
Yeah, that's a metaphor.
It's a little uncomfortably close to home.
If you've ever met my dad, you know what I mean.
So what's coming up on your open source radar?
What new projects are you excited about?
You know, it's very hard to choose something.
I'm just...
In the world of open source,
I think what I'm really glad about
is that we are experiencing this massive renaissance.
It's hard to get excited about any single project
because every time I want to do something,
I just type a search into my web browser
and there's something that does something like what I want
in the open source space.
And then even in the relatively small niche of Twisted,
there's just tons of libraries
that people are writing every day.
I'm really kind of excited
about the minor renaissance
that Twisted is enjoying, too.
The last couple releases,
we've gotten out on time.
We've gotten new features.
For a while, development was slowing down a bit.
We had a lot of bugs to fix for a long time.
We transitioned from a process
that was a wild west,
kind of commit anything you want, to
everything has to be unit tested,
everything has to be documented,
and for a little while that slowed us down,
but now that we're reaping the benefits of having
done that,
you can actually see on twistedmatrix.com/highscores
the
review points that people are
accumulating.
And
if you click on that left arrow,
go back a couple months.
The 8-bit interface.
I'm glad you appreciate it.
The font was the first thing that went into that web app.
I can see.
The avatar is 8-bit.
I'm stuck in an 8-bit world.
So Glyph, one last question.
Is this your real name, or is your name like a symbol like Prince was,
where you just had to shorten it to Glyph to make it pronounceable?
No, Glyph as a handle predated the symbol.
When I started using Unix, I needed a short handle that was easy to type
because I had to type.
As I started off, when I started using Unix,
I was using a Mac OS 8 machine,
and so I needed to type my username all the time
because it wasn't implicit.
It wasn't part of my environment.
So every time I connected somewhere,
I had to type it.
So five letters was shorter than my real name.
But it is not my legal name,
and I don't talk about my legal name because it's kind of a
little in-joke on the open source community. My hypothesis is that nobody really reads licenses
or knows what license things are under. And this is validated by the fact that most people don't
know my real name. But for, I think, seven out of the ten
years that Twisted's been going, it was at the top
of every single file in the Twisted
repository. In the license
statement, it said, copyright, my real name.
So are you like the
why the lucky stiff of the Python community?
I'm not going to
randomly disappear from the internet one day.
Asynchronously.
Yes.
He'll promise to be back.
Well, if I did it asynchronously,
I would be the doctor
of the Python community.
But I can only hope to be
as witty and prolific as _why.
Indeed.
Well, thanks so much for joining us,
telling the world and all of our listeners about Twisted.
It's been out there for a while, but definitely good stuff.
And I wanted to get down with this project
just because it seems like every time we talk about Node.js,
it comes back to, oh, that's just like Twisted.
Well, thank you for the opportunity.
And if I might, just one last interjection,
since your co-host mentioned that he had no reason to get into Twisted,
he was looking for a reason to do it.
I would just like to leave your listeners with something that they might do.
Twisted includes a whole bunch of command line utilities
for running all of the servers that it includes.
So if you're the average sort of open source nerd
who runs a personal server,
you can replace all of your personal network infrastructure
with, instead of BIND, you can run twistd dns.
Instead of Apache, you can run twistd web.
And instead of Hybrid, you can run twistd words
--irc-port.
So you can pretty much any network service
that you're interested in playing around with,
you can start off by just typing one command line.
You don't need to write a whole ton of code to get into it.
Fantastic. Even SSH server, right, with twistd conch?
Yep, twistd conch.
SSH, of course, with the crypto,
you've got to generate some keys and do a little more.
So that's the reason I don't open up with that one.
But yeah, it is a functional replacement for OpenSSH.
It does authorized keys, authentication,
and everything. Now does that work well
as a client as well?
Yes, you can just run conch,
and it's more or less drag and drop,
or sorry, drop-in
replacement for the command line
SSH, except it outputs a couple of log
messages every so often.
So I'm curious, how does that
compare to Paramiko?
You can run a client in any Twisted server. That's the difference
between that and Paramiko.
Oh, because it's 100% Python? There's no dependencies at all, right?
Well, there's some C crypto, but the application logic is all Python.
The network I/O is all Twisted.
It doesn't use any special network I/O stuff.
It just reads the bytes and does some crypto on them.
Sounds good.
Sounds like you claimed Kenneth's upcoming weekend.
No, no, no.
Excellent.
Next month.
Cool.
Thanks again, Glyph.
We surely appreciate it.
And thanks again for the opportunity.