The Changelog: Software Development, Open Source - The WebSocket protocol (Interview)
Episode Date: August 9, 2010
Wynn and Micheil sat down with Peter Griess from Yahoo Mail, Martyn Loughran from Pusher App, and Guillermo Rauch from Socket.IO to talk about WebSockets...
Transcript
Welcome to The Changelog, episode 0.3.1. I'm Adam Stacoviak.
And I'm Wynn Netherland. This is The Changelog. We cover what's fresh and new in the world of open source.
If you found us on iTunes, we're also on the web at thechangelog.com.
Also head to github.com/explore.
You'll find some trending repos, some featured repos from our blog, and the audio podcasts.
If you're on Twitter, follow changelogshow, not thechangelog.
And I'm adamstac.
And I'm pengwynn, P-E-N-G-W-Y-N-N.
Fun episode this week talking WebSockets with some experts in the area: Peter over at Yahoo,
Martyn from Pusher App,
and Guillermo from Socket.IO,
along with guest host
Micheil Smith from way down under again.
Yeah, that's a nice lineup there.
Yeah, Micheil's got a project, node-websocket-server,
that is an implementation
of WebSockets on the server side, so I guess we should mention
what WebSockets are.
Yeah, what is it?
It's a persistent connection
between the browser and the server
so that you can do server push down to the browser
and they can just open that long-running connection
and it's more two-way bidirectional communication
between client and server
without having to do long polling or AJAX techniques
like we currently do.
Right, so the idea is to move away from the AJAX piece of it,
speed it up, and be more native?
Yeah.
Especially as more and more apps are poised to go real-time.
This is just the ever-evolving landscape of web development.
Cool.
Fun episode this week.
Should we get to it?
Let's do it.
All right, we're joined today by Peter, Martyn, Guillermo, and Micheil Smith to talk about HTML5 WebSockets.
Before we dive right in, let's go around the horn and each of you introduce yourself, kind of what you do in the landscape of WebSockets, where you work, and your role there.
Peter, let's start with you.
So I'm Peter Griess. I work at Yahoo. I'm a principal engineer.
I work on mail. We're looking at using WebSockets for a couple of different things. We don't have
anything in production yet because obviously WebSockets is pretty new, but we're looking at
using it for adding different real-time features to mail, message notifications, other things like
that, and also doing some experiments with using it to accelerate attachment uploads.
All right, Martyn?
Hi there. I work at a company called New Bamboo.
We're a Ruby shop in London.
So as well as client projects, we've got a few products that we're working on at the moment.
One of those is called Pusher App.
The idea of Pusher App is to make it really simple for people to push events to
browsers. We have an API which you push events to, and those are then sent
to browsers via a pub/sub model. We've used this internally on a few of our client projects
and our other products, and it's been working really nicely.
I should say, we use WebSockets to implement Pusher App.
Guillermo?
Hello.
My name is Guillermo Rauch. I'm the CTO of LearnBoost, an education startup in San Francisco.
And I created Socket.IO, which is two projects. One is a client that provides a WebSocket-like API
in the browser, which basically gives you WebSocket and a bunch of different
transports, in the same way that jQuery provides $.ajax and gives you XMLHttpRequest for standards-compliant browsers
and ActiveXObject for Internet Explorer.
I do the same thing for many different browsers and many different transports.
But the developer thinks it's a WebSocket-like API.
On the server side, I created Socket.io-node,
which is an implementation of all these different types of requests
so that you also develop as if you were receiving data from a socket.
Perfect.
So regular listeners of the ChangeLog will know that we normally cover
projects on the podcast,
but sometimes we take a step back and cover broader topics.
And we'll dive into some of your projects in just a moment.
But, Micheil, why don't you give an overview for those that might not know what WebSockets are and why they should get us excited?
Okay, sure.
WebSockets are a new piece of technology that is currently falling under the umbrella term of HTML5.
They're basically a way to get bidirectional communication between your web browser and
a server, and there's no need for constantly opening up new connections or things like
that.
So very fast, very real-time, quite similar to almost having a TCP socket,
although there's a bit more to it than that.
Currently we've got browser support
in Safari and Chrome. Firefox 4 support
is coming, I've heard, and the same goes for
an internal build of Opera.
I haven't heard anything on Internet Explorer support,
but we can only hope.
So speaking of support for additional browsers,
I guess both in Socket.IO and Pusher App,
you guys are doing some fallback techniques for older browsers.
Martyn, why don't you speak for a moment
about what you guys are doing in Pusher App for those?
Right.
We actually use a library which...
Sorry, I'll start again. Right. We actually use a library that is called... I need to look it up, actually. Sorry. Feel free to go first on Socket.io.
Yeah, Guillermo, why don't you jump over and take that.
Sure. The way that Socket.io works on the client, it uses feature detection for deciding what transport to use.
So if the WebSocket constructor is there, it will, of course, use WebSocket.
And on the server side, Node will trigger an upgrade event based on this handshake that is produced.
And, of course, the communication will happen normally like any other WebSocket server.
However, like Micheil said,
there is limited support for WebSocket today,
and so we have to resort to other transports.
An example is called htmlfile,
which is an iframe that is inserted into an ActiveX object
so that the spinner in the browser is not triggered when fetching data from an iframe.
So this is all done transparently by Socket.io,
and this technique was actually discovered by or made popular by the Gmail chat engineers
a few years back.
And that's the sort of thing that I thought I would solve for you.
So how does that differ from Pusher App, Martyn?
Yeah, so what we do on Pusher app is we use a library called WebSocket.js,
which uses a Flash socket to emulate a WebSocket effectively. It connects to the WebSocket server using all the same handshakes as a real browser-initiated WebSocket,
and it exposes the same API in JavaScript.
So we sense whether WebSocket is available at the browser level.
If not, we use WebSocket.js.
And what we also do is we first initiate a non-secure,
a non-TLS WebSocket connection.
That fails for a large number of proxies, intermediary proxies,
which we'll probably come on to in the future.
And we fall back to a secure WebSocket in those cases.
And we'll put this in the show notes, but it looks like WebSocket
JS is another open source project
like Socket.io. Yes, yes it is.
Peter?
Go ahead.
Yeah, actually
Socket.io also uses Flash
Socket if Flash is available.
So like I said, using feature detection,
I can know if the
client has Flash installed
and ready to use, and I pick that one.
So Socket.io has also a
priority list based on how
bidirectional the transports
are. So it'll try with WebSocket
first, it'll try with Flash
second, which might fail
if the client is behind
a proxy, because the WebSocket protocol draft
specifies using the HTTP CONNECT method to get through proxies.
Flash can't do this, because the user agent doesn't give Flash
the proxy's authentication information.
So the Flash-based WebSocket.js will fail behind proxies.
In that case, Socket.io will fall back to other transports like long polling, HTML file,
which have higher latency, and that's why they are lower on the list of priority.
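A minimal sketch of the priority-list idea Guillermo describes here, written in TypeScript. This is not Socket.IO's actual code: the transport names, the `Capabilities` flags, and the `pickTransport` function are all hypothetical, and the relative order of the last two fallbacks is an assumption. A real client would discover a proxy failure by trying the connection and failing, rather than knowing about it up front.

```typescript
// Hypothetical transport names, loosely following the ones mentioned in the conversation.
type Transport = "websocket" | "flashsocket" | "htmlfile" | "xhr-polling";

// Hypothetical stand-ins for real feature detection in the browser.
interface Capabilities {
  hasWebSocket: boolean; // e.g. typeof WebSocket !== "undefined"
  hasFlash: boolean;     // Flash plugin installed and ready to use
  hasActiveX: boolean;   // needed for the htmlfile technique in Internet Explorer
  behindProxy: boolean;  // assumption: we somehow know a proxy is in the way
}

// Ordered from most bidirectional (lowest latency) to least.
const PRIORITY: ReadonlyArray<{ name: Transport; usable: (c: Capabilities) => boolean }> = [
  { name: "websocket", usable: (c) => c.hasWebSocket },
  { name: "flashsocket", usable: (c) => c.hasFlash && !c.behindProxy },
  { name: "htmlfile", usable: (c) => c.hasActiveX },
  { name: "xhr-polling", usable: () => true }, // last resort, always available
];

// Walk the list and return the first transport the client can use.
function pickTransport(caps: Capabilities): Transport {
  for (const candidate of PRIORITY) {
    if (candidate.usable(caps)) return candidate.name;
  }
  return "xhr-polling";
}
```

Keeping the list as data rather than nested conditionals makes it easy to reorder or extend when browser support changes.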
So from an architecture standpoint, how would WebSockets differ from something like
traditional long polling?
Well, this is what essentially Socket.IO solves. Those two methods
of communication are really different. With one, you know that the socket will be open,
and you have three events: connect, disconnect, and message. With long polling, you essentially have many disconnections on the request side.
So there is a chance that the server might try to send a message to the client
while the client is temporarily disconnected, or between reconnections.
So a long polling request is closed.
The server tries to send a message before the client opens another one.
So that's another thing that Socket.io does.
It buffers messages that are sent between these disconnections by the client. And when the client
reconnects, it sends him a buffer, a chunk of the messages sent while he was temporarily disconnected,
which might be a couple of milliseconds or, I don't know, depending on the client's connection,
it can be a long time.
Can I ask you a quick question, Guillermo? How do you manage to, if you're going to scale this
and you need more than one Node.js server,
do you have to make sure that the request is sticky
so the reconnection comes back to the same process
in order for that buffering to work, or how do you do that?
Exactly. Yeah, for now, it's a single process.
Of course, you can put a message queue or a Redis server in front of it,
and you can make it scale to different nodes in terms of, like,
since the information you deal with will be in one process,
scaling takes a little more work, but it's definitely possible.
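The buffering between long-polling reconnections described above could look roughly like the following TypeScript sketch. The `ClientBuffer` class and its method names are hypothetical and are not taken from Socket.IO. It also assumes a single process holding the buffer in memory, which is exactly why sticky routing comes up in the question.

```typescript
// Hypothetical per-client buffer: holds messages while the client has no open request.
class ClientBuffer<T> {
  private connected = false;
  private pending: T[] = [];
  private deliver: (batch: T[]) => void;

  // deliver is whatever actually writes a batch of messages to the open request.
  constructor(deliver: (batch: T[]) => void) {
    this.deliver = deliver;
  }

  // Send immediately if a request is open, otherwise queue for later.
  send(message: T): void {
    if (this.connected) {
      this.deliver([message]);
    } else {
      this.pending.push(message);
    }
  }

  // The long-polling request closed; start queueing.
  onDisconnect(): void {
    this.connected = false;
  }

  // The client opened a new request; flush everything queued as one chunk.
  onReconnect(): void {
    this.connected = true;
    if (this.pending.length > 0) {
      const batch = this.pending;
      this.pending = [];
      this.deliver(batch);
    }
  }
}
```

A production version would also need to cap the buffer size and expire clients that never come back, which this sketch leaves out.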
So another thing you could do is use an HTTP load balancer to direct the request to a particular server instance,
either using any of the headers that are in there
or inspecting other properties of the request.
Typically, this is harder to do on, you know,
if there are multiple processes on a single box.
But if your router is smart enough or if you have enough routing smarts in the manager on the box itself,
you can do that without actually needing to have a separate data store like a Redis or whatever.
Yeah, we're actually using a load balancer called HAProxy for Pusher App, which we find works extremely well.
We're using it in Layer 4 mode, but
it allows you
to do a lot of things like sticky session support,
that kind of thing.
If you were to do
load balancing
within, say, Node, how would
you go about doing it? Would you still
use that load balancing server,
or would you use some other technique?
You might want to answer that, Peter.
Sure.
So there are a couple of different ways of doing it within Node.
A lot of the frameworks that exist today,
like Connect or Multinode,
are both good at distributing incoming connections
among a bunch of different
processes. For these guys, they don't have any support for stickiness at all. So any incoming
request has a relatively equal chance of being served by any of the processes. So that doesn't
really get you what you want. What you can do instead is accept all connections in one process,
read part of the request, enough of the request to know
which process should be serving it,
and then go and send the socket
and the part of your request that you saw
off to the right worker.
I have a blog post up on how to do this
that can probably go out in the show notes or something,
but you can use that technique.
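As a rough illustration of the routing decision Peter describes (read just enough of the request to decide which worker should own it), here is a TypeScript sketch. It is not the code from his blog post. The `sid` query parameter, `extractSessionId`, and `pickWorker` are hypothetical names, and the hash is an FNV-1a-style hash over UTF-16 code units chosen only for simplicity. Actually handing the socket to another process is left out.

```typescript
// Pull a session identifier out of the first bytes of an HTTP request.
// Assumption: the client sends it as a "sid" query parameter.
function extractSessionId(requestHead: string): string | null {
  const match = /[?&]sid=([^&\s]+)/.exec(requestHead);
  return match && match[1] !== undefined ? match[1] : null;
}

// FNV-1a-style 32-bit hash over UTF-16 code units (non-cryptographic).
function hashString(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// Map a session to a worker index so reconnections land on the same process.
function pickWorker(sessionId: string, workerCount: number): number {
  if (!Number.isInteger(workerCount) || workerCount <= 0) {
    throw new RangeError("workerCount must be a positive integer");
  }
  return hashString(sessionId) % workerCount;
}
```

Because the mapping depends only on the session identifier and the worker count, every reconnection for the same session reaches the same worker as long as the number of workers does not change.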
Is there anything intrinsically browser-dependent as far as the client side?
Because I know XHR really took off.
It made AJAX possible, right?
But I've seen that same technique, asynchronous calls in iPhone applications that are native applications.
Would this be something that someday may be used in a native mobile device?
Okay, currently there is support in, well, sort of support in iPhone libraries.
There's, I think, two projects that give you the headers required to include WebSockets within your iPhone app.
Although I don't think they're currently supported.
I think they were drafted for iPhone 4, but they didn't make it in in time.
Someone might have more information on that.
As for the browser side, the main thing that needs to be done is for the browsers to, A, implement the protocol,
and then make sure that they actually speak the protocol
with the server in the proper ways.
So Guillermo and Martin,
what types of applications are you guys seeing being built with Socket.IO and with Pusher?
I've heard of a couple different ones.
There are some projects that build on top of Socket.io to give you APIs to build different things more easily.
Because essentially, Socket.io only gives you the Socket API.
So you need to do a little more to build an application.
Although you can build a thin protocol based on JSON,
pass JSON messages, and have a chat application
like the example that ships with Socket.io.
A really interesting one is called dnode,
which does asynchronous remote method invocation.
This was created by the StackVM guys,
and it's built on top of Socket.io,
and it's a good base for building applications.
I've also seen a chat application with video enabled by Flash
and avatars that move on the screen.
I've seen an Asteroids game built with Socket.io.
And recently I heard of someone trying to build a drawing application
which was passing many, many messages by many people at the same time.
Socket.io used to rely on JSON for doing message buffering.
So it would send you an array of messages and it would be parsed by JSON.
That turned out to be, like, very CPU-intensive, and it's been since removed in 0.5, which was released this week.
So, today, it's suitable for many different applications, from games to chat applications
or tying your data model
to making your data alive on browser
with something like Dnode.
From my point of view,
the way Pusher App actually came about
was that we had an application called True Story,
which is a collaborative application
to manage an Agile backlog.
What we wanted was that
you could edit stories in one browser
and the changes
would be reflected in another browser.
You could drag and drop, reorder, change sprints,
that kind of thing.
And so we actually, that's one of the reasons
that Pusher exposes a kind of event-binding API.
So in the browser where the story was being changed,
an update call would go to your Rails application or whatever,
and that would send a story-updated event on the channel
to all the subscribers,
all the people who are viewing the backlog in their browsers.
So we've seen some people do collaborative applications like that.
The other thing we've seen a lot of on Pusher
is people who are just using Pusher
because it's so easy to send data out to browsers.
So real-time Twitter feeds, just real-time information.
Group Dashpon is one example
where real-time purchases on Groupon
are displayed on Google Maps, for example.
Another example we've got is another drawing application
where users can draw pictures on their iPad,
and those drawings are shown in real time on the web.
That's called WebPad.
Martyn, talk a moment about channels in Pusher,
and how many channels would I have in an application? Is my app a channel, or
would I have multiple channels in my app?
It very much depends, actually, on the application. For
example, the application I spoke about, Group Dashpon, I believe has a single channel,
which is Groupon purchases. So everybody who's viewing that web page would be
subscribed to that channel.
So there could be potentially hundreds of users subscribed to one channel
and pushing information out efficiently to all of those users via a single API call.
In the TrueStory collaborative application, there might be a single channel per backlog. So you might, in your web application,
you have domain objects which you want to share their state,
share state on those domain objects with other users.
So there may be, I don't know,
10 people using the application at the same time,
two of them viewing each backlog.
So there would be a channel for each of those.
But typically we're seeing it's not that there is...
In most cases, we don't have a single channel per user.
It's a channel per object which people collaborate on
or are interested in.
Okay, so I should also note that the channels that Martin is speaking of,
they're not actually built into the WebSocket protocol,
but rather a layer on top of them,
which I think you're still using URL-based routing or something?
No, what we do, we started with that approach, you're right.
The way we do it at the moment is that once the WebSocket is connected,
the JavaScript sends a JSON event.
I mean, it's just JSON, but it has an event name,
which is pusher:subscribe,
and the name of the channel.
And then internally in the socket server, we then subscribe to the queue that publishes those events.
And so, yes, you're absolutely right.
Channels are an abstraction, which we've added on top of WebSockets. And the other abstraction we've added is the idea of being able to trigger events
and have those events then triggered in JavaScript.
So it's just a matter of saying pusher.bind with the event name
and then the anonymous function which you want to be executed.
But you're absolutely right.
These are things that we've added on
because our applications, they all needed that kind of thing.
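To make the two abstractions Martyn mentions more concrete (channels, and binding handlers to named events), here is a small in-memory TypeScript sketch. It is not Pusher's client library or wire protocol: `Channel`, `ChannelRouter`, `dispatch`, and the `{channel, event, data}` frame shape are all hypothetical, chosen only to show how such a layer can sit on top of a plain message pipe like a WebSocket.

```typescript
type Handler = (data: unknown) => void;

// A named channel that keeps a list of handlers per event name.
class Channel {
  private handlers = new Map<string, Handler[]>();

  // Register a callback to run when the named event arrives.
  bind(eventName: string, handler: Handler): void {
    const list = this.handlers.get(eventName) ?? [];
    list.push(handler);
    this.handlers.set(eventName, list);
  }

  // Run every handler bound to this event name.
  dispatch(eventName: string, data: unknown): void {
    for (const handler of this.handlers.get(eventName) ?? []) {
      handler(data);
    }
  }
}

// Routes raw frames from the socket to the right channel.
class ChannelRouter {
  private channels = new Map<string, Channel>();

  // Return the existing channel or create it on first subscription.
  subscribe(name: string): Channel {
    let channel = this.channels.get(name);
    if (!channel) {
      channel = new Channel();
      this.channels.set(name, channel);
    }
    return channel;
  }

  // Assumed frame shape: {"channel": "...", "event": "...", "data": ...}
  handleFrame(frame: string): void {
    const message = JSON.parse(frame) as { channel: string; event: string; data: unknown };
    // Frames for channels nobody subscribed to are silently ignored.
    this.channels.get(message.channel)?.dispatch(message.event, message.data);
  }
}
```

In a browser, `handleFrame` would typically be called from the socket's message handler, and a backlog page would call something like `router.subscribe("backlog-1").bind("story-updated", redraw)`, where `router` and `redraw` are again illustrative names.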
Okay, so going off the idea of triggering events and things like that,
there's also a new HTML5 protocol, which is called EventSource,
which allows you to trigger events on the browser from the server.
I'm not sure if it's bidirectional. Peter,
would you have any more information on that?
I wish I did.
The way I understand
it is event source is
pretty much a one-way web
socket. That's the idea.
From a JavaScript API
point of view, that's what you get.
So you can only receive events from the server.
You can't push events to the server.
Right, exactly.
So the way that you send messages to the server is actual, like, normal AJAX.
It's very similar to the multipart flag on the XMLHttpRequest object,
which is only supported by Firefox.
That is also implemented by Socket.IO, and
that gives you a one-way WebSocket, so to speak: a connection that is always open and pushing
parts of messages.
In that respect, EventSource is very similar to it.
And it's my understanding that it's only implemented in Opera so far.
I'm not sure.
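For readers who have not seen EventSource on the wire: in the Server-Sent Events format as it was later standardized, the server keeps one HTTP response open and writes plain-text blocks made of `event:` and `data:` lines, each block ended by a blank line. The TypeScript sketch below formats one such block. The function name `formatServerSentEvent` is hypothetical, and the draft the speakers had in mind in 2010 may have differed in details.

```typescript
// Format one server-sent event block: an "event:" line, one "data:" line
// per line of payload, then a blank line that terminates the event.
// Assumption: eventName contains no newlines (not validated here).
function formatServerSentEvent(eventName: string, data: string): string {
  const dataLines = data.split("\n").map((line) => `data: ${line}`);
  return [`event: ${eventName}`, ...dataLines, "", ""].join("\n");
}
```

On the browser side, a page would receive these with `new EventSource(url)` and `addEventListener(eventName, handler)`. Anything going from browser to server still travels as an ordinary request, which is the one-way limitation discussed here.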
Sometimes the technology comes along and it forces us to take a fresh look at how we solve some problems.
I know the NoSQL database movement has done that for me,
and that now when I model my data in the database, it really changes the way I look at the application as a whole.
Peter, talk about what WebSockets does
to how you architect applications at Yahoo.
Sure.
So what we're interested in with WebSockets is,
you know, once browsers actually support this thing,
it'll provide a first-class API that you can always use
and always expect to work,
as opposed to jumping through all the hoops
that Guillermo has done a great job of doing and building a library that can kind of handle
all the different browser use cases, proxy use cases, different performance and connection
limitations that different browsers have.
You know, there's kind of a whole world of stuff that you need to try to navigate with
the current set of ways that you can have this kind of full duplex communication.
You know, it is doable now, and Guillermo's stuff is, you know, kind of living proof of that.
But the promise of WebSockets is a unified API that you can expect to work,
at least in a small number of years once browser and proxy support is there for that.
Before we go around the horn and ask what's on each of your open source radars,
Michael, why don't you give a shout out for your own WebSocket server
and then kind of list some resources that the WebSocket noob, including myself,
could go and check out.
Okay, so I do actually run my own WebSocket server project, node-websocket-server. It's different from Guillermo's
in that rather than adding support
for all the backwards compatible transport methods,
it just gives you the WebSocket connections.
And then as for resources,
probably the best place to find out more about WebSockets
would have to be the protocol outline,
which is in the WHATWG working group.
Well, yeah, Web Apps working group,
which is sort of part of the W3C, but not really.
And they're the ones authoring the specification,
which is being led by Ian Hickson at the moment.
Then there's also a fair few other resources
that we'll link to in the show notes.
I don't have URLs offhand.
Or socket.io and pusherapp.com, right?
Well, this is the part of the episode
where we kind of turn it upside down and ask what's
on our guest's open source radars.
We'll start with you, Guillermo.
What open source projects have got you excited and that you want to go play with?
Well, I don't know if you guys have seen the Hummingbird demo for Node, which is basically
WebSocket and MongoDB for real-time analytics.
We're actually also users of MongoDB and we developed our own ORM.
And what we're hoping to release in the upcoming months is an easy way to build web applications
that have data displaying on the browser, which is updated all the time based on socket.io and
server push, and
of course, real-time.
Aside from that,
in general, I think
it's interesting to watch all the
Node.js-related projects
since Node makes it really easy to
build this kind of
real-time applications
and modules.
What about you, Martyn?
I think the thing that's really exciting me at the moment is Redis.
All the projects I've worked on recently I've used Redis in,
and it's just incredibly liberating to have a really fast atomic data store that I can share between multiple processes.
Another open source project
I should mention is that we're using em-websocket,
which is Ruby.
If you're interested in a Ruby EventMachine client,
then that's a great one to look at.
Peter?
A couple things.
So the Node.js YUI 3 bindings are, I think, really exciting
because they let you have this really rich set of tools
that you can run both on browser and on the server.
And for doing things that you would normally only really think about doing in a browser,
like you can take these really complex web applications written in YUI
and decide that you're going to render them statically on the server if the client has low bandwidth and you don't want to deal with downloading all the JavaScript or the client has a CPU that's not particularly strong.
And so you just want to hand it some static HTML and make life really easy for it.
So I think that provides a really kind of compelling platform for building user experiences that can handle a wide variety of
clients. Another thing that I'm starting to watch and is actually really new and I think is an
interesting fit for WebSockets is TeleHash. This is from Jeremie Miller, the creator of XMPP.
This is his distributed JSON routing protocol. It's really, really early going right now.
You know, there's only a basic protocol up
and a couple of really kind of bare bones implementations,
but it looks like a really kind of neat way
of shooting around data.
With all of these tools,
do you think the application landscape
for web developers is getting easier
or more complicated?
I mean, you know, some things are easier.
You know, in some sense,
the promise of WebSockets is it will become easy to build these real time full duplex pipes,
whereas now it is possible, but it's just hard. So, you know, there are, I guess, more choice
does make things difficult to some extent, but, you know, with great power comes great responsibility.
All right. Thanks, everyone, for joining us today. And we'll see you in cyberspace.