The Changelog: Software Development, Open Source - The WebSocket protocol (Interview)

Episode Date: August 9, 2010

Wynn and Micheil sat down with Peter Griess from Yahoo Mail, Martyn Loughran from Pusher App, and Guillermo Rauch from Socket.IO to talk about Websockets....

Transcript
Starting point is 00:00:00 Welcome to the Changelog, episode 0.3.1. I'm Adam Stachowiak. And I'm Wynn Netherland. This is the Changelog. We cover what's fresh and new in the world of open source. If you found us on iTunes, we're also on the web at thechangelog.com. Also head to github.com/explore. You'll find some trending repos, some featured repos from our blog, and the audio podcasts. If you're on Twitter, follow changelogshow, not thechangelog. And I'm adamstac. And I'm pengwynn, P-E-N-G-W-Y-N-N.
Starting point is 00:00:41 Fun episode this week talking WebSockets with some experts in the area. Peter over at Yahoo, Martyn from Pusher App, and Guillermo from Socket.io, along with guest host Micheil Smith from way down under again. Yeah, that's a nice lineup there. Yeah, Micheil's got a project, node-websocket-server, that is an implementation
Starting point is 00:01:00 of WebSocket server-side, so I guess we should mention what WebSockets are. Yeah, what is it? It's a persistent connection between the browser and the server so that you can do server push down to the browser and they can just open that long-running connection and it's more two-way bidirectional communication between client and server
Starting point is 00:01:16 without having to do long polling or AJAX techniques like we currently do. Right, so the idea is to move away from the AJAX piece of it, speed it up, and be more native? Yeah. Especially as more and more apps are poised to go real-time. This is just the ever-evolving landscape of web development. Cool.
Starting point is 00:01:35 Fun episode this week. Should we get to it? Let's do it. All right, we're joined today by Peter, Martyn, Guillermo, and Micheil Smith to talk about HTML5 WebSockets. Before we dive right in, let's go around the horn and each of you introduce yourself, kind of what you do in the landscape of WebSockets, where you work, and your role there. Peter, let's start with you. So I'm Peter Griess. I work at Yahoo. I'm a principal engineer. I work on Mail. We're looking at using WebSockets for a couple of different things. We don't have anything in production yet because obviously WebSockets is pretty new, but we're looking at
Starting point is 00:02:14 using it for adding different real-time features to Mail, message notifications, other things like that, and also doing some experiments with using it to accelerate attachment uploads. All right, Martyn? Hi there. I work at a company called New Bamboo. We're a Ruby shop in London. So as well as client projects, we've got a few products that we're working on at the moment. One of those is called Pusher App. And the idea of Pusher App is making it really simple for people to be able to push events to
Starting point is 00:02:47 browsers. We have an API, which you push events to, and those are then sent via a pub/sub model to browsers. We've used this internally on a few of our client projects and our other products, and it's been working really nicely. I should say WebSockets: we use WebSockets to implement Pusher App. Guillermo? Hello. My name is Guillermo Rauch. I'm the CTO of LearnBoost, an education startup in San Francisco.
Starting point is 00:03:19 And I created Socket.io, which is two projects. One is a client that provides a WebSocket-like API in the browser that basically gives you WebSocket and a bunch of other transports, in the same way that jQuery provides $.ajax and gives you XMLHttpRequest for standards-compliant browsers and ActiveXObject for Internet Explorer. I do the same thing for many different browsers and many different transports, but the developer just sees a WebSocket-like API. On the server side, I created Socket.IO-node, which is an implementation of all these different types of requests
Starting point is 00:04:09 so that you also develop as if you were receiving data from a socket. Perfect. So regular listeners of the Changelog will know that we normally cover projects on the podcast, but sometimes we take a step back and cover broader topics. And we'll dive into some of your projects in just a moment. But, Micheil, why don't you give an overview for those that might not know what WebSockets are and why they should get us excited? Okay, sure.
Starting point is 00:04:38 WebSockets are a new piece of technology that currently falls under the umbrella term of HTML5. They're basically a way to get bidirectional communication between your web browser and a server, and there's no need for constantly opening up new connections or things like that. So very fast, very real-time, quite similar to almost having a TCP socket, although there's a bit more to it than that. And currently we've got browser support in Safari and Chrome. Firefox
Starting point is 00:05:15 4 is coming, I've heard, and the same with an internal build of Opera. I haven't heard anything on Internet Explorer support, but we can only hope. So speaking of support for additional browsers, I guess both in Socket.io and Pusher App, you guys are doing some fallback techniques for older browsers. Martyn, why don't you speak for a moment
Starting point is 00:05:37 about what you guys are doing in Pusher App for those? Right. We actually use a library which... Sorry, I'll start again. Right. We actually use a library that is called... I need to look it up, actually. Sorry. Feel free to go first on Socket.io. Yeah, Guillermo, why don't you jump over and take that. Sure. The way that Socket.io works on the client, it uses feature detection for deciding what transport to use. So if the WebSocket constructor is there, it will, of course, use WebSocket. And on the server side, Node will trigger an upgrade event based on the handshake that is produced.
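To make the handshake Guillermo mentions a little more concrete, here is a minimal Python sketch. It is an illustration only: at the time of this episode the protocol was still a draft with a different handshake, and what is shown here is the accept-key computation from the later standardized version (RFC 6455). The function names `accept_key` and `is_upgrade_request` are made up for this example and are not part of Socket.io or Node.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_key(sec_websocket_key: str) -> str:
    """Return the Sec-WebSocket-Accept value for a client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def is_upgrade_request(headers: dict) -> bool:
    """Hypothetical helper: does this HTTP request ask to switch to WebSocket?"""
    lowered = {name.lower(): value for name, value in headers.items()}
    connection_tokens = [
        token.strip().lower() for token in lowered.get("connection", "").split(",")
    ]
    return lowered.get("upgrade", "").lower() == "websocket" and "upgrade" in connection_tokens
```

A server would check `is_upgrade_request` first and, if it holds, reply with a 101 response carrying the computed accept key before switching the connection over to WebSocket frames.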
Starting point is 00:06:22 And, of course, the communication will happen normally like any other WebSocket server. However, like Micheil said, there is limited support for WebSocket today, and so we have to resort to other transports. An example is called htmlfile, which is an iframe that is inserted into an ActiveX object so that the spinner in the browser is not triggered when fetching data from an iframe. So this is all done transparently by Socket.io,
Starting point is 00:06:56 and this technique was actually discovered by or made popular by the Gmail chat engineers a few years back. And that's the sort of thing that Socket.io will solve for you. So how does that differ from Pusher App, Martyn? Yeah, so what we do on Pusher App is we use a library called WebSocket.js, which uses a Flash socket to emulate a WebSocket effectively. It connects to the WebSocket server using all the same handshakes as a real browser-initiated WebSocket, and it exposes the same API in JavaScript. So we sense whether the WebSocket is available at the browser level.
Starting point is 00:07:43 If not, we use WebSocket.js. And what we also do is we first initiate a non-secure, non-TLS WebSocket connection. That fails for a large number of intermediary proxies, which we'll probably come on to later. And we fall back to a secure WebSocket in those cases. And we'll put this in the show notes, but it looks like WebSocket.js is another open source project
Starting point is 00:08:09 like Socket.io. Yes, yes it is. Peter? Go ahead. Yeah, actually Socket.io also uses Flash Socket if Flash is available. So like I said, using feature detection, I can know if the
Starting point is 00:08:25 client has Flash installed and ready to use, and I pick that one. So Socket.io also has a priority list based on how bidirectional the transports are. So it'll try with WebSocket first, it'll try with Flash second, which might fail
Starting point is 00:08:41 if the client is behind a proxy, because the WebSocket protocol draft specifies using the HTTP CONNECT method to get through proxies. This cannot be done by Flash, because the proxy's authentication information is not given to Flash by the user agent. So the Flash-based WebSocket.js will fail behind proxies.
Starting point is 00:09:09 In that case, Socket.io will fall back to other transports like long polling or htmlfile, which have higher latency, and that's why they are lower on the list of priority. So from an architecture standpoint, how would WebSockets differ from something like traditional long polling? Well, this is what essentially Socket.io solves. Those two methods of communication are really different. In one, you know that the socket will be open and you have three events: connect, disconnect, and message. And with long polling, you essentially have many disconnections on the request side. So there is a chance that the server might try to send a message to the client, and the client is temporarily disconnected or he's between reconnections.
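The priority list Guillermo describes (WebSocket first, Flash second unless a proxy is in the way, then the higher-latency transports) can be sketched roughly as follows in Python. The transport labels, their exact order after the first two, and the `behind_proxy` flag are assumptions made for illustration; the real library detects features in the browser rather than receiving a list.

```python
# Hypothetical transport labels, ordered from most to least bidirectional.
PRIORITY = ["websocket", "flashsocket", "htmlfile", "xhr-multipart", "xhr-polling"]


def pick_transport(supported, behind_proxy=False):
    """Return the first usable transport in priority order, or None."""
    for name in PRIORITY:
        if name not in supported:
            continue
        # Per the discussion above, the Flash socket cannot get through
        # an authenticating proxy, so skip it in that case.
        if name == "flashsocket" and behind_proxy:
            continue
        return name
    return None
```

For example, a client with Flash and long polling but no native WebSocket would get `flashsocket` on a direct connection and `xhr-polling` behind a proxy.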
Starting point is 00:09:56 So a long polling request is closed. The server tries to send a message before the client opens another one. So that's another thing that Socket.io does. It buffers messages that are sent between these disconnections by the client. And when the client reconnects, it sends him a buffer, a chunk of messages while he was temporarily disconnected, which might be a couple of milliseconds or, I don't know, depending on the client's connection, it can be a long time. Can I ask you a quick question, Guillermo? How do you manage to, if you're going to scale this and you need more than one Node.js server,
Starting point is 00:10:31 do you have to make sure that the request is sticky so the reconnection comes back to the same process in order for that buffering to work, or how do you do that? Exactly. Yeah, for now, it's a single process. Of course, you can put a message queue or a Redis server in front of it, and you can make it scale to different nodes in terms of, like, since the information you deal with will be in one process, scaling takes a little more work, but it's definitely possible.
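The buffering behaviour described above, where messages sent while a long-polling client is between requests are held and then delivered as a chunk, might look something like this. It is a single-process sketch in Python with hypothetical names (`PollingClient`, `on_poll`, `send`), not Socket.io's actual code, and it shows why the reconnection has to reach the same process that holds the buffer.

```python
from collections import deque


class PollingClient:
    """Server-side state for one long-polling client (illustrative only)."""

    def __init__(self):
        self.pending = deque()    # messages sent while no request was open
        self.open_request = None  # callback that answers the currently open poll

    def send(self, message):
        """Deliver now if a poll is open, otherwise buffer for the next poll."""
        if self.open_request is not None:
            respond, self.open_request = self.open_request, None
            respond([message])    # answering the poll closes that request
        else:
            self.pending.append(message)

    def on_poll(self, respond):
        """Called when the client opens a new long-polling request."""
        if self.pending:
            batch = list(self.pending)
            self.pending.clear()
            respond(batch)        # flush everything missed while disconnected
        else:
            self.open_request = respond  # hold the request until there is data
```

Because `pending` lives in one process's memory, a second server process would know nothing about it, which is the motivation for the sticky routing or shared store discussed here.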
Starting point is 00:11:09 So another thing you could do is use an HTTP load balancer to direct the request to a particular server instance, either using any of the headers that are in there or inspecting other properties of the request. Typically, this is harder to do on, you know, if there are multiple processes on a single box. But if your router is smart enough or if you have enough routing smarts in the manager on the box itself, you can do that without actually needing to have a separate data store like a Redis or whatever. Yeah, we're actually using a load balancer called HAProxy for Pusher App, which we find works extremely well.
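One way to picture the sticky routing idea, reading just enough of the request to decide which process should serve it, is the Python sketch below. The `sid` query parameter, the request-line format, and both function names are assumptions made for this example; a real load balancer or dispatcher might key on a cookie, a header, or a path segment instead.

```python
import hashlib
from urllib.parse import parse_qs, urlsplit


def session_id_from_request_line(request_line):
    """Extract a hypothetical 'sid' query parameter from an HTTP request line."""
    _method, target, _version = request_line.split(" ", 2)
    values = parse_qs(urlsplit(target).query).get("sid")
    return values[0] if values else None


def pick_worker(session_id, worker_count):
    """Map a session id to a worker index so the same id always lands on the same worker."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % worker_count
```

A dispatcher process could call `session_id_from_request_line` on the first line it reads and hand the socket to worker number `pick_worker(sid, n)`. Note that a plain modulo mapping reshuffles most sessions whenever the number of workers changes.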
Starting point is 00:11:46 We're using it in Layer 4 mode, but it allows you to do sticky session support, that kind of thing. If you were to do load balancing within, say, Node, how would
Starting point is 00:12:01 you go about doing it? Would you still use that load balancing server or would you use some other technique? You might want to answer that, Peter. Sure. So there are a couple of different ways of doing it within Node. A lot of the frameworks that exist today, like Connect or multi-node,
Starting point is 00:12:21 are both good at distributing incoming connections among a bunch of different processes. For these guys, they don't have any support for stickiness at all. So any incoming request has a relatively equal chance of being served by any of the processes. So that doesn't really get you what you want. What you can do instead is accept all connections in one process, read part of the request, enough of the request to know which process should be serving it, and then go and send the socket
Starting point is 00:12:51 and the part of your request that you saw off to the right worker. I have a blog post up on how to do this that can probably go out in the show notes or something, but you can use that technique. Is there anything intrinsically browser-dependent as far as the client side? Because I know XHR really took off. It made AJAX possible, right?
Starting point is 00:13:13 But I've seen that same technique, asynchronous calls, in iPhone applications that are native applications. Would this be something that someday may be used in a native mobile device? Okay, currently there is support in, well, sort of support in iPhone libraries. There's, I think, two projects that give you the headers required to include WebSockets within your iPhone app. Although I don't think they're currently supported. I think they were drafted for iPhone 4, but they didn't make it in in time. Someone might have more information on that. As for the browser side, the main thing that needs to be done is for, A, the browsers to implement the protocol, and then to make sure that they actually communicate and use it
Starting point is 00:14:09 and actually do the communication of the protocol with the server in the proper ways. So Guillermo and Martyn, what types of applications are you guys seeing being built with Socket.io and with Pusher App? I've heard of a couple different ones. There are some projects that build on top of Socket.io to give you APIs to build different things more easily. Because essentially, Socket.io only gives you the socket API. So you need to do a little more to build an application.
Starting point is 00:14:47 Although you can build a thin protocol based on JSON, pass JSON messages, and have a chat application like the example that ships with Socket.io. A really interesting one is called dnode, which does asynchronous remote method invocation. This was created by the StackVM guys, and it's built on top of Socket.io, and it's a good base for building applications.
Starting point is 00:15:16 I've also seen a chat application with video enabled by Flash and avatars that move on the screen. I've seen an Asteroids game built with Socket.io. And recently I heard of someone trying to build a drawing application which was passing many, many messages by many people at the same time. Socket.io used to rely on JSON for doing message buffering. So it would send you an array of messages and it would be parsed by JSON. That turned out to be, like, very CPU-intensive, and it's been since removed in 0.5, which was released this week.
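The episode does not say what replaced the JSON array for batching messages. Purely as an illustration of how several messages can be packed into one payload without running a JSON parser over the whole batch, here is a length-prefixed framing sketch in Python. The `length:payload` format and the function names are invented for this example and should not be read as Socket.io's wire format.

```python
def encode_batch(messages):
    """Pack strings into one payload as '<length>:<message>' frames."""
    return "".join(f"{len(message)}:{message}" for message in messages)


def decode_batch(payload):
    """Split a payload produced by encode_batch back into its messages."""
    messages = []
    position = 0
    while position < len(payload):
        colon = payload.index(":", position)    # end of the length prefix
        length = int(payload[position:colon])
        start = colon + 1
        messages.append(payload[start:start + length])
        position = start + length                # jump straight to the next frame
    return messages
```

The decoder only scans the short length prefixes and slices out each body, so the cost of splitting a batch does not depend on what the messages contain.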
Starting point is 00:15:57 So, today, it's suitable for many different applications, from games to chat applications, or tying in your data model and making your data live in the browser with something like dnode. From my point of view, the way Pusher App actually came about was that we had an application called TrueStory, which is a collaborative application
Starting point is 00:16:25 to manage an Agile backlog. What we wanted was that you could edit stories in one browser and the changes would be reflected in another browser. You could drag and drop, reorder, change sprints, that kind of thing. And so that's actually one of the reasons
Starting point is 00:16:44 that Pusher exposes a kind of event-binding API. So in the browser where the event was being changed, or sorry, where the story was being changed, an update call would go to your Rails application or whatever, and that would send a story-updated event on the channel to all the subscribers, all the people who are viewing the backlog in their browsers. So we've seen some people do collaborative applications like that. The other thing we've seen a lot of on Pusher
Starting point is 00:17:20 is people who are just using Pusher because it's so easy to send data out to browsers. So real-time Twitter feeds, just real-time information. Group Dashpon is one example where real-time purchases on Groupon are displayed on Google Maps, for example. Another example we've got is another drawing application where users can draw pictures on their iPad,
Starting point is 00:17:52 and those drawings are shown in real time on the web. That's called WebPad. Martyn, talk a moment about channels in Pusher. How many channels would I have in an application? Is my app a channel, or would I have multiple channels in my app? It very much depends, actually, on the application. For example, the application I spoke about, Group Dashpon, I believe has a single channel, which is Groupon purchases. So everybody who's viewing that web page would be subscribed to that channel.
Starting point is 00:18:25 So there could be potentially hundreds of users subscribed to one channel, and we push information out efficiently to all of those users via a single API call. In the TrueStory collaborative application, there might be a single channel per backlog. So in your web application, you have domain objects whose state you want to share with other users. So there may be, I don't know, 10 people using the application at the same time, with, you know, two of them viewing each backlog.
Starting point is 00:19:06 So there would be a channel for each of those. But typically we're seeing it's not that there is... In most cases, we don't have a single channel per user. It's a channel per object which people collaborate on or are interested in. Okay, so I should also note that the channels that Martyn is speaking of, they're not actually built into the WebSocket protocol, but rather a layer on top of it,
Starting point is 00:19:34 which I think you're still using URL-based routing or something? No, we started with that approach, you're right. The way we do it at the moment is that once the WebSocket is connected, the JavaScript sends a JSON event. I mean, it's just JSON, but
Starting point is 00:19:58 it has an event name, which is pusher:subscribe, and the name of the channel. And then internally in the socket server, we subscribe to the queue that publishes those events. And so, yes, you're absolutely right. Channels are an abstraction which we've added on top of WebSockets. And the other abstraction we've added is the idea of being able to trigger events and have those events then triggered in JavaScript.
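The two abstractions Martyn describes, channels joined by a JSON subscribe event and named events bound to callbacks, can be sketched as a small in-memory model. This is a Python illustration under stated assumptions: the class names `ChannelHub` and `Connection`, and the exact shape of the JSON messages, are made up here and are not Pusher's implementation.

```python
import json
from collections import defaultdict


class Connection:
    """Stand-in for one browser connection with an event-binding API."""

    def __init__(self):
        self.handlers = defaultdict(list)

    def bind(self, event, callback):
        """Register a callback to run when the named event arrives."""
        self.handlers[event].append(callback)

    def receive(self, raw):
        """Decode an incoming JSON frame and run the callbacks bound to its event."""
        message = json.loads(raw)
        for callback in self.handlers.get(message["event"], []):
            callback(message["data"])


class ChannelHub:
    """Stand-in for the socket server: tracks which connections joined which channel."""

    def __init__(self):
        self.subscribers = defaultdict(set)

    def handle_message(self, connection, raw):
        """Process a JSON message sent by a client; only subscribe is modelled."""
        message = json.loads(raw)
        if message.get("event") == "pusher:subscribe":
            self.subscribers[message["data"]["channel"]].add(connection)

    def publish(self, channel, event, data):
        """Send one event to every connection subscribed to the channel."""
        frame = json.dumps({"event": event, "channel": channel, "data": data})
        for connection in self.subscribers.get(channel, ()):
            connection.receive(frame)
```

In this model, a single `publish` call fans out to every subscriber of a channel, which matches the "one API call, many browsers" pattern from the Groupon and backlog examples.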
Starting point is 00:20:27 So it's just a matter of saying pusher.bind, the event name, and then the anonymous function which you want to be executed. But you're absolutely right. These are things that we've added on because our applications all needed that kind of thing. Okay, so going off the idea of triggering events and things like that, there's also a new HTML5 protocol, which is called EventSource, which allows you to trigger events on the browser from the server.
Starting point is 00:21:06 I'm not sure if it's bidirectional. Peter, would you have any more information on that? I wish I did. The way I understand it is event source is pretty much a one-way web socket. That's the idea. From a JavaScript API
Starting point is 00:21:21 point of view, that's what you get. So you can only receive events from the server. You can't push events to the server. Right, exactly. So the way that you send messages to the server is actually, like, normal AJAX. It's very similar to the multipart flag in the XMLHttpRequest object, which is only supported by Firefox. That is also implemented by Socket.io, and
Starting point is 00:21:46 that gives you a one-way, we could say, WebSocket: a connection that is always open and pushing parts of messages. In that respect, EventSource is very similar to it. And it's my understanding that it's only implemented in Opera so far. I'm not sure. Sometimes a technology comes along and it forces us to take a fresh look at how we solve some problems. I know the NoSQL database movement has done that for me, in that now when I model my data in the database, it really changes the way I look at the application as a whole.
Starting point is 00:22:23 Peter, talk about what WebSockets does to how you architect applications at Yahoo. Sure. So what we're interested in with WebSockets is, you know, once browsers actually support this thing, it'll provide a first-class API that you can always use and always expect to work, as opposed to jumping through all the hoops
Starting point is 00:22:44 that Guillermo has done a great job of doing and building a library that can kind of handle all the different browser use cases, proxy use cases, different performance and connection limitations that different browsers have. You know, there's kind of a whole world of stuff that you need to try to navigate with the current set of ways that you can have this kind of full duplex communication. You know, it is doable now, and Guillermo's stuff is, you know, kind of living proof of that. But the promise of WebSockets is a unified API that you can expect to work, at least in a small number of years once browser and proxy support is there for that.
Starting point is 00:23:23 Before we go around the horn and ask what's on each of your open source radars, Micheil, why don't you give a shout out for your own WebSocket server and then kind of list some resources that the WebSocket noob, including myself, could go and check out. Okay, so I do actually run my own WebSocket server, node-websocket-server. It's different to Guillermo's in that rather than adding support for all the backwards compatible transport methods, it just gives you the WebSocket connections.
Starting point is 00:23:58 And then as for resources, probably the best place to find out more about WebSockets would have to be the protocol outline, which is in the WHATWG working group. Well, yeah, Web Apps working group, which is sort of part of the W3C, but not really. And they're the ones authoring the specification, which is being led by Ian Hickson at the moment.
Starting point is 00:24:27 Then there's also a fair few other resources that we'll link to in the show notes. I don't have URLs offhand. Or socket.io and pusherapp.com, right? Well, this is the part of the episode where we kind of turn it upside down and ask what's on our guests' open source radars. We'll start with you, Guillermo.
Starting point is 00:24:49 What open source projects have got you excited and that you want to go play with? Well, I don't know if you guys have seen the Hummingbird demo for Node, which is basically WebSocket and MongoDB for real-time analytics. We're actually also users of MongoDB and we developed our own ORM. And what we're hoping to release in the upcoming months is an easy way to build web applications that have data displaying on the browser, which is updated all the time based on socket.io and server push, and of course, real-time.
Starting point is 00:25:31 Aside from that, in general, I think it's interesting to watch all the Node.js-related projects since Node makes it really easy to build this kind of real-time applications and modules.
Starting point is 00:25:46 What about you, Martyn? I think the thing that's really exciting me at the moment is Redis. All the projects I've worked on recently I've used Redis in, and it's just incredibly liberating to have a really fast atomic data store that I can share between multiple processes. Another open source project I should mention is EM-WebSocket, which we're using. If you're interested in a Ruby EventMachine client,
Starting point is 00:26:17 then that's a great one to look at. Peter? A couple things. So the Node.js YUI 3 bindings are, I think, really exciting because they let you have this really rich set of tools that you can run both in the browser and on the server. And for doing things that you would normally only really think about doing in a browser, like you can take these really complex web applications written in YUI
Starting point is 00:26:43 and decide that you're going to render them statically on the server if the client has low bandwidth and you don't want to deal with downloading all the JavaScript or the client has a CPU that's not particularly strong. And so you just want to hand it some static HTML and make life really easy for it. So I think that provides a really kind of compelling platform for building user experiences that can handle a wide variety of clients. Another thing that I'm starting to watch and is actually really new and I think is an interesting fit for WebSockets is TeleHash. This is from Jeremie Miller, the creator of XMPP. This is his distributed JSON routing protocol. It's really, really early going right now. You know, there's only a basic protocol up and a couple of really kind of bare bones implementations,
Starting point is 00:27:30 but it looks like a really kind of neat way of shooting around data. With all of these tools, do you think the application landscape for web developers is getting easier or more complicated? I mean, you know, some things are easier. You know, in some sense,
Starting point is 00:27:45 the promise of WebSockets is it will become easy to build these real time full duplex pipes, whereas now it is possible, but it's just hard. So, you know, there are, I guess, more choice does make things difficult to some extent, but, you know, with great power comes great responsibility. All right. Thanks, everyone, for joining us today. And we'll see you in cyberspace. See it in my eyes So how could I forget when I found myself for the first time Safe in your arms As a dog I should
