The Changelog: Software Development, Open Source - Fog, the Ruby Cloud Services Library (Interview)

Episode Date: May 20, 2011

Wynn sat down with Wesley Beary from Engine Yard to talk about the Fog project and the Cloud, live from Red Dirt Ruby Conf....

Transcript
Starting point is 00:00:00 Welcome to the Changelog episode 0.6.0. I'm Adam Stacoviak. And I'm Wynn Netherland. This is the Changelog. We cover what's fresh and new in open source. If you found us on iTunes, we're also on the web at thechangelog.com. We're also up on GitHub. Head to github.com slash explore. You'll find some trending repos, some featured repos from the blog, as well as our audio podcasts. If you're on Twitter, follow changelogshow. And me, adamstac.
Starting point is 00:00:37 And I'm pengwynn, P-E-N-G-W-Y-N-N. And this episode is sponsored by GitHub Jobs. Head to thechangelog.com slash jobs to get started. If you'd like to feature your job on this show, select Advertise on the Changelog when you post your job, and we'll take care of the rest. A couple of jobs up this week by Mag10. They're rethinking publishing without the analog boundaries.
Starting point is 00:00:57 The first is a senior iOS developer: architect, prototype, and release complex iPad and iPhone apps. Your rig will include standard MacBook Pro, Air Cinema Display, iPad, and iPhone, and stock options. But if you're on the web side, Mag10's also looking for software engineers for their web front end. Must be familiar with HTML5, CSS3, and JavaScript. Same rig applies as the iOS position. If you're
Starting point is 00:01:28 interested in these two, lg.gd slash ao and ap. And next up is Tagged. Tagged Inc. is the number one place to make new friends online. They're looking for a software engineer, full-time, in the San Fran area, that's San Francisco, California. Their stack is PHP, Java, Oracle, and HTML and AJAX on the client side. They're currently looking for better ways to do many, many things. They're building APIs. They're reducing spam, scams, and phishing, and much, much more. So if you have a BS or an MS in computer science or related fields, they have competitive salaries to offer you,
Starting point is 00:02:04 generous stock options, quarterly compensation, as well as a 401k. Check out lg.gd.at. Fun episode this week. Talked to Wesley Beary over at Engine Yard. This is the last of our Red Dirt RubyConf recordings. So long, Oklahoma City, till next year. But we talked about his Fog gem, a little bit about Excon,
Starting point is 00:02:26 and how the Fog gem came about. So Fog is kind of the uber-wrapper for all the cloud APIs from a Ruby perspective. And this is also our 60th episode. 60th episode. Can you believe that? That's a lot of numbers. That's three digits, a couple points in there. It's a point release, too, so that's cool. Thanks, everyone, for putting up with us this long.
Starting point is 00:02:48 Hopefully we've got another 60 episodes in us at least. And we've got some fun upcoming things happening too as well on the advertising side, so stay tuned to that and some fun new episodes coming up. Absolutely. And as far as conferences coming up, I'll be at Texas JS next month and then Big D in Dallas in July. How about you, Adam? Any plans? I'll be there as well, Big D.
Starting point is 00:03:13 And I'm also going to New York for a design conference. But I'm not sure if I'll attend, but I'm definitely going to be in New York later in this year, August. Awesome. Fun episode. Should we get to it? Let's do it. Thanks for coming out to a special live edition of the Change Log podcast with video this year. So in addition to the dozens of people in the room, probably half as many on the interwebs watching this live.
Starting point is 00:03:48 And I hope to get a couple of episodes out of this. So in case you don't know, this is not Adam Stacoviak, my partner in crime usually. This is Wesley Beary of the Fog gem. I'm going to chat a little bit about Fog and all things cloud. I think we should start with the fog in the room of the AWS outage today and reactions to half the interweb going down. It's been kind of a mess, as you well know. I mean, that's one of the things you have to worry about more in the cloud is you gain a lot of flexibility and power, but you lose a certain amount of control and knowledge of what's actually going on behind the scenes. So unfortunately, that sometimes means stuff will go down
Starting point is 00:04:31 and there's not really very much you can do about it. It's not like you can drive over to the data center and swap out some hard drives or something. There's just not that much you can do. But I think the real testament to services like that is that when it goes down, it is such a big deal. Like, it's happened so infrequently that when it does, everyone is very surprised and terrified
Starting point is 00:04:53 and whatnot. So, I mean, I think the vast majority of the time it works very well. For those that might not know, tell us a bit about, I guess, Fog, its scope, and a little bit of background on how the project came about. Sure. So Fog is probably the biggest yak shave that I've ever participated in. Ironically, it started with me wanting to know more about cloud services in general, but specifically about SimpleDB.
Starting point is 00:05:23 As it turns out, SimpleDB is a very small and relatively unimportant portion of Fog at this point. But that was the impetus, is that there wasn't a good Ruby binding to SimpleDB. I wanted to play with SimpleDB, so I wrote one. And then pretty soon, I was like, well, this is interesting. I don't know how interesting, but it's kind of interesting.
Starting point is 00:05:43 And I want to play with S3. And then I started looking at some of the existing tools and had some level of dissatisfaction about how maintained they were and how up to date they were and how open the processes were around that open source stuff, like whether or not I could help to make them be maintained or help to bring them up to date. And it seemed like there was a lot of open question there.
Starting point is 00:06:03 So I started just writing some of my own S3 stuff. And I had reused some of the lessons I'd learned from doing the SimpleDB one. And so this kind of continued on and on. I never really had a particular reason to need these services. I was just very curious. I wanted to learn cloud.
Starting point is 00:06:19 Seemed like a good way to do it. And then pretty soon I had all these services. And then Rackspace servers came out and I was like, well, this is interesting. I also want to try this. And so pretty soon I had a Rackspace servers implementation. And before too long, I had multiple implementations of, say, compute. And I realized that it was a huge pain to switch back and forth between them. I mean, like if you've ever tried to switch back and forth between them at all, it becomes very clear. It's not difficult to imagine that this is a hard problem.
Starting point is 00:06:50 And so then, all of a sudden, there was sort of this use case. There was this purpose for Fog, which was, I think I can actually make that problem easier. I have this strong foundation to build on top of, so let me use that to provide things that will actually make this transition easier and make these things more comparable. How many providers do you support now in addition to EC2 and Rackspace?
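Editor's sketch: the idea of putting one common interface over several compute providers, as described above, might look roughly like the following. This is a minimal illustration in plain Ruby under stated assumptions. Every name here (`FakeEc2Adapter`, `FakeRackspaceAdapter`, `Compute.for`, `boot_web_tier`) is hypothetical and is not Fog's actual API, and the adapters keep state in memory instead of calling a real service.

```ruby
# Hypothetical sketch of a provider-neutral compute interface.
# None of these class or method names come from Fog itself.

# Each adapter hides one provider's vocabulary behind the same two methods.
class FakeEc2Adapter
  def initialize
    @instances = {}
  end

  # Pretend to boot an "instance" and return its id.
  def boot(name)
    id = "i-#{@instances.size + 1}"
    @instances[id] = { name: name, state: 'running' }
    id
  end

  def list
    @instances.map { |id, attrs| attrs.merge(id: id) }
  end
end

class FakeRackspaceAdapter
  def initialize
    @servers = []
  end

  # Pretend to boot a "server" and return its id.
  def boot(name)
    @servers << { id: (@servers.size + 100).to_s, name: name, state: 'ACTIVE' }
    @servers.last[:id]
  end

  def list
    @servers.map(&:dup)
  end
end

# Callers pick a provider by name and never touch the adapter classes.
module Compute
  ADAPTERS = {
    'ec2' => FakeEc2Adapter,
    'rackspace' => FakeRackspaceAdapter
  }.freeze

  def self.for(provider)
    ADAPTERS.fetch(provider.to_s.downcase).new
  end
end

# The same script runs against either provider.
def boot_web_tier(provider, count)
  compute = Compute.for(provider)
  count.times { |i| compute.boot("web-#{i}") }
  compute.list.map { |server| server[:name] }
end
```

Switching providers then means changing one argument, for example `boot_web_tier('rackspace', 3)` instead of `boot_web_tier('ec2', 3)`, while the calling script stays the same.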
Starting point is 00:07:13 I don't even know a number offhand. It seems to continue growing. I've been lucky enough to recently, there's been some providers where they actually were interested enough in the project and the community that they just said, here is an implementation of our service. Could you please include it?
Starting point is 00:07:29 Like, there was very little work that I had to do. Both Bluebox and Brightbox have been kind enough to do that with their pretty recent cloud offerings, which has been awesome. So for a while, it was just kind of like whatever service piqued my interest, the same as the EC2 and Rackspace servers case of like, oh, this seems interesting, I want to check that out.
Starting point is 00:07:47 Or, oh, I've had four people ask me about this, so maybe I'll go and look into it. But yeah, more and more it's the providers driving it. So it's become a pretty large number. I don't know that I can put my finger right on it. But between all of the different things, it's probably, I don't know, like 15 to 20, not just compute providers, but because there's also storage and DNS, there's some distribution of services
Starting point is 00:08:11 that are on one provider for DNS, but a different provider for something else. Sorry, before we started, Dr. Nick came up and said, I'm not sure if you're aware, but AWS has been down all day. And this might be because you have a lot of these services stubbed out when you're testing. So when did that come about? And was it just a large EC2 build that spawned the mocking? So, yeah, there's a lot of mocking underneath the covers in Fog.
Starting point is 00:08:36 And the idea is that you can kind of run against these services in a more simulated manner. And that actually came from the usage of EC2 that we had at Engine Yard. So I started Fog prior to starting at Engine Yard. And then I joined the App Cloud team, which makes pretty heavy usage of EC2. We're probably one of the larger consumers as
Starting point is 00:08:58 individuals, because we sort of proxy all the traffic of our customers effectively into it. And so they had built a solution on top of RightAws. They provided a lot of mocking because they got a lot more kind of bang for the buck in terms of testing. Because you don't want to really have to wait for a server to spin up for each of your unit tests and then break it back down again or something, because servers can take minutes to spin up. And if you add minutes to the before-each filter in your RSpec, you're not going to get your test suite done once a week or something.
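Editor's sketch: the mocking idea described above, where a test suite runs against a simulated service instead of waiting minutes for real servers, can be illustrated with a small in-memory stand-in. Fog's own switch is `Fog.mock!`. Everything below (`TinyCloud`, `mocking=`, `create_server`, `destroy_server`) is a hypothetical name chosen for illustration and is not how Fog implements its mocks.

```ruby
# Hypothetical sketch of "mock mode" for a cloud client.
# The names below are illustrative and are not Fog's real classes.
class TinyCloud
  class << self
    attr_writer :mocking

    def mocking?
      @mocking == true
    end
  end

  # In-memory stand-in for the provider's state, used only when mocking.
  def initialize
    @mock_servers = {}
  end

  def create_server(name)
    if self.class.mocking?
      # No network call and no billing: record the server and return at once.
      id = "mock-#{@mock_servers.size + 1}"
      @mock_servers[id] = { id: id, name: name, state: 'running' }
    else
      # A real client would issue an HTTP request here and then poll,
      # possibly for minutes, until the server reports that it is ready.
      raise NotImplementedError, 'real provider calls are out of scope for this sketch'
    end
  end

  # Returns true if a mocked server with this id existed and was removed.
  def destroy_server(id)
    raise NotImplementedError unless self.class.mocking?

    !@mock_servers.delete(id).nil?
  end

  def servers
    @mock_servers.values
  end
end

# A test suite would flip the switch once, before any examples run.
TinyCloud.mocking = true
cloud = TinyCloud.new
server = cloud.create_server('unit-test-box')
```

Because nothing leaves the process in mock mode, a forgotten teardown costs nothing, which is the playground quality mentioned in the interview.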
Starting point is 00:09:31 You'll see whether or not it's green, and then what do you do? So it came from that need. I knew that if I was going to get Engine Yard to adopt Fog because I felt that there were a lot of other merits for it in terms of performance and stability and maintainability and that sort of thing that I needed to be able to provide the mocks as well so that it could be closer to a drop-in replacement. But it's also provided a lot of utility
Starting point is 00:09:53 in terms of, oh, I want to just, you know, start to hack out some scripts against this without necessarily having to worry about whether I forget to spin down all the servers afterward or something like that. It can be a very nice playground, kind of sandbox environment as well. What's been the biggest boost, other than, I guess,
Starting point is 00:10:09 food on the table, to the project before and after the move to Engine Yard backing? It's been very interesting. I mean, a lot of it has just been Dr. Nick has been really great in terms of providing support, and he has a lot of very good ideas, and there's a lot of good back and forth with him that's been very helpful. There's also been a lot of good back and forth
Starting point is 00:10:31 that I'm sure I could have had, but I kind of have the in now with the Rubinius guys and with the JRuby guys, where they've kind of been around the block and done this a little bit longer than me. So I'll be like, I'm thinking about doing this or that with my community. I'm thinking about what I should do with my commit bit.
Starting point is 00:10:48 What are you guys doing? How is that working for you? Would you recommend it? So that's the more obvious part. The maybe less obvious part is there's a lot of things as a hobby project that you don't necessarily get around to because they aren't as fun. So I'm terrible about this in terms of the documentation.
Starting point is 00:11:05 I know it could use a lot of work. People tell me that all the time. I feel guilty and bad about it. But when it was my hobby project, I didn't want to come home from work and spend two hours writing documentation. I wanted to hack on whatever cool new cloud thing was going on. And so having it be my full-time job means that I'm
Starting point is 00:11:22 still not the best about it, but it's easier for me to dedicate an hour or two a day or at least a few hours a week to just trying to at least polish that up a little bit. That's been a boon as well. One of the unique things that I've heard you do that I'm not sure other projects do this, you base the t-shirt on whether or not you're a committer or a friend of Fog. How did that idea come about? I don't know exactly what spawned that on. I mean, you've got to earn one of these, right?
Starting point is 00:11:52 You do have to earn the shirts. I mean, to some extent, I've been a long-time gamer. And so to some extent, it's this kind of idea of almost like the achievements or badges that you have on a lot of other services, of like, this is just like a physical badge, right? So in my case, you can get a blue shirt if you do something that's kind of supporting, but not directly code related, for Fog. You can get gray if you get something accepted into Fog, and then you can get black if you become a committer, which I'm terrible about, in terms of that still is only me. Like, I need to give commit out to
Starting point is 00:12:24 other people. That's a difficult problem. It's one I've discussed with a lot of people. But yeah, it's worked out really well, I think. People really like the shirts. They respond to them well. It's not really that expensive because, as it turns out, most open source projects don't get the thousands of committers
Starting point is 00:12:44 that, say, Rails does. I still, at this point, have gotten to a total of 50 or 60 contributors. And I mean, that's not free. But 50 or 60 t-shirts is well worth it, in my mind, for the amount of extra support. And it makes me feel good to have these people come in and help me out.
Starting point is 00:13:02 Because there's so much that I can't do on my own. And the value of it is way more than the $10 or $12 or whatever that I spend on a t-shirt for them. So we heard earlier today that Aaron Patterson said that the plus one is the most useless comment that you can put on a pull request. You've got 152 forks of Fog and only four pull requests. How do you manage that queue? Is it all you?
Starting point is 00:13:24 Right now, at least, it is all me. I'm pretty responsive to it. I'm lucky in that a lot of the pull requests are very small and fairly obvious. In a lot of cases, Fog has a pretty large scope, but most of the time, if someone is fixing an issue,
Starting point is 00:13:41 they tend to be like, I'm using service foo, and when I do X, I expect Y, but I'm actually getting Z. So here is, you know, like the two-line fix that gives me back what I expect from this one particular request. So that actually makes those a lot easier to get through. I just have, I usually basically get in in the morning and do pull requests, respond to issues, and all of that. I do all of that before I ever let myself code. So that helps me to make sure that I stay on top of it.
Starting point is 00:14:11 Usually takes a couple hours pretty much every work day. But yeah, you just have to be really diligent. Tell us about Excon. Is that a byproduct of Fog, or did it predate Fog? Sure. So Excon is the HTTP library that underlies Fog. It came about actually while I was working on Fog itself. I was somewhat dissatisfied with the interfaces
Starting point is 00:14:36 to some of the existing HTTP libraries. Figuring out how to use Net::HTTP always meant me referring to the docs. I can never actually remember, and there's four different ways you can do it. It's not clear if some are better than others. And for the use case that I wanted, which was most of the time if you're working with a cloud service,
Starting point is 00:14:53 you're probably going to connect to it, and you're probably going to make several requests. It's unlikely that you're going to just connect and do one thing and be done. You're probably going to spin up a server and maybe attach a volume and so on and so forth. It's going to be a few things.
Starting point is 00:15:06 So I wanted to be able to take advantage of keep-alive connections wherever I could. Which is also, if it was hard to figure out how to do requests in the first place, it's extra hard to figure out how to keep that connection open after the request is done to make sure that you can take advantage of that. So initially, Excon actually ended up being inside of Fog itself.
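Editor's sketch: the keep-alive pattern described above, connecting once and then making several requests over the same connection, can be shown with Ruby's standard `Net::HTTP`. This deliberately uses the standard library as a stand-in and does not show Excon's interface. The helper name `fetch_all` is an assumption made for illustration.

```ruby
require 'net/http'

# Issue several GET requests over a single TCP connection.
# Net::HTTP.start keeps the socket open for the duration of the block,
# so requests after the first skip the connection setup cost
# (as long as the server also agrees to keep the connection alive).
# Returns an array of [path, status_code] pairs, with status codes as strings.
def fetch_all(host, port, paths)
  Net::HTTP.start(host, port) do |http|
    paths.map do |path|
      response = http.get(path)
      [path, response.code]
    end
  end
end
```

A call such as `fetch_all('example.com', 80, ['/', '/about'])` would open one connection and send both requests through it. The host and paths in that call are placeholders.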
Starting point is 00:15:25 There was just a Fog slash HTTP file, basically, that encapsulated all of that. And over time, I started to realize, granted, the scope of Fog is already kind of ridiculous, but having an HTTP library inside of it is kind of beyond ridiculous. This is just not OK. So at that point, I split it out. And it's actually been really nice because there have been
Starting point is 00:15:48 a number of bugs and other things that have been fixed by virtue of the fact that it's clearly an HTTP library that's off on its own that maybe would have remained indefinitely had it just stayed kind of at the low level, hidden behind the scenes in Fog. Up to 15 services across storage, compute, DNS. Talk a bit about the state of the cloud. Are we emerging with standards in storage APIs,
Starting point is 00:16:13 or has S3 won the day? It's a difficult question. It seemed like S3 was definitely a front-runner, to say the very least. I mean, a lot of new services that were coming out were just kind of punting and saying, we're just going to offer an S3 compliant API. Unfortunately, in my experience, compliant APIs, I don't even know what that means.
Starting point is 00:16:37 I'm not sure that the people that say it know what it means. For instance, the Google Storage API is ostensibly S3 compliant, but it's compliant to the version of S3 that was available when they released it. Which I'm not sure that they say what version that is, but it's drifted away from that. And then there are other things like the Rackspace
Starting point is 00:16:59 Storage, which that has obviously strongly influenced the OpenStack implementation of storage. But those haven't gotten really any adoption outside of Rackspace itself, so it's not really clear where that will go. I don't know. S3, I think, did a pretty good job in a lot of ways. I don't necessarily like the global namespace
Starting point is 00:17:18 of all of the buckets have to be in the same namespace kind of thing, but mostly it seems to work really well. So did it win the day? I don't know. There are probably things that could be done better, but they have such a front runner role at this point that I don't know that anybody's going to overtake them. What's your take on OpenStack?
Starting point is 00:17:35 Is it truly commodity, or is it at least common denominator? It's tricky. There are a lot of chefs in the- No pun intended? No, no pun intended. There's a lot of cooks in the kitchen. That's what I intended to say. The chef thing was sort of terrible, like a quadruple entendre or something. A lot of cooks in the kitchen. That concerns me.
Starting point is 00:17:58 I mean, a couple of primary players are NASA and Rackspace. NASA wants a supercomputing platform. Rackspace wants a public cloud offering. It's hard to think that that won't mean that either one of them loses or that it is lowest common denominator because those are two pretty different use cases. And I also just, I'm not sure how exciting it is
Starting point is 00:18:20 because, I mean, the analogy I was using earlier when I was discussing this with somebody was it's kind of like somebody open sourcing the plans for a nuclear power plant, right? That's pretty cool, right? But I'm not going to go build a nuclear power plant. I'm not interested in getting into the utility business.
Starting point is 00:18:37 There's a lot of overhead to getting into the utility business. Even if I did get into it, it's likely that if the other utilities wanted to, they could crush me on price because they just have the scale to be able to do that. I'm not sure that it's going to really invite other people into the market as much as a lot of us might like for it to. So I'm not sure. I worry that it's kind of a marketing effort more than necessarily a technology one. I've interviewed Yehuda Katz a couple of times,
Starting point is 00:19:06 and he said himself that he builds more frameworks than he builds apps on top of those frameworks, right? What are you building in the cloud when you're not building libraries to consume it? Right now, not very much, unfortunately. There's been a few times where I've kind of made small forays. The most recent was I've been very interested in Riak, so I was writing some stuff to just play with Riak and use it.
Starting point is 00:19:27 And it was pretty fun because now that I kind of have Fog in my toolbox, I could pull that out. And then in a couple hours, I'd written a script that I could basically run a command where I said, I want to have a Riak cluster on Rackspace that has this many nodes in it, and you could just see it say, OK, this node is coming up. All right, it has joined the cluster. This node is coming up.
Starting point is 00:19:47 It has joined the cluster. And then it would say, all right, here's the list of IPs in your ring. And then you could just connect to any of them and push and pull data and that sort of thing. So I think it's very exciting. But unfortunately, I keep searching for what the use case is that's going to be really compelling for me and
Starting point is 00:20:02 then end up getting bogged down in all of the particulars. And I still have this problem, similar to what ended up being Fog, to everyone else's benefit, and perhaps mine, sometimes I'm not even sure, of starting to work on a problem and ending up in these huge rabbit holes of basically what Yehuda is saying. I end up working on a framework related to the problem that I was actually
Starting point is 00:20:26 trying to solve in the first place and maybe never actually get around to solving the problem. It's nice, I guess, when you can have the luxury of doing that or when the rabbit hole is interesting enough that you can get that lost in it.
Starting point is 00:20:41 Incredible lineup here at Red Dirt RubyConf in Oklahoma City. Of all that you've seen today, what's got you the most excited that you want to go play with? It's a tricky question. I mean, the deck was kind of loaded for me, I guess, because of some of the stuff that I've already been looking at. The cloud
Starting point is 00:20:57 question for me, I don't have an answer right now, but I've been exploring a lot to look at, well, beyond just Riak, doing some stuff with Backbone.js and maybe driving that with CoffeeScript instead of JavaScript. And there's a lot of stuff like that where I don't know that I've really pinned down exactly what I'm gonna do with it, but I've been doing low-level backend stuff for so long that I just wanna do something to make sure that I still have my chops.
Starting point is 00:21:26 I used to do a lot of web stuff and I just haven't for six or seven or eight months because I've been so deeply doing the Fog thing that I just need to get back on the horse, I think. Thanks for joining us. We surely appreciate it. If you use the Fog gem, be sure and buy
Starting point is 00:21:42 this guy a beer. Thanks. Thank you.
