The Changelog: Software Development, Open Source - Git, Showoff, XBox Kinect (Interview)

Episode Date: February 22, 2011

Kenneth and Wynn caught up with GitHubber Scott Chacon to talk about Git, distributed version control, and his quest to kill Word as a book authoring tool....

Transcript
Starting point is 00:00:00 Welcome to the Changelog episode 0.4.9. I'm Adam Stachowiak. And I'm Wynn Netherland. This is the Changelog. We cover what's fresh and new in open source. If you found us on iTunes, we're also on the web at thechangelog.com. We're also up on GitHub. Head to github.com slash explore. You'll find some trending repos, some featured repos from our blog, as well as the audio podcasts. And if you're on Twitter, follow Changelog Show, Changelog Jobs, and me, Adam Stack. And I'm Pengwynn, P-E-N-G-W-Y-N-N.
Starting point is 00:00:41 This episode is sponsored by GitHub Jobs. Head to thechangelog.com slash jobs to get started. If you'd like us to feature your job on this show, select advertise on the changelog when posting your job, and we will take care of the rest. First up this week, a great organization, Recruit Military, is looking for a Rails 3 dev. Familiar with RSpec 2, Cucumber, Sunspot, Solr, Resque,
Starting point is 00:01:02 Chef, jQuery, Backbone.js, a number of technologies here. Such a great organization that helps find jobs for servicemen and women returning from overseas service. If you're interested, lg.gd slash 7yankee. If you're a Houston-based Ruby and Rails developer, the fresh revolutionary marketing agency Media3 Creative is looking to talk with you. Actually, it's me who's wanting to talk with you. I joined Media3 Creative a few weeks back, and I'm currently building an awesome dev team to work with,
Starting point is 00:01:31 so check out lg.gd.com or email me at careers at media3creative.com. And if you live to code where the user meets the app on the front side and you're open source friendly, like implementing interfaces in iOS, Android, web, and more, be sure and look up Austin-based The Frontside, but you can work anywhere, I understand. Shortcode lg.gd slash 8uniform. Fun show this week.
Starting point is 00:01:55 We talked to Scott Chacon over at GitHub about Git and ShowOff and even a little Xbox Kinect. It's quite the range, huh? It is quite the range. What was the perspective in terms of what we talked about? As far as Git? Yeah, like, was it a lot of Git? Was it a little bit of Git?
Starting point is 00:02:13 It was probably 90% Git, and not so much GitHub this time, which was a good mix to talk about how Git compares to Mercurial and some other distributed source control systems and how Scott kind of sells it to other communities that aren't as entrenched in Git as perhaps the Ruby community is and kind of the heritage that you and I come from and how he sells enterprises on the need to get off tools like Subversion and into a truly distributed source control system.
Starting point is 00:02:41 He's been a really good guy in terms of promoting Git over the past few years. Absolutely. I think he's taught a lot of us what we know about the tool. Also talked about his ShowOff presentation app that he hopes to be a Keynote killer, where you write your presentations in web technologies. And then also one of his hobby projects, Connect2B, which is Ruby bindings for libfreenect that allows you to control the Xbox Kinect on your Xbox 360 console from Ruby. Really, really cool. A quick word on Red Dirt RubyConf taking place in Oklahoma City on April 21st and 22nd.
Starting point is 00:03:18 We'll be doing a live episode of the changelog at the end of day one, and Wynn's going to talk about day two. Day two is action-packed, full of training from some experts. Ryan Smith from Heroku and Wesley Beary from the fog gem will be doing some cloud training. Don't miss Optiva doing some
Starting point is 00:03:34 JRuby training, as well as I will be participating in some Titanium Mobile training with the guys at Appcelerator, Marshall and Kevin over there. But probably the most important part, catch our bud Erik Michaels-Ober, and also Nick Quaranto from Gemcutter doing some open source training. So there's a whole track on how to contribute to open source.
Starting point is 00:03:55 So be sure and catch Red Dirt RubyConf coming up in April in Oklahoma City, April 21st and 22nd. And registration is open right now, so head to reddirtrubyconf.com. It was a fun episode. Should we get to it? Let's do it. Chatting today with Scott Chacon from GitHub. So Scott, I think a lot of the listeners probably know who you are, but for those that don't, why don't you introduce yourself and your role at GitHub? Yeah, yeah. My name is Scott Chacon. I've been working for GitHub since almost the beginning. I started contracting with them when it was still sort of a side project for Tom and Chris
Starting point is 00:04:37 and PJ. And then we all sort of quit our jobs at the same time and started working full time about two and a half years ago. So yeah, so I've been working at GitHub for a while, and I do a lot of Git documentation stuff. I'm not very good at C, so to contribute to the Git project, it tends to be more writing and teaching and that sort of thing. So I do a lot of training for GitHub and a lot of conference talks. And I wrote a book called Pro Git that was published by Apress under a Creative Commons license that you can get online, and a PeepCode PDF, that sort of thing. So I like Git a lot, sort of weird things you can do with it and teaching it and getting people interested in it. So that's me. So how did you come to discover Git?
Starting point is 00:05:27 So at my previous job, I worked at a company called Reactrix, which has now gone out of business. Well, it went out of business a week after I left. So I like to think that, you know, a business cannot sustain itself without me on its payroll. So I sort of took it down. But when I started there, we were trying to do content distribution for these devices. And so we would just create an RPM of the software
Starting point is 00:06:00 and we'd, you know, SCP it over. And if we changed one file, we'd have to create a whole new thing. And so there was no incremental transfer. It was very difficult to do, wasted a lot of time and bandwidth and stuff like that. So what we wanted to do was something like rsync. And we found out that Git was actually a really good sort of rsync for what we were trying to do. So we would put everything in Git, and we would create these custom trees of just the content that was needed out of the system, and then have the client fetch it.
Starting point is 00:06:32 And we didn't actually have to have, even though we had hundreds of clients that all had to have different combinations of that content, we didn't have to have hundreds of subdirectories with just the content that each one needed so we could RCP just what it needed. What we would do is we'd do it artificially in Git using sort of the index and just say, okay, just these five directories and not everything else and commit it. And it would never actually exist on disk, but we could have the client fetch it and it would come over and then be on disk on the client. So it was awesome. It actually worked really well as sort of this strange rsync sort of replacement. So that's how I started using it.
Starting point is 00:07:09 So everything that I started using was, we were actually using Perforce as the RCS type thing at the time for the software. And we were just using Git to do this rsync stuff. So I was sort of fascinated with the system. And as it became used more, I found that I knew a lot of the sort of underlying plumbing stuff, and not very many people did. And I really enjoyed it. I thought it was a really cool model. So I wrote the PeepCode PDF first, and that's sort of how I got into being the Git guy. I also went to meetups here in San Francisco, which is how I met the other guys, is how I met Tom and Chris and PJ. And basically every week that I would come, I'd demonstrate
Starting point is 00:07:50 some other language that I'd partially implemented Git in because, you know, it doesn't have a linkable library or it does now, but it didn't at the time. And so I would be like, hey, look, I've re-implemented the blob writing and reading mechanism of Git in ActionScript. And people are like, why would you do that? That makes basically no sense. And so it was basically every single week. It's like I did it in Ruby. I did it in ActionScript.
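Editor's sketch: for readers wondering what "re-implementing the blob writing and reading mechanism" amounts to, here is a minimal Python version (Python is our choice, not a language named in the episode). It assumes Git's loose-object layout: the content is prefixed with a `blob <size>` header and a NUL byte, identified by the SHA-1 of that whole byte string, and stored zlib-compressed. The function names and the in-memory `store` dictionary are hypothetical; this is not Scott's code, and it does not touch a real `.git` directory.

```python
import hashlib
import zlib


def encode_blob(content: bytes) -> bytes:
    """Prefix content with Git's blob header: the word "blob", the size, a NUL byte."""
    return b"blob " + str(len(content)).encode("ascii") + b"\0" + content


def blob_id(content: bytes) -> str:
    """Return the 40-character hex object id Git would assign to this content."""
    return hashlib.sha1(encode_blob(content)).hexdigest()


def write_blob(store: dict, content: bytes) -> str:
    """Store the zlib-compressed object under its id (an in-memory stand-in for .git/objects)."""
    oid = blob_id(content)
    store[oid] = zlib.compress(encode_blob(content))
    return oid


def read_blob(store: dict, oid: str) -> bytes:
    """Decompress an object, check its header, and return the original content."""
    raw = zlib.decompress(store[oid])
    header, _, body = raw.partition(b"\0")
    kind, size = header.split(b" ")
    if kind != b"blob" or int(size) != len(body):
        raise ValueError("not a well-formed blob object")
    return body
```

Writing `b"hello world\n"` with `write_blob` yields the same id that `git hash-object` reports for a file with that content, which is the property that lets independent implementations interoperate.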
Starting point is 00:08:14 I did it in some other language and Erlang, whatever. And so people were – I think that's sort of how I got the reputation as the Git guy was just I was obsessed with it at all times and still kind of am actually. So as someone that's recently went back to the corporate scene, a lot of times I'm having to sell folks on why they should throw away Subversion and move to distributed source control. So as the guy that wrote the book on Git literally, what do you tell people when they are considering a distributed system? Well, you know, I tell them that it's faster. I tell them that their developers can work better. I mean, it depends on who you're asking, right? If you're asking a developer, if you're asking somebody that's, you know, making the purchasing decisions or something, but, but, you know, having more efficiency for your developers, they can work offline,
Starting point is 00:09:03 they can work off VPN. All of the commands are faster. Branching and merging is easy to do. And it's a very common operation, which is not common in basically almost any other version control system, especially ones that people are switching from. But the offline stuff, I've been to places where, you know, they have ClearCase or they have Perforce or Subversion or something, and their system goes, especially like ClearCase, their system, their main server goes down or their network goes down for a little while and basically everybody has to stop working completely, right? And it's not as bad in Subversion where you can at least keep coding even though you can't commit and stuff, but almost everybody's been bitten by that, you know, or they lose the
Starting point is 00:09:42 database and they have to recover it or something. And you tell them you can do everything offline in Git. Everybody that's working on the project has a full backup of the system. There's no single point of failure. It's easy, if the server goes down, to put up another one. Everybody can keep working off that, branching and merging. One of the big ones that I see people light up when I explain it to them is the continuous reintegration. You can continuously reintegrate branches in Git,
Starting point is 00:10:05 and that's very difficult to do in most other systems, especially Subversion, even with the merge tracking that they have recently. You can create a branch for changing your database backend or adding translations to your system or something that takes a long time and generally would be this merge hell that everybody would have to go through. And you can just be on that branch and continuously reintegrate the master branch into it very easily. And at the very end, just switch back and do a fast forward merge from master to whatever the branch is and get all of that stuff. And if you're merging every day, you only get, you know, 24 hours worth of merge conflicts at a time and not, you know, this huge.
Starting point is 00:10:43 There are 50 files that have conflicted. If you're good about it, it's impossible to do that. So when I demonstrate that sort of stuff, that's when people really embrace it. And I think that's how most of us got really interested in the Ruby community about it, which sort of embraced it early and fast, is we would do demos in the conference, in sort of the side rooms of the conferences, saying, look how cool this is to create branches and switch back and forth between them real fast and merge them back and forth.
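Editor's sketch: the "continuously reintegrate, then fast-forward at the end" workflow Scott describes can be modeled with a toy commit graph. This Python sketch assumes a history is just a mapping from each commit to its parents; the commit names (`A`, `B`, `M1`, and so on) and function names are hypothetical. A fast-forward is possible exactly when the branch being moved (master) is already an ancestor of the branch being merged in, which is what merging master into the long-lived branch every day guarantees.

```python
def ancestors(parents: dict, tip: str) -> set:
    """Return every commit reachable from tip, including tip itself."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents.get(commit, []))
    return seen


def can_fast_forward(parents: dict, current: str, other: str) -> bool:
    """True when `current` is already contained in the history of `other`."""
    return current in ancestors(parents, other)


# master: A - B - C
# feature: A - X - M1 (merges B) - Y - M2 (merges C)
history = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "X": ["A"],
    "M1": ["X", "B"],
    "Y": ["M1"],
    "M2": ["Y", "C"],
}
```

With this history, `can_fast_forward(history, "C", "M2")` holds because the feature branch has already absorbed everything on master, while `can_fast_forward(history, "C", "Y")` does not, since commit `C` landed on master after the last reintegration.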
Starting point is 00:11:10 And it was so ridiculously easy when you're actually watching that, that you can't not see how that would be good for your team and good for your development practices. So as you've been going through this training with all these corporate clients and everything, have you found it really difficult to sell the concept of Git for the people who are really fond of having a really federated system where no one can touch their code unless they're authenticated via their exchange server and systems like that? Well, not really. I mean, and that could just be because of the clients that I'm doing, right?
Starting point is 00:11:43 We're not, GitHub is not sort of selling into corporations and saying, you should be using this thing. We don't have salespeople that go out and do stuff. It's always generated from within. It's from developers that are using Git for open source projects on GitHub or something, and then they want to use GitHub internally. So they look for GitHub Enterprise, like our firewall install client, where you can buy it and run it inside your firewall. And so they come to us for that because they want to do that.
Starting point is 00:12:12 And then they say, as long as you're doing that, you want to come and do some training as well. And so we'll either throw that in with that. So they've already embraced it in some way. Or the other one that I do a lot is large corporations that do sort of Android development, so like big telecom type companies. And so they want to be involved in the Android ecosystem, and it's using Git, so they sort of have to use Git. It's very rare that we won't really go in and schedule a meeting and go in and sell people on the merits of Git. It's more of a developers love it, and they use it in their off time, and then they try and get it into their company because there's a need for it, right? It's painful to use another version control system if you're using Git in your spare time. That's true. You know, one of the
Starting point is 00:12:55 ideas that I had last week was, you know, these adopt the highway sections of a highway where they have a local group that goes out and just picks up trash on the highway or whatever. I think we should all go out to Google code and some of these other places and just adopt a repo that's in Subversion and just pull it over to GitHub and mirror it. Yeah, I mean, we've tried to make it kind of easy to do Subversion imports, but the problem with Subversion importing is that, you know, changing from any version control system to another is that it depends on the history of the system, right? Like, really simple ones aren't that difficult, but, um, I've been to a bunch of companies
Starting point is 00:13:26 that, you know, have these really complex histories where they even moved from CVS to Subversion and then they've been in Subversion for years and they have hundreds of thousands of commits and, you know, they don't know how they want to split it up or, you know, they've added a large file and then removed it again. And so that import, you know, adds the big file into your clone and stuff. And so a lot of times that has to be sort of custom. I've seen people write custom, you know, importers with like git fast-import, which is, you know, an incredibly time consuming process. But,
Starting point is 00:13:54 and then I've seen other companies where they just take the last snapshot and put it into Git and they're like, screw everything else, let's just go. So it's so highly dependent on the team around whatever the project is, you know. Yeah, whenever I have to do that, I usually just do git svn clone, and if it's a big tree, it'll take hours and hours and hours. It's terrible. So how pimped out is your Git config file? It's not. It's actually not, largely because I do so much training and evangelization and stuff, I don't want to have a very custom setup locally
Starting point is 00:14:32 where I'm typing commands that they can't type or something if I'm trying to demo something. So for a long time I had no Git aliases. I'd have to type out everything all the time. And I had no Bash aliases. So, you know, most of the people at GitHub can type, like, gci or something, and it does a git commit with options and things. But I try and stay away from that so that I can just, you know, have sort of this new user experience
Starting point is 00:14:59 still, and I can teach that. You know, I remember what all the commands are. I have weakened in my resolve recently, or, you know, within the last year or so. I added a git lol, which does a git log --graph --decorate --oneline. And then, you know, so it gives me a nice sort of visual graph so I don't have to use gitk. And git st, which I use for git status -s -b, which gives you a short status in the newer versions of Git, sort of like the Subversion-looking output, which is like question marks next to each name that's, you know, untracked and things like that. And that's a lot nicer looking than the sort of verbose git status output. So those are my two cheats, but other than that, and I think I put a custom font
Starting point is 00:15:42 in for git gui and gitk. But other than that, I don't really have very much in there because I don't want to cheat. So do you use a lot of external tools with Git, like tig or GitX at all? No, I don't. I use git gui every once in a while, which is sort of the committing interface for Git on a GUI. If I have a whole bunch of stuff that I've done and I want to break it up into three or four commits and be really specific about it, because you can do line-level commits, sort of patch, like git add -p, you can do that, but you can do it on a line-by-line basis, which is a little bit nicer, so you can sort of go through that real fast.
Starting point is 00:16:21 A couple of guys at GitHub use GitX to do the same thing, which has a really nice interface for that as well. But again, I try and use whatever comes with Git when I can so that I can teach it a little bit more broadly. So why Git and not Mercurial? Why Git and not Mercurial? So I have done a little bit of work in Mercurial. I did a plug-in for Mercurial called hg-git, or I started it, which allows you to commit in Mercurial and then push to a Git server. So you can use Mercurial and then push to GitHub, for example, to put the code on. It uses Git as the transport mechanism, so GitHub doesn't know that you're using a Mercurial client for it, and
Starting point is 00:17:07 it's a one-to-one conversion ratio. Every object in Git has sort of a... or every commit in Git basically has a one-to-one relationship with a commit in Mercurial. And they're very similar. So when I was writing that, I had to learn
Starting point is 00:17:23 sort of the back-end systems. How does it store its data? What does the actual sort of format look like? How does it think about the data that you're putting into it? And it turns out that it's actually incredibly similar. The main difference is how it actually stores it on disk. It's not the objects themselves. The objects themselves are actually incredibly similar, and it's not that difficult to go back and forth between them, which is why branching and merging is just about as easy in Mercurial. A lot of stuff is the same.
Starting point is 00:17:49 So what I like to do is say, use whatever client you feel more comfortable with. I feel more comfortable with Git because I like the branching model better. But recently, Mercurial has bookmarks, which are very similar to the Git branching model. So if you want to use bookmarks, then you get sort of the same thing. But other than that, they're incredibly similar systems. And so, you know, the HG Git plugin is a nice thing because then we can say, use whatever client you want, use Mercurial, use Git, push to Git, have everybody can work together and nobody really needs to know that other people are using, you know, whatever client they're most comfortable with. But yeah, I mean, I used Git because the backend system,
Starting point is 00:18:27 originally, like I was saying, I was using it in a more low-level way. And the backend system gives you a lot more power in Git. It's a lot simpler. The Mercurial one is much more complex. It's sort of a hybrid between the Subversion model and the Git model, where the Git model is just, here's all these objects in a database. It's sort of a key value store, don't care. And Subversion has this file-based log system where you have versions of each file in
Starting point is 00:18:48 a file named after that file name. And in Mercurial, it's sort of like that. Like you have for every file you've ever had in your system, you have this file.i. It has a log of every version of that file. And so it's a lot more like if you rename a file or remove a file, you still have to have that log there and you have to have rename links and all this stuff. It's much more complex, and Git's super simple. It's just here's a manifest and a commit, and here's all the objects, and we don't really care. We don't track renames. We figure it out after the fact.
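Editor's sketch: "we don't track renames, we figure it out after the fact" can be illustrated by comparing two snapshots. This Python sketch assumes each snapshot is a mapping from path to content id, which is a simplification of Git's tree objects; the paths, ids, and function name are hypothetical. It only pairs up paths whose content is byte-for-byte identical. Real Git additionally scores similarity so it can report renames of files that were also edited, which this sketch does not attempt.

```python
def detect_exact_renames(old_tree: dict, new_tree: dict) -> list:
    """Pair each removed path with an added path that has the same content id."""
    removed = {path: oid for path, oid in old_tree.items() if path not in new_tree}
    added = {path: oid for path, oid in new_tree.items() if path not in old_tree}

    # Index removed paths by content id so added paths can look them up.
    removed_by_oid = {}
    for path, oid in sorted(removed.items()):
        removed_by_oid.setdefault(oid, []).append(path)

    renames = []
    for path, oid in sorted(added.items()):
        candidates = removed_by_oid.get(oid)
        if candidates:
            # Consume the candidate so one old path maps to at most one new path.
            renames.append((candidates.pop(0), path))
    return renames


old_snapshot = {"README": "aaa", "lib/a.rb": "bbb", "lib/b.rb": "ccc"}
new_snapshot = {"README": "aaa", "lib/core.rb": "bbb", "lib/b.rb": "ddd"}
```

Here `lib/a.rb` disappears and `lib/core.rb` appears with the same content id, so the pair is reported as a rename, while `lib/b.rb` exists in both snapshots and is simply a modification. Nothing about the rename had to be recorded at commit time.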
Starting point is 00:19:25 So that worked for what I was trying to do with the low-level stuff. And Mercurial, it's basically just a version control system, whereas Git, you can use the backend for basically anything that you can think about using a versioned POSIX file system for, because that's basically all that it is. I really wish that there was a Git-to-hg plugin, personally, because I have been involved in the Python community every once in a while. I find someone real stubborn who's working on
Starting point is 00:19:41 Bitbucket, and I have to push up to it, and it's frustrating. But I actually just watched a talk by you recently where I didn't realize that when you're using hg-git, it actually has a full Git repo inside of it and they can just clone off of the bare repo in there and then work with that. Yeah, yeah. There are a couple of people that are using hg-git to do the opposite. So hg-git, Augie Fackler and a couple other people have sort of taken that over and made it a lot better than it originally was when I was working
Starting point is 00:20:11 on it. I kind of haven't been working on it for a while, but it's great and a lot of people use it now. And he made it really, really fast, a lot faster than it was when I was doing it. But yeah, so a lot of people will use it sort of the opposite way, where they'll use it to take their Git stuff and put it into Mercurial and push it just via the normal Mercurial thing, because it does bidirectional conversions. When you clone from Git, I have to turn them all into Mercurial objects, and when you commit in Mercurial, I have to turn them into Git objects.
Starting point is 00:20:43 So it can do both ways. And it's not that difficult to set it up the other way, but it's not built in, it's not super easy like it is with the Mercurial side of it, right? Sounds like a good contributor could add that, right? Yeah, I would look at it. I don't remember. There's, I think, a couple of people
Starting point is 00:21:02 have added some things to it to make it relatively easy to do that sort of thing, but it certainly doesn't ship with Git, right? So we cover quite a broad range of listeners. Do you want to go over some of the basic differences? I've heard you talk of this before, of why Git is being used by who it's being used by and Mercurial at the same time.
Starting point is 00:21:24 Not that one is superior to the other in any way, that they're actually quite similar, and then why one's becoming more popular than the other. I guess one of the, I kind of considered a mistake at this point, but I'm too lazy to go back and redo it, is I made a website called whygitisbetterthanx.com, and I put a bunch of other version control systems and basically just summarized for people that are saying, why are you using Git? Especially, you know, a couple of years ago, people are like, why are you using Git? And so I wanted to summarize, this is why we chose Git. This is why people that use Git chose Git. But the problem that I did was I put a lot of different
Starting point is 00:21:58 version control systems on there. I had it comparing to, to Mercurial and to Bazaar, which are other distributed version control systems. And since then, um, you know, all the email I've gotten back is not defending subversion or Perforce or the other ones that I compared it to. They're all defending Mercurial and, and sometimes Bazaar. Um, and, uh, and so I've, I've sort of changed my message to, to be, you know, we don't care what you use. Everybody should be using distributed version control because there's still a huge, huge, you know, population, especially in the corporate world that's using subversion for, or centralized version control systems for stuff. And I think in most of those cases, it would do, you know, it would be better for the entire development team
Starting point is 00:22:42 there if they were using a distributed version control system. And the reason why, largely, besides just offline work and stuff like that, is branching and merging. If people are using CVS or subversion or any of the RCS derivatives, any of the centralized version control systems,
Starting point is 00:23:00 I guess I should say, they have a different mentality of how to develop, right? And if you're using a distributed version control system, because you can sort of craft your commits, you can think about it a little bit more, you can do stuff offline, you can decide when to push and share with people, you can do branching and merging very easily. And it's very lightweight. It's something where you say, I'm going to make a branch for every ticket that I'm working on or something. Somebody in a centralized version control system, like if you're doing subversion, that would make no sense, basically.
Starting point is 00:23:30 It would be so much overhead that it wouldn't be practical. And in Git and in Bazaar and in Mercurial and distributed version control systems, those are sort of the top three. That makes sense. And so I want everybody to be doing that. I want to, I want the mentality of the entire development community to be, you branch first, you do stuff in branches, you merge it in when it's ready. And when we can get people from Subversion over to any distributed version control system, that mentality changes. And I, I did not have a hard time working in Mercurial, right? I mean, I, when I was doing it,
Starting point is 00:24:03 when I was writing the hg-git plugin, I did everything in Mercurial, and I thought it was fascinating, but it was not difficult to do. It was not nearly as difficult as moving from Subversion to Git or from, you know, what was I using before that? RCS, I guess, or, you know, from Subversion to Perforce or something that's sort of really different, right? I think once everybody's in that mentality of what they expect their version control system tool set to be and how they expect to work and the efficiencies that they expect to get out of developing and how they expect to collaborate, right? It doesn't really matter which of the three it is because that's your mentality and you just have to remap that onto something slightly different.
Starting point is 00:24:43 A user interface that's slightly different, right? So that's been the push: not to alienate people that are Mercurial users, because, you know, I actually like Mercurial to a certain degree. There's a lot of interesting development decisions that I think were made smarter than Git and a lot of ones that I think were not. And I actually love, you know, drinking with people and talking about that for a really long time, because I can, at least to a fair depth, because I've been doing both of these.
Starting point is 00:25:10 I actually had a lunch one time with Augie, who's the guy that has been maintaining the hg-git plugin, who's a Mercurial hacker and has always been a Mercurial guy, and me and another friend of ours. And the three of us basically spent the entire time drinking and eating pizza and talking about the differences in the transport protocols between Git and Mercurial. And I was like, this is a conversation that can only possibly be interesting to basically the three of us on the planet. But I love it. I think it's really interesting. But the point is that once you get into distributed version control, I think that that is the future of development. And the sooner that we get more developers over there, I mean, the better it is for obviously GitHub. Even if people are using Mercurial, that's better than people using Subversion, right? They're closer to using GitHub, or to being involved in an open source community that embraces that development style.
Starting point is 00:26:02 And that I feel like that's better for the open source community in general, right? Getting off of this. I mean, that's the other thing. Sorry, I'm sort of ricocheting here. But that's the other interesting thing is that, you know, I mean, our interests are more than just GitHub. Our interests are the entire open source community. We want the open source community to be vibrant and to be interesting because that's who we all are. That's who basically everybody at GitHub came from, right, and how we met each other. And so we want the open source community to thrive. And I feel like distributed version control systems, it's much easier to thrive as an open source community using that. When you're on Subversion, everybody has this sort of read-only thing and you can read it and you can improve it locally, and if you want to go through everything, you can extract a patch and mail it to a thing and go through that whole thing, and it's really heavyweight.
Starting point is 00:26:54 And then if you do that enough times, maybe they'll give you a commit bit, and then you can actually push stuff into the repository. You can actually commit something to the repository, and everything's so heavy, right? It's so difficult to get involved. And in Git, I feel like, or even in Mercurial, but in any of these distributed version control systems, because you can have these, you know, these sites like GitHub, where you can create a fork and have your own write permissions and share stuff without having to get the sort of blessing of the entire community and craft stuff that's nice and send it back. And everything's very easy, right? I mean, that whole process is so much easier and you have so much more power in it doing that. And if you don't get it back in, you can still keep it up
Starting point is 00:27:33 there, right? It's easier for everybody. So that's, I feel, a little bit less true in Mercurial than in Git. I think that it's easier to do forks and stuff in GitHub than trying to do like patch queues or something in Mercurial. But it's certainly easier than in Subversion, right? So that's the other push, is we want the open source community to be on one of them so that it's easier for us to collaborate and open source grows faster. Absolutely.
Starting point is 00:28:01 I think the big thing that GitHub did when they decided to build the system that you guys have is to take the projects and make the name, the namespace that everyone shares, your username rather than the project itself. So there's no GitHub slash whatever the project name is. It's username slash project name. And that's what really enables people to be able to make it. It turns it from being a technological problem to a social problem. Correct? And the other nice thing about that is that you don't have squatters, right?
Starting point is 00:28:29 Because you have your own namespace. You don't need to, you know, try and squat a name that you want like you do, even in the Ruby community, where the way that you get gems out is with Gemcutter, right? And how you used to do it was RubyForge. And so if you want a gem name, if you want a project name, you have to sort of squat it while you're working on it, unless you sort of work on it in private and put it up there or put up something that isn't really quite ready yet or something. But there's still a little bit of squatting, but it wasn't as bad as like SourceForge or something where, you know, half of the projects are dead just because they thought of a cool name. They're like, you know, backscatter, that sounds amazing, let's do a thing.
Starting point is 00:29:10 And then you put it up there, you know, this is what this will be. And then like 90% of the time it never happens, right? And in GitHub, maybe you create a project name, but you don't have to really squat it. You're not taking it from somebody else that could do something cool with it, right? You know, that's so true. On RubyGems, it cracks me up to 404 pages, page not found, but then it says, it will be mine.
Starting point is 00:29:28 Oh, yes, it will be mine. Sounds like me in domain name purchases. Yeah, well, that's a whole other thing that gets me angry, too. So the whole Git ecosystem right now, is there anything that really gets you excited, like the development of libgit2 and other projects like that? Yeah, well, I mean, the development of libgit2 certainly gets me excited because I'm sort of directly involved in it. But it's something that the Git community has needed for a long time, is a linkable Git library. Because the library, there is a libgit.a that is produced by building Git itself, but it's not reentrant. So
Starting point is 00:30:06 if you link to it and it gets to a certain point, and it does this all over the place, it was built as sort of a command line tool, so it'll just call die. And so your program, whatever it is, will simply die if it gets to that point. And so you can't really use it. There's no stable defined API that won't change. Everything just sort of changes all over the place. It's sort of a mess. And the tool is great and there's a ton of really smart people working on it, but there's no linkable library. So you can't really build like a GUI on top of it, which is why they were slower to come. And so libgit2, which is the linkable library that's reentrant and has a stable API and all that stuff, has been in the works for years, ever since, I think, it sort of started when I went to one of the GitTogethers every year after the Google Summer of Code conference.
Starting point is 00:30:57 A lot of the Git people are around, so we do a GitTogether where all the Git developers get together and talk about stuff. And I was basically showing people all of these different implementations that I was talking about that I had done of Git in all these languages, like, you know, in Ruby, and I helped with some of the Python stuff, I think.
Starting point is 00:31:15 And I did one in Erlang, or possibly two in Erlang. And I did one in ActionScript. I was showing all this stuff and I was like, this is necessary because there is no linkable library,
Starting point is 00:31:24 right? Otherwise, we could be building wrappers and NIFs and stuff. And so the project sort of started, but it never really went anywhere. And then last year, for the Google Summer of Code, somebody put up a thing that they would be interested in working on it. And I became the mentor sort of by default. I wasn't really planning on doing it, but him and I worked together and then he got really, really far with it. I had a great student, Vicent, and he got really, really far with it and it became really usable. And so GitHub decided to just keep paying him basically to keep working on it. So it's sort
Starting point is 00:32:00 of the indefinite Google Summer of Code where we took Google out and then replaced it with GitHub. And then, you know, he's still a student and we keep paying him to work on it. And we've gotten a couple other people. Jeff King from the Git community is a really huge Git developer. He's sort of, you know, partially working on it as well. So now GitHub is sort of driving the development
Starting point is 00:32:19 of this libgit2 library where we can use it in stuff on our backend, which would be really nice for us. We're doing a Ruby wrapper for it as well. And we're getting contributions like a Python wrapper and a .NET wrapper and an Objective-C wrapper and stuff. So you can use it from all these different languages, which has sort of historically been another thing that's nice about Mercurial, that you can write tools and stuff where it has this nice API and you can sort of extend it. And now I think libgit2 is getting, it's almost far enough along
Starting point is 00:32:52 where, you know, we'll have wrappers where it's just as easy to write something in Python using Git as it is using, you know, Mercurial, even though Mercurial is written in Python. But then you could also do it in Ruby or in shell scripts or in Objective-C or in whatever language you like, right? I mean, we have like Lua wrappers or something for it. So that's what I really want to get to, where there's these nice APIs in almost every language on this nice, fast, stable, reentrant, thread-safe library. So that's one of the things I'm really interested in, is not just the development, because I'm not great at C. I can't really do the code. In fact, the way that I was
Starting point is 00:33:30 doing this Google Summer of Code stuff was I would define what I wanted the API to look like in Ruby, basically. He would write it all in C. I would look at the .h files, write the wrappers in Ruby, and then write the unit tests in Ruby to see if the stuff that he wrote in C worked or not, which is possibly not the best way to be doing that, but it was a lot better than me actually trying to look at his C code. And that's largely kind of how we still do stuff is I make sure the Rugged Ruby wrapper works for stuff and that I can build the things I want to. But then evangelizing that and saying, you know, when I go to companies or when I go to talks or something, saying, here's this cool library with all these bindings, write something cool with it because the backend is incredibly flexible, right?
Starting point is 00:34:11 It's basically just this key-value store and this, you know, sort of linked list of snapshots of manifests of this file system. And you can do whatever you want with that, that syncs well and easily and incrementally. And so anything you can think of that would use a structure like that, you can do in Git. It doesn't just have to be version control. And so that's what I'm really interested in for the next couple of years, because we're going to have, you know, libgit2 and all these nice bindings, so everybody can write all these cool scripts and stuff that do all this custom stuff. But then also EGit. You were asking about stuff I was excited about in Git. EGit is the Eclipse Git plugin, and the Eclipse project has sort of embraced Git as their next version control system, basically from CVS. They never really embraced
Starting point is 00:34:59 Subversion that well. And so they're all working on this EGit plugin for Eclipse that does, you know, everything where you don't have to install Git. You can simply install this plugin. It has a pure Java implementation of Git in it. And you can do everything in there. NetBeans has a great plugin now as well for their editor. So, you know, all this stuff is coming online. All these GUIs are starting to get written. Git Tower just went 1.0 yesterday, I think, which is a nice professional paid-for Mac app, Git GUI, that I've seen a lot of people using and liking a lot. So yeah, anyways, as all that stuff happens,
Starting point is 00:35:37 as all the GUIs get developed, and as these scripts get bindings that are fast and capable and have this nice API to them, I'm really excited to see what people are going to be doing with Git, right? So my job now is not so much doing sort of the proof-of-concept stuff, although I do do that a little bit with some things like large file support and things like that, but mostly telling people what's out there and then seeing what they do with it, you know?
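The "key-value store plus linked list of snapshots" description above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not libgit2 or Git's real storage code: `ObjectStore`, `toy_commit`, and `history` are hypothetical names, and the commit body format is deliberately simplified. The one detail taken from Git itself is how a blob's key is derived: the SHA-1 of a `blob <size>` header, a NUL byte, and the content.

```python
import hashlib


def git_object_id(kind, body):
    """Hash an object the way Git keys it: sha1("<kind> <size>\\0" + body)."""
    header = f"{kind} {len(body)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


class ObjectStore:
    """Toy in-memory key-value store keyed by content hash (hypothetical)."""

    def __init__(self):
        self._objects = {}

    def put(self, kind, body):
        # Identical content always lands under the same key.
        oid = git_object_id(kind, body)
        self._objects[oid] = (kind, body)
        return oid

    def get(self, oid):
        return self._objects[oid]


def toy_commit(store, snapshot_oid, parent_oid, message):
    """Store a simplified commit that points at a snapshot and its parent.

    The body layout here is an assumption for illustration, not Git's
    real commit encoding, so these ids will not match real Git commits.
    """
    lines = [f"snapshot {snapshot_oid}"]
    if parent_oid is not None:
        lines.append(f"parent {parent_oid}")
    body = "\n".join(lines) + "\n\n" + message + "\n"
    return store.put("toy-commit", body.encode("utf-8"))


def history(store, oid):
    """Walk the linked list of commits from newest to oldest."""
    messages = []
    while oid is not None:
        _kind, body = store.get(oid)
        header, _, message = body.decode("utf-8").partition("\n\n")
        messages.append(message.strip())
        parent = None
        for line in header.splitlines():
            if line.startswith("parent "):
                parent = line[len("parent "):]
        oid = parent
    return messages
```

Because keys are derived from content, two copies of the store can be synced incrementally by exchanging only the objects the other side lacks, which is the property the interview points to when it says the structure "syncs well."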
Starting point is 00:36:03 So as GitHub has become more and more popular, a lot of your users aren't necessarily experienced with source control systems, and I've found that a large number of the more, you know, the beginners don't understand Git as a concept fully. It's just a natural thing that happens. Is there anything that you feel that the whole community really needs to take the time to learn, uh, in general that you can think would help them a lot, like, you know, learning what a rebase actually is and things like that? Um, I'm, I'm sort of split on that. I'm not really sure that, I mean, what I like to do is teach sort of basic concepts of what Git is trying to do. Because a lot of people, especially from the
Starting point is 00:36:45 developer community and some designers and stuff as well, have come from this Subversion world where, I mean, the interesting thing about version control is that most people for a long time don't take it seriously. It's not taught in universities, really, which might be part of the problem. You know, I was never really taught version control when I was at university. And that was fairly recently, you know, I mean, I graduated in 2002. And I went to UCSD, and they didn't really teach version control anywhere. And it certainly wasn't sort of presented as a tool, right? They taught programming, they taught languages, they taught assembly, they taught, you know, all this stuff, but not version
Starting point is 00:37:21 control. That was not really considered, and even like editors and stuff, right? But version control wasn't really considered a tool set that was important. And I think that's sort of gone through a lot of the industry: a lot of people see it as sort of a necessary evil, right? You have to have it so that you don't lose everything. Not, this is a tool that can make you better at your job, right? Or can make your life easier as a developer. A lot of people don't see version control that way, and it may be because it hasn't really been like that as much. Whereas I feel like Git, even though it's sort of complex, I mean, it is more complicated, you can do very complicated
Starting point is 00:37:53 things, but I think it's worth investing the time to learn it, to get a book, to read it. I mean, you know, I have stuff that's free. I've been trying to do a lot of evangelization for Git itself, but also, you know, writing stuff down so people can learn it as easily as possible. But I feel it's worth it. Like, people think of Subversion and they're just like, okay, here's the eight commands you need, and that's it. And they don't really learn it in depth, right? And they kind of want to approach Git the same way. And I feel like it's important to learn it, to say this is a tool set that is as important as learning an editor, right? As learning vi or learning Emacs or learning, you know, Eclipse. Everybody spends hours and hours learning their editor. Nobody uses Notepad to do programming, right? And Subversion,
Starting point is 00:38:41 people use it like Notepad. They're just, okay, at some point I'm going to commit, and that's it. And I feel like people should take it seriously as a tool set that gives them power, right? That gives them a lot of power. Like learning Emacs as a power user or something gives you a lot of power. Learning Git gives you a lot of power, and that should be a focus of places, and even of schools and stuff, to make sure that people see that as that tool and not as a necessary annoyance, I guess. I mean, I can do Git in an hour and I do that a lot, but I like the ones where it's all day and I'm teaching a lot more stuff on how to think about version control and how to use it as a tool that makes you better at what you're actually trying to accomplish, right?
Starting point is 00:39:27 Collaborating with people, at looking through your history, at figuring out what happened, at peer reviewing code, at doing merges and working independently on different branches at the same time, that sort of thing. So, yeah, I mean, I think it's more of a mind shift
Starting point is 00:39:42 that people have to see the tool as a different class of tool than people used to think about version control, I guess. I'd like to switch gears for a moment and talk about another one of your projects, ShowOff. What's the inspiration behind this and what's the state of ShowOff? ShowOff is, I've been using it for almost all of my, it's a presentation tool. So the idea behind ShowOff is you write your slides, because I do a lot of talks. I do a lot of training. I do a lot of conference talks.
Starting point is 00:40:14 And so I make a lot of slideshows, basically. And, you know, I mean, a lot of people do. It's, you know, word processing, slideshows, Excel spreadsheets. Like, those are sort of the big three that are in all the Office formats because everybody uses them for stuff. I was using slideshows a lot. I used Keynote for a really long time. It was not bad. It's fairly nice software, but there's a lot of things I couldn't do with it. One of them is version control. As I'm telling people to take version control more seriously, I try to make sure that all the stuff I'm doing is version controllable,
Starting point is 00:40:49 especially for like the training stuff. Because if you think about it, if you're doing training, you know, every couple of weeks or something, and it's variations on a theme where you have a whole bunch of different sections and some companies want some sections and some want others, and it changes slowly over time or you have to customize stuff for certain companies, it's very nice to be able to branch and merge your presentation, right? And you can't do that with Keynote. It's just not really possible to do it and be able to manage it properly. And the other thing about presentations, especially the way that I do them, which is generally a couple of words on a slide, you know, I have a sort of bare presentation style, is it's just text, right? I don't have a
Starting point is 00:41:33 ton of animations. Most of the stuff in word processors and in presentation software is 95% of that stuff is never used by anybody, right? Even if they know it's there, just because I'll use it on one or two slides, maybe for animation or something like that. But generally you don't care. It's just words. It's, you know, examples, code, things like that. And so what I wanted to do was have everything in a basic text format.
Starting point is 00:41:58 So I chose Markdown. So you write everything in Markdown and then you run a thing and it creates HTML off of it. And then it's an HTML-powered presentation, right? But it's awesome because I can version control everything. I can have all the different subsections and subdirectories and then move them from slides to slides, or I have a little showoff.json sort of index file where I can remove lines in and out as I do and don't want different sections. And it just makes it as easy to write my presentations as it is to write code, right? And manage them and share them and have people fork them and fix them and send me pull requests and all of that stuff works, right? If it's plain
Starting point is 00:42:34 text, anything that you can do in plain text, I like doing in plain text if possible, right? And my presentations really didn't have that much that I couldn't do in plain text. It's just that there was no real tools to be able to do it very easily. There was like Slidy and S5 and stuff, and they didn't really fill my criteria of being simple and fast to write slides. And mine is just basic Markdown, and it works great. And then the other cool thing is you can add custom JavaScript, custom CSS. You can use tools that are used for web development to
Starting point is 00:43:05 do custom things in your slideshows, right? So I do a lot of, you know, git commands. So I'll type a git command on the command line and then show the output. And it's very difficult to do in Keynote, and in ShowOff, you know, I just use a jQuery plugin that does typing. So it looks like I'm typing it at the time and then all output comes in after that. And I don't have to program any of that. I just have to put a style on the slide that says this is a code example. And it'll just type it out for me as I'm hitting the button. So like that sort of stuff. You can also do fun stuff like the other week, I've been playing with the Kinect, you know, the Microsoft Kinect device. I got one of those, and there's open source drivers for them on GitHub.
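The Markdown-plus-index workflow described above can be sketched as follows. This is a minimal illustration, not ShowOff's own code: it assumes the index is a JSON object with a `sections` list of `{"section": name}` entries and that slides inside a Markdown file are separated by lines starting with `!SLIDE`. The functions `split_slides` and `build_deck` are hypothetical, and `section_files` stands in for reading each section's subdirectory from disk.

```python
import json

SLIDE_MARKER = "!SLIDE"  # assumed slide delimiter


def split_slides(markdown_text):
    """Split one Markdown file into slide bodies at each marker line."""
    slides = []
    current = None  # None until the first marker is seen
    for line in markdown_text.splitlines():
        if line.startswith(SLIDE_MARKER):
            if current is not None:
                slides.append("\n".join(current).strip())
            current = []
        elif current is not None:
            current.append(line)
    if current is not None:
        slides.append("\n".join(current).strip())
    return slides


def build_deck(index_json, section_files):
    """Assemble slides in the order the index lists its sections.

    section_files maps a section name to that section's Markdown text
    (a hypothetical stand-in for files in per-section subdirectories).
    """
    index = json.loads(index_json)
    deck = []
    for entry in index["sections"]:
        deck.extend(split_slides(section_files[entry["section"]]))
    return deck
```

Dropping a section for a particular client is then a one-line change to the index, which is why the whole deck diffs, branches, and merges like source code.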
Starting point is 00:43:46 And so I was playing with that on the Mac. And I made it so I could control the presentation with a Kinect. So I used FireWatir. You guys are familiar with FireWatir. It's like a browser testing thing. So it'll click buttons in your browser, basically,
Starting point is 00:44:02 for you. So I just hooked that up. It's like Selenium. What's that? It's like Selenium. Yeah, it's Selenium, right. And so I made a really simple wrapper that just took input from the camera, cleaned it up, saw when I was doing left to right or right to left movements with my hand, and then hit buttons in the browser basically that made it go back and forth in the thing. So that would be difficult to do in Keynote, like to try and send Keynote a signal to go to the next slide programmatically.
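The "cleaned it up, saw left to right or right to left movements" step might look roughly like the sketch below. Everything here is an assumption for illustration: the interview does not describe the actual wrapper, so the normalized x-coordinates, the moving-average smoothing, the `min_travel` threshold, and the mapping of left-to-right onto "next" are all hypothetical choices.

```python
def smooth(xs, window=3):
    """Moving average to damp jitter in the tracked hand position."""
    out = []
    for i in range(len(xs)):
        chunk = xs[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def detect_swipe(xs, min_travel=0.3):
    """Classify a short run of hand x-positions (0.0 = left, 1.0 = right).

    Returns "next" for a left-to-right swipe, "previous" for
    right-to-left, and None when the hand did not travel far enough.
    """
    if len(xs) < 2:
        return None
    cleaned = smooth(xs)
    travel = cleaned[-1] - cleaned[0]
    if travel >= min_travel:
        return "next"
    if travel <= -min_travel:
        return "previous"
    return None
```

The returned label would then be handed to whatever drives the browser, for example a browser automation tool pressing the slide-advance key.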
Starting point is 00:44:28 And it's very easy to do because I'm using a browser, right? I mean, all of this stuff that already works for browsers, you can use with your presentation software. So that's what ShowOff is. A lot of people are using it because it's fairly easy to do. It's easy to, you know, get up and running and version control. And you can say stuff like showoff heroku and it'll Heroku-ize it for you and you can push it to Heroku and then your presentation is on Heroku. Or now you can say showoff github and it'll create a gh-pages branch. You can push it to GitHub and we'll serve it statically off of GitHub Pages. So you can share the presentation easily,
Starting point is 00:45:05 and you can share the source for the presentation easily, which is really nice. One of the first ShowOff presentations that I did, somebody did a presentation of ShowOff in ShowOff, and then I wanted to do one as well. I had a little lightning talk that I wanted to do for it. So instead of creating my own, I forked his, changed it a little bit to fit the format of the new presentation,
Starting point is 00:45:24 and then I gave a presentation on ShowOff using ShowOff using a forked version of ShowOff. So I was really happy with the metaness of basically that entire experience. The other cool thing about that is that, so that's ShowOff that you're asking about, but that's also an interest of mine is making tools for things that don't need huge, heavy, overdone GUIs for them. That's what the entire industry uses, right? So presentation software is an example. Word processing is another example. There's a lot of stuff that you need Word for, right, that does really complex stuff.
Starting point is 00:45:59 But there's a lot of stuff, most stuff, that you don't. So I wrote a book for Apress, the Pro Git book I wrote for Apress. And basically the process for that was you write everything in Word. They give you a style sheet that the publishing tools know about. So you have to stay within these like eight styles, right? So already you have a lot of constraints on that, right? So it's not like you can do anything in Word. You can't really use Word as the full tool.
Starting point is 00:46:26 You can only use these 10 styles. And so I felt that that was really dumb because it's basically just a bunch of words, inserted images, and then everything's constrained within these 10 styles. Why are we not using Markdown, right? Or ShowOff, or I mean, not ShowOff, Markdown or AsciiDoc
Starting point is 00:46:41 or some structured markup language that is very simple and does this thing simply, right? So I wrote my book in Markdown. I had to export everything from Markdown to Word for the copy editing phase and then export everything from Word back into Markdown to publish the website at the end, which was one of the most horrible experiences of my life generally. But what I'd really like to see is a tool chain for technical authors for writing books about open source projects, for writing just normal tech books like Pro Git, like normal technical books that all of us read a couple times a year probably. I'd like that entire process to be much simpler.
Starting point is 00:47:27 It's because there's thousands of authors doing these books, and there should be more. There should be a small manual for every open source project, basically. I think it would be really helpful to have that, to have a Rails manual for every project. And it's not really done because the authors have to come up with all this stuff. They have to create a website for it. They have to create, you know, figure out how to generate a PDF or a Mobi file and EPUB file or, um, you know, all of the different publishing
Starting point is 00:47:52 standards. But if you want to read it on your Kindle or your iPad or something, right. So that's one of the projects I'm working on right now, is trying to do that for not just word processing, but like writing books or writing manuals or writing novels or anything that doesn't take, it's not a children's book, right? Anything that has text, a couple of styles and maybe some code examples or some math formulas and some images and that's it. Right. That's Git Scribe?
Starting point is 00:48:19 What's that? Yeah. So that's Git Scribe. So I'm working on that right now. And I'm actually sort of in the process of possibly creating a guide to GitHub book for O'Reilly where I actually use this process to write the book. So sort of as a pilot project for it. So I'm doing that and the book simultaneously so that I can make sure that the process is good. But I mean, there's a lot of other things for writing technical books: handling translations, pegging versions of the book to versions of the application that you're trying to document, taking errata, you know, all that stuff. And every technical book publisher does not do this well, basically. I mean, they have different variations of how they do this, but a lot of it is DocBook, which is better in that you can,
Starting point is 00:49:05 you know, it's text and you can merge it and stuff like that, but not very easily. Or most of it is Word documents, and that's just awful. I have to kill that. My goal in life is to kill Word documents for technical publishing because it's not necessary. It's so overkill and bad. You have to lock the chapter sort of one chapter at a time through like soft email locks and say the technical editor has this chapter now. And that's just horrible, right? There's no reason that shouldn't be mergeable and you shouldn't be able to get line by line changes. Absolutely. We're writing a book on Sass for Manning and Jason Williams is writing the RabbitMQ book.
Starting point is 00:49:46 Luckily, he's trailblazed a lot of this for me where I'm writing in Markdown as well, kind of like what you were doing. But it's a crazy tool chain with Haskell and some other tools in there. We just need to find some sort of standard that not only for open source books and e-books, but even all the publishers because I've done
Starting point is 00:50:01 three different publishers and they all have a different workflow. Yeah, I mean, eventually I'd like to see, I think AsciiDoc is a fairly good sort of text standard for that, because it outputs to DocBook, and there's a lot of tool chains that will take DocBook and give you nice looking PDFs and that sort of thing. So that's what I'm concentrating on, is having some Rails type thing for writing books, where you can say, git scribe init, and it gives you a layout for how to write the book. Here's where to put images. Here's what the AsciiDoc looks like.
Starting point is 00:50:29 Here's a cheat sheet for AsciiDoc. You just commit there. You push to GitHub. We generate EPUB, MOBI, HTML, chunked HTML, that sort of stuff for you. And you don't have to worry about... The authors don't have to worry about any of that. The author's job should just be writing words and nothing else.
Starting point is 00:50:48 And there's no tool chain for that right now. And everybody makes up their own. So if you're an author and you go back and forth between different publishers, it's a whole new game of horrible, right? All right, so one last question before we're running out of time here. Who is your programming hero? Um, everybody that works at GitHub is basically my programming hero. Um, it's actually really embarrassing because, um, you know, I, I've, I don't know, I've been, I've been working, I've been doing computer programming for 10 or 12 years, I guess, 10 years, about 10 years, probably. And, uh, you know, most of the places that I was at,
Starting point is 00:51:27 I kind of felt like a lot of these guys... I had a lot to teach everybody, especially at the beginning when you're sort of the arrogant right-out-of-school guy. You know, like, you guys are all idiots. This is how we do it. But now at GitHub, it's the first place where I kind of feel like everybody that I work with is smarter than me.
Starting point is 00:51:46 Um, and, uh, it's, and I, I think a lot of the other guys kind of feel that as well. So it's just a high quality place, but, um, I, I'm constantly looking at Chris's code for examples of how to, how to do stuff. Like if I say, I, you know, if I'm saying I'm writing some command line thing, I look for something, you know, at, at rip or something that, that Chris has written, um, uh, or Tom or something as a command line tool and say, what are the tools that they were using, you know, to, to do this. And, and because, you know, they're, they're all really, really smart guys. So, um, and then everybody that we've been hiring after, I mean, we were all sort of more generalists. Um, everybody that we've been hiring since then is,
Starting point is 00:52:22 are so, you know, laser focused. I mean, Ryan Tomayko is one of the smartest guys that I, you know, I know. And so it's almost embarrassing to hire these guys because then they go through and look through your code and you know, you just, you don't want that to happen. They're like, what were you thinking? And I'm like, I don't remember. So, yeah, nowadays it's a lot of the newer guys that, you know, are really, really smart going through my code and telling me what I did wrong in the first place. But, yeah, so I can basically learn from everybody at GitHub for a long time to come because they all have – they're all different in different ways, right? Different – I mean, Ryan's, anyways, that's, yeah. Awesome stuff.
Starting point is 00:53:10 Well, thanks for taking the time. That was a horrible answer. I'm so sorry. No, it was perfect. Thanks for taking the time today, Scott. We surely appreciate it. Yeah, absolutely. This was fun. I found myself for the first time.
