The Changelog: Software Development, Open Source - Git, Showoff, XBox Kinect (Interview)
Episode Date: February 22, 2011
Kenneth and Wynn caught up with GitHubber Scott Chacon to talk about Git, distributed version control, and his quest to kill Word as a book authoring tool....
Transcript
Welcome to the Changelog episode 0.4.9. I'm Adam Stachowiak.
And I'm Wynn Netherland. This is the Changelog. We cover what's fresh and new in open source.
If you found us on iTunes, we're also on the web at thechangelog.com.
We're also up on GitHub.
Head to github.com slash explore.
You'll find some trending repos, some featured repos from our blog, as well as the audio podcasts.
And if you're on Twitter, follow changelogshow, changelogjobs, and me, adamstac.
And I'm pengwynn, P-E-N-G-W-Y-N-N.
This episode is sponsored by GitHub Jobs.
Head to thechangelog.com slash jobs to get started.
If you'd like us to feature your job on this show,
select advertise on the changelog when posting your job,
and we will take care of the rest.
First up this week, a great organization, Recruit Military,
is looking for a Rails 3 dev.
Familiar with RSpec 2, Cucumber, Sunspot Solr, Resque,
Chef, jQuery, Backbone.js, a number of technologies here.
Such a great organization that helps find jobs for servicemen and women returning from overseas service.
If you're interested, lg.gd slash 7yankee.
If you're a Houston-based Ruby and Rails developer,
the fresh revolutionary marketing agency Media3 Creative is looking to talk with you.
Actually, it's me who's wanting to talk with you.
I joined Media3 Creative a few weeks back,
and I'm currently building an awesome dev team to work with,
so check out lg.gd.com or email me at careers at media3creative.com.
And if you live to code where the user meets the app on the front side
and you're open source friendly,
you like implementing interfaces in iOS, Android, web, and more,
be sure and look up Austin-based The Frontside,
but you can work anywhere, I understand.
Shortcode lg.gd slash 8uniform.
Fun show this week.
We talked to Scott Chacon over at GitHub about Git and ShowOff
and even a little Xbox Kinect.
It's quite the range, huh?
It is quite the range.
What was the perspective in terms of what we talked about?
As far as Git?
Yeah, like, was it a lot of Git?
Was it a little bit of Git?
It was probably 90% Git, and not so much GitHub this time,
which was a good mix to talk about how Git compares to Mercurial
and some other distributed source control systems
and how Scott kind of sells it to other communities
that aren't as entrenched in Git as perhaps the Ruby community is
and kind of the heritage that you and I come from
and how he sells enterprises on the need to get off tools like Subversion
and into a truly distributed source control system.
He's been a really good guy in terms of promoting Git over the past few years.
Absolutely. I think he's taught a lot of us what we know about the tool. Also talked about his
ShowOff presentation app that he hopes will be a Keynote killer, where you write your presentations
in web technologies. And then also one of his hobby projects, Connect2B, which is Ruby bindings for libfreenect that allows you to control the Xbox Kinect
on your Xbox 360 console from Ruby.
Really, really cool.
A quick word on Red Dirt RubyConf
taking place in Oklahoma City on April 21st and 22nd.
We'll be doing a live episode of the changelog
at the end of day one,
and Wynn's going to talk about day two.
Day two is action-packed, full of training
for some experts.
Ryan Smith from Heroku and Wesley Beary
from the fog gem will be doing some cloud training.
Don't miss Optiva doing some
JRuby training, as well
as I will be participating in
some Titanium Mobile training with the
guys at Appcelerator, Marshall and Kevin
over there. But probably the most important
part, catch our bud Erik Michaels-Ober,
and also Nick Quaranto from Gemcutter doing some open source training.
So there's a whole track on how to contribute to open source.
So be sure and catch Red Dirt RubyConf coming up in April in Oklahoma City,
April 21st and 22nd.
And registration is open right now, so head to reddirtrubyconf.com.
It was a fun episode. Should we get to it? Let's do it.
Chatting today with Scott Chacon from GitHub. So Scott, I think a lot of the listeners probably
know who you are, but for those that don't, why don't you introduce yourself and your role at GitHub?
Yeah, yeah. My name is Scott Chacon. I've been working for GitHub since almost the beginning.
I started contracting with them when it was still sort of a side project for Tom and Chris
and PJ. And then we all sort of quit our jobs at the same time and started working full
time about two and a half years ago. So yeah, so I've been working at GitHub for a while, and I do a lot of Git documentation stuff.
I'm not very good at C, so to contribute to the Git project, it tends to be more writing and
teaching and that sort of thing. So I do a lot of training for GitHub and doing a lot of conference talks. And I wrote a book called Pro
Git that was published by Apress under a Creative Commons license that you can get online, and
a PeepCode PDF, that sort of thing. So I like Git a lot, sort of weird things you can do with it and
teaching it and getting people interested in it. So that's me.
So how did you come to, to discover Git?
Um, so at my previous job, I worked at a company called Reactrix, uh, which is now, uh, has gone
out of business, but, uh, well, it went out of business a week after I left. So I like to think
that I, you know, a business cannot sustain itself without me on its payroll. So I sort of took it down.
But when I left, or when I started there,
we were trying to do content distribution for things,
for these devices.
And so we were using,
we would just create an RPM of the software
and we'd, you know, SCP it over.
And that was very, if we change one file, we'd have to
create a whole new thing. And so there was no incremental transfer. It was very, uh, it was
very difficult to do, wasted a lot of time and, and, and bandwidth and stuff like that. So what
we wanted to do was something like, um, rsync. And we found out that Git was actually a really
good sort of rsync for what we were trying to do. So we would put everything in Git, and we would create these custom trees
of just the content that was needed out of the system,
and then have the client fetch it.
And we didn't actually have to have, even though we had hundreds of clients
that all had to have different combinations of that content,
we didn't have to have hundreds of subdirectories with just the content
that each one needed so we could RCP just what it needed. What we would do is we'd do it artificially in Git using sort of the index
and just say, okay, just these five directories and not everything else and commit it. And it
would never actually exist on disk, but we could have the client fetch it and it would come over
and then be on disk on the client. So it was awesome. It actually worked really well as sort of this strange R-Sync sort of replacement.
So that's how I started using it.
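A minimal sketch of the per-client tree idea described above. It is not the actual Reactrix system: the directory names, the `put`, `write_tree`, `tree_for`, and `fetch` helpers, and the in-memory dictionaries standing in for repositories are all hypothetical. It only assumes Git's content-addressed object model, where a tree is a list of names pointing at other objects, so a per-client tree can be composed from existing objects without copying files into a new directory. In real Git the same effect would be reached with plumbing around the index, such as `git read-tree` and `git write-tree`.

```python
import hashlib

def put(store, kind, body):
    """Store an object under the SHA-1 of its Git-style header plus body."""
    data = f"{kind} {len(body)}".encode() + b"\x00" + body
    oid = hashlib.sha1(data).hexdigest()
    store[oid] = (kind, body)
    return oid

def write_tree(store, entries):
    """entries maps name -> (mode, object id); returns the new tree's id.

    Simplification: names are sorted as plain strings. Real Git sorts
    directory names as if they ended with a slash.
    """
    body = b""
    for name in sorted(entries):
        mode, oid = entries[name]
        body += f"{mode} {name}".encode() + b"\x00" + bytes.fromhex(oid)
    return put(store, "tree", body)

def children(body):
    """Return the object ids referenced by a tree body."""
    ids, i = [], 0
    while i < len(body):
        nul = body.index(b"\x00", i)
        ids.append(body[nul + 1:nul + 21].hex())
        i = nul + 21
    return ids

def fetch(server, client, want):
    """Copy `want` and whatever it references that the client lacks.

    Returns the number of objects transferred.
    """
    sent, stack = 0, [want]
    while stack:
        oid = stack.pop()
        if oid in client:
            continue  # already present, nothing to transfer
        kind, body = server[oid]
        client[oid] = (kind, body)
        sent += 1
        if kind == "tree":
            stack.extend(children(body))
    return sent

# Hypothetical content directories held on the server.
server, dirs = {}, {}
for name in ["ads", "games", "menus", "promos"]:
    blob = put(server, "blob", f"content for {name}\n".encode())
    dirs[name] = write_tree(server, {"data.txt": ("100644", blob)})

def tree_for(names):
    """Compose a root tree holding only the directories one client needs."""
    return write_tree(server, {n: ("40000", dirs[n]) for n in names})

client = {}
first = fetch(server, client, tree_for(["ads", "games"]))
second = fetch(server, client, tree_for(["ads", "games", "menus"]))
print(first, second)  # prints: 5 3
```

The second fetch moves only three objects (the new root tree plus the added directory and its file), which is the incremental-transfer property that made Git usable as an rsync stand-in in this story.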
So everything that I started using was, we were actually using Perforce as the RCS type thing at the time for the software.
And we were just using Git to do this rsync stuff.
So I was sort of fascinated with the system. And as it became used more, I found that
I knew a lot of the sort of underlying plumbing stuff, and not very many people did. And I really
enjoyed it. I thought it was a really cool model. So I wrote the PeepCode PDF first, and that's
sort of how I got into being the Git guy. I also went to meetups here in San Francisco,
which is how I met the other guys, is how
I met Tom and Chris and PJ. And basically every week that I would come, I'd demonstrate
some other language that I'd partially implemented Git in because, you know, it doesn't have
a linkable library or it does now, but it didn't at the time. And so I would be like,
hey, look, I've re-implemented the blob writing and reading mechanism of Git in ActionScript.
And people are like, why would you do that?
That makes basically no sense.
And so it was basically every single week.
It's like I did it in Ruby.
I did it in ActionScript.
I did it in some other language and Erlang, whatever.
And so people were – I think that's sort of how I got the reputation as the Git guy was just I was obsessed with it at all times and still kind of am actually.
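For readers curious what "the blob writing and reading mechanism" amounts to, here is a rough sketch in Python. It is not one of the implementations mentioned above; the function names `encode_blob` and `decode_blob` are made up for illustration. It assumes Git's default SHA-1 loose-object format: a header of the form `blob <size>` followed by a NUL byte and the content, hashed to get the object id and zlib-compressed for storage.

```python
import hashlib
import zlib

def encode_blob(content: bytes):
    """Return (object id, compressed bytes) in Git's loose-object format."""
    raw = b"blob " + str(len(content)).encode() + b"\x00" + content
    return hashlib.sha1(raw).hexdigest(), zlib.compress(raw)

def decode_blob(compressed: bytes) -> bytes:
    """Inverse of encode_blob: decompress, check the header, return content."""
    raw = zlib.decompress(compressed)
    header, _, content = raw.partition(b"\x00")
    kind, size = header.split(b" ")
    if kind != b"blob" or int(size) != len(content):
        raise ValueError("not a well-formed blob")
    return content

oid, packed = encode_blob(b"hello world\n")
print(oid)  # prints: 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
print(decode_blob(packed))  # prints: b'hello world\n'
```

Git would store the compressed bytes under `.git/objects/3b/18e512...`, using the first two hex characters of the id as the directory name.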
So as someone who's recently gone back to the corporate scene, a lot of times I'm having to sell folks on why they should throw away Subversion and move to distributed source control.
So as the guy that wrote the book on Git literally, what do you tell people when they are considering a distributed system?
Well, you know, I tell them that it's faster. I tell them that their developers can work better.
I mean, it depends on who you're asking, right? If you're asking a developer, if you're asking
somebody that's, you know, making the purchasing decisions or something, but,
but, you know, having more efficiency for your developers, they can work offline,
they can work off VPN. All of the commands are faster. Branching and merging is easy to do. And it's
a very common operation, which is not common in basically almost any other version control system,
especially ones that people are switching from. But the offline stuff, I've been to places where,
you know, they have ClearCase or they have Perforce or Subversion or something, and their
system goes, especially like ClearCase, their system, their main server goes down or their network goes down for a
little while and basically everybody has to stop working completely, right?
And it's not as bad in Subversion where you can at least keep coding even though you can't
commit and stuff, but almost everybody's been bitten by that, you know, or they lose the
database and they have to recover it or something, and you tell them you can do everything offline in Git.
Everybody that's working on the project has a full backup of the system.
There's no single point of failure.
It's easy to, if the server goes down, to put up another one.
Everybody can keep working off that, branching and merging.
One of the big ones that I see people light up when I explain it to them
is the continuous reintegration.
You can continuously reintegrate branches in Git,
and that's very difficult to do in most other systems,
especially Subversion, even with the merge tracking that they have recently.
You can create a branch for changing your database backend
or adding translations to your system or something that takes a long time
and generally would be this merge hell that everybody would have to go through.
And you can just be on that branch and continuously reintegrate the master branch into it very easily.
And at the very end, just switch back and do a fast forward merge from master to whatever the branch is and get all of that stuff.
And if you're merging every day, you only get, you know, 24 hours' worth of merge conflicts at a time and not, you know, this huge, "there are 50 files that have conflicted."
If you're good about it, it's impossible to do that. So when I demonstrate that sort of stuff, that's when people really embrace it.
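The continuous reintegration workflow described above can be illustrated with a toy commit graph. This is only a sketch under stated assumptions: the commit names (`m0`, `f1`, and so on) and the helper functions are hypothetical, and commits are modeled as nothing more than a mapping from a commit to its parents. The point it shows is that once the long-lived branch has merged in the latest master, master's tip is an ancestor of the branch tip, so the final merge back is a fast-forward.

```python
def ancestors(parents, tip):
    """Return every commit reachable from `tip`, including `tip` itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen

def can_fast_forward(parents, target_tip, source_tip):
    """True if the target branch can simply move its pointer to source_tip."""
    return target_tip in ancestors(parents, source_tip)

# Hypothetical history: m* commits are on master, f* on the feature branch.
parents = {
    "m0": [],
    "m1": ["m0"],
    "m2": ["m1"],
    "f1": ["m0"],        # feature branch starts from m0
    "f2": ["f1", "m1"],  # first reintegration: merge master into feature
    "f3": ["f2"],        # more feature work
    "f4": ["f3", "m2"],  # second reintegration picks up m2
}

# Before the last reintegration master has work the branch lacks.
print(can_fast_forward(parents, "m2", "f3"))  # prints: False
# After it, moving master to the branch tip is a fast-forward.
print(can_fast_forward(parents, "m2", "f4"))  # prints: True
```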
And I think that's how most of us got really interested in the Ruby community about it,
which sort of embraced it early and fast, is we would do demos in the conference,
in sort of the side rooms of the conferences,
saying, look how cool this is to create branches and switch back and forth between them real fast
and merge them back and forth.
And it was so ridiculously easy when you're actually watching that,
that you can't not see how that would be good for your team and good for your development practices.
So as you've been going through this training with all these corporate clients and everything,
have you found it really difficult to sell the concept of Git for the people who are really fond of having a really federated
system where no one can touch their code unless they're authenticated via their exchange server
and systems like that?
Well, not really.
I mean, and that could just be because of the clients that I'm doing, right?
We're not, GitHub is not sort of selling into corporations and saying,
you should be using this thing.
We don't have salespeople that go out and do stuff.
It's always generated from within.
It's from developers that are using Git for open source projects on GitHub or something,
and then they want to use GitHub internally.
So they look for GitHub Enterprise, like our firewall install client, where you can buy it and run it inside your firewall.
And so they come to us for that because they want to do that.
And then they say, as long as you're doing that, you want to come and do some training as well.
And so we'll either throw that in with that.
So they've already embraced it in some way.
Or the other one that I do a lot is large corporations that do sort of Android development, so like big telecom type companies.
And so they want to be involved in the Android ecosystem, and it's using Git, so they sort of have to use Git.
It's very rare that we'll really go in and schedule a meeting and go in and sell people on the merits of Git.
It's more of a developers love it, and they use it in their off time, and then they try and get it into their company because there's a need for it, right? It's painful to use another
version control system if you're using Git in your spare time.
That's true. You know, one of the
ideas that I had last week was, you know, these adopt-a-highway sections of a highway where
they have a local group that goes out and just picks up trash on the highway or whatever. I
think we should all go out to Google Code and some of these other places and just adopt
a repo that's in Subversion and just pull it over to GitHub and mirror it.
Yeah, I mean, we've tried to make it kind of easy to do Subversion imports, but the
problem with Subversion importing is that, you know, changing from any version control
system to another is that it depends on the history of the system, right?
Like, really simple ones aren't that difficult, but, um, I've been to a bunch of companies
that, you know, have these really complex histories where they even moved from CVS to
Subversion and then they've been in Subversion for years and they have hundreds of thousands
of commits and, you know, they don't know how they want to split it up or, you know,
they've added a large file and then removed it again.
And so that import, you know, adds the big file into your clone and stuff.
And so a lot of
times that has to be sort of custom. I've seen people write custom, you know, importers with
like git fast-import, which is, you know, an incredibly time consuming process. But,
and then I've seen other companies where they just take the last snapshot and put it into Git
and they're like, screw everything else, let's just go. So it's so highly dependent on the team
around whatever the project
is, you know.
Yeah, whenever I have to do that, I usually just do git svn clone, and
if it's a big tree, it'll take hours and hours and hours. It's terrible.
So how pimped out is your Git config file?
Um, it's not. It's actually not, largely
because I do so much training and evangelization and stuff,
I don't want to have a very custom setup locally
where I'm typing commands that they can't type or something
if I'm trying to demo something.
So for a long time I had no Git aliases.
I'd have to type out everything all the time.
And I had no Bash aliases. So, you know,
most of the people at GitHub can type, like, you know, gci or something,
and it does a git commit with options and things. But I try and stay away from that so that
I still have sort of the new-user experience
and I can teach that. You know, I remember what all the commands are. I have weakened in my resolve recently or, you know, within the last year or so.
I added a git lol, which does a git log --graph --decorate --oneline.
And then, you know, so it gives me a nice sort of visual graph so I don't have to use gitk.
And git st, which I use for git status -s -b, which gives you a short status
in the newer versions of Git, sort of like the Subversion-looking output, which
is like question marks next to each name that's, you know, untracked and things like that.
Um, and that's a lot nicer looking than the sort of verbose git status output.
So those are my two cheats, but other than that, um, and I, I think I put a custom font
in for git gui and gitk. Um, but other than that, I don't really have very much in there because I don't want to cheat.
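As an illustration, the two aliases mentioned above could be set up with `git config`. The sketch below only prints the commands rather than running them, so nothing in a reader's configuration changes. The `ALIASES` table and the `alias_commands` helper are hypothetical, and the exact option spelling is an assumption based on the spoken description ("graph decorate one line", "dash S dash B"), not a copy of Scott's actual config file.

```python
import shlex

# Alias name -> the git subcommand and options it expands to (assumed spelling).
ALIASES = {
    "lol": "log --graph --decorate --oneline",
    "st": "status -s -b",
}

def alias_commands(aliases):
    """Build one `git config --global alias.<name> <expansion>` argv per alias."""
    return [
        ["git", "config", "--global", f"alias.{name}", expansion]
        for name, expansion in aliases.items()
    ]

for argv in alias_commands(ALIASES):
    # shlex.join quotes the expansion so the line can be pasted into a shell.
    print(shlex.join(argv))
```

After running the printed commands, `git lol` and `git st` behave like the longer forms.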
So do you use a lot of external tools with Git, like tig or GitX at all?
No, I don't.
I use git gui every once in a while, which is sort of the committing interface for Git on a GUI.
If I have a whole bunch of stuff that I've done and I want to break it up
into three or four commits and be really specific about it, because you can do line-level commits,
sort of patch, like git add -p, you can do that, but you can do it on a line-by-line basis,
which is a little bit nicer, so you can sort of go through that real fast.
A couple of guys at GitHub use GitX to do the same thing, which has a really nice interface for that as well. But again, I try and use whatever comes
with Git when I can so that I can teach it a little bit more broadly.
So why Git and not Mercurial?
Why Git and not Mercurial? So I have done a little bit of work in Mercurial. I did a plugin for Mercurial called hg-git, I started it,
which allows you to commit in Mercurial and then push to a Git
server, so you can use Mercurial and then push to GitHub, for example, to put the code on,
and then people don't, you know, necessarily, it uses Git as the transport mechanism, so
GitHub doesn't know that you're using
a Mercurial client for it, and
it's a one-to-one
conversion ratio.
Every object in Git has sort of a...
or every commit in Git basically has a
one-to-one relationship with
a commit in Mercurial.
And they're very similar. So when I was
writing that, I had to learn
sort of the back-end systems. How does it store its data?
What does the actual sort of format look like?
How does it think about the data that you're putting into it?
And it turns out that it's actually incredibly similar.
The main difference is how it actually stores it on disk.
It's not the objects themselves.
The objects themselves are actually incredibly similar, and it's not that difficult to go back and forth between them,
which is why branching and merging is just about as easy in Mercurial. A lot of stuff is the same.
So what I like to do is say, use whatever client you feel more comfortable with. I feel more
comfortable with Git because I like the branching model better. But recently, Mercurial has bookmarks,
which are very similar to the Git branching model. So if you want to use bookmarks, then you get
sort of the same thing. But other than that, they're incredibly similar systems. And so, you know,
the hg-git plugin is a nice thing because then we can say, use whatever client you want,
use Mercurial, use Git, push to Git, everybody can work together and nobody really
needs to know that other people are using, you know, whatever client they're most comfortable
with. But yeah, I mean, I used Git because the backend system,
originally, like I was saying, I was using it in a more low-level way.
And the backend system gives you a lot more power in Git.
It's a lot simpler.
The Mercurial one is much more complex.
It's sort of a hybrid between the Subversion model and the Git model,
where the Git model is just, here's all these objects in a database.
It's sort of a key value store, don't care.
And Subversion has this file-based log system where you have versions of each file in
a file named after that file name. And in Mercurial, it's sort of like that.
Like you have for every file you've ever had in your system, you have this file.i. It has a log
of every version of that file. And so it's a lot more like if you rename a file or remove a file,
you still have to have that log there and you have to have rename links and all this stuff.
It's much more complex, and Git's super simple.
It's just here's a manifest and a commit, and here's all the objects, and we don't really care.
We don't track renames.
We figure it out after the fact.
So that worked for what I was trying to do with the low-level stuff. And Mercurial, it's basically just a version control system, whereas Git, you can use the backend for basically anything that you can think about using a versioned
POSIX file system for, because
that's basically all that it is.
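A small sketch of "we don't track renames, we figure it out after the fact." The helper names and the sample file paths are hypothetical, and this is a deliberate simplification: it only pairs up paths whose content is byte-for-byte identical, whereas real Git can also report renames for files whose content changed somewhat. It assumes nothing beyond content-addressed blobs, where identical content always yields the same id.

```python
import hashlib

def blob_id(content: bytes) -> str:
    """Git-style blob id: SHA-1 over a 'blob <size>' header, NUL, and content."""
    header = b"blob " + str(len(content)).encode() + b"\x00"
    return hashlib.sha1(header + content).hexdigest()

def detect_renames(old, new):
    """Compare two snapshots ({path: content}) and pair up exact renames.

    Nothing about the rename is recorded in either snapshot. It is inferred
    from a path that vanished and a path that appeared with the same blob id.
    """
    old_ids = {path: blob_id(data) for path, data in old.items()}
    new_ids = {path: blob_id(data) for path, data in new.items()}
    removed = {old_ids[p]: p for p in sorted(old_ids) if p not in new_ids}
    renames = []
    for path in sorted(new_ids):
        if path not in old_ids and new_ids[path] in removed:
            renames.append((removed.pop(new_ids[path]), path))
    return renames

before = {"README": b"docs\n", "lib/util.py": b"def f(): pass\n"}
after = {"README": b"docs\n", "lib/helpers.py": b"def f(): pass\n"}
print(detect_renames(before, after))
# prints: [('lib/util.py', 'lib/helpers.py')]
```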
I really wish that there was a git-hg
plugin, personally, because I have
been involved in the Python community
every once in a while. I find someone
real stubborn who's working on
Bitbucket, and I have to push up to it, and it's
frustrating.
But I actually just watched a talk by you recently where I didn't realize that when you're using hg-git, it actually has a full Git repo inside of it and they can just clone
off of the bare repo in there and then work with that.
Yeah, yeah.
There are a couple of people that are using hg-git to do the opposite.
So hg-git, Augie Fackler and a couple other people have sort
of taken that over and made it a lot better than it originally was when I was working
on it. I kind of haven't been working on it for a while, but it's great, and a lot of people use it
now. Um, and he made it really, really fast, a lot faster than it was when I was doing it. But, um,
but yeah so a lot of people will use it where they'll use it sort of the opposite,
where they'll use it to take their Git stuff and put it into Mercurial
and push it just via the normal Mercurial thing,
because it does bidirectional conversions.
When you clone from Git, I have to turn them all into Mercurial objects,
and when you commit in Mercurial, I have to turn them into Git objects.
So it can do both ways.
And it's not that difficult to set it up the other way,
but it's not built in, it's not super easy
like it is with the Mercurial side of it, right?
Sounds like a good contributor could add that, right?
Yeah, I would look at it.
I don't remember.
There's, I think, a couple of people
have added some things to it
to make it relatively easy to do that sort of thing,
but it certainly doesn't ship with Git, right?
So we cover quite a broad range of listeners.
Do you want to go over some of the basic differences?
I've heard you talk of this before,
of why Git is being used by who it's being used by
and Mercurial at the same time.
Not that one is superior to the other in any way, that they're actually quite similar,
and then why one's becoming more popular than the other.
I guess one of the, I kind of consider it a mistake at this point,
but I'm too lazy to go back and redo it, is I made a website called whygitisbetterthanx.com,
and I put a bunch of other version control systems and basically just summarized
for people that are saying, why are you using Git? Especially, you know, a couple of years ago,
people are like, why are you using Git? And so I wanted to summarize, this is why we chose Git.
This is why people that use Git chose Git. But the mistake I made was that I put a lot of different
version control systems on there. I had it comparing to, to Mercurial and to Bazaar,
which are other distributed version control systems. And since then, um, you know, all the email I've gotten back is not defending Subversion
or Perforce or the other ones that I compared it to. They're all defending Mercurial and,
and sometimes Bazaar. Um, and, uh, and so I've, I've sort of changed my message to, to be,
you know, we don't care what you use. Everybody should be using distributed version control
because there's still a huge, huge, you know, population, especially in the corporate
world that's using Subversion, or centralized version control systems, for stuff. And I think in
most of those cases, it would do, you know, it would be better for the entire development team
there if they were using a distributed version control system. And the reason why, largely,
besides just offline work
and stuff like that, is branching and merging.
If people are using
CVS or Subversion
or any of the
RCS derivatives,
any of the centralized version control systems,
I guess I should say,
they have a different mentality of how
to develop, right?
And if you're using a distributed version control system, because you can sort of craft your
commits, you can think about it a little bit more, you can do stuff offline, you can decide when to
push and share with people, you can do branching and merging very easily. And it's very lightweight.
It's something where you say, I'm going to make a branch for every ticket that I'm working on or something.
Somebody in a centralized version control system, like if you're doing Subversion, that would make no sense, basically.
It would be so much overhead that it wouldn't be practical.
And in Git and in Bazaar and in Mercurial and distributed version control systems, those are sort of the top three.
That makes sense.
And so I want everybody to be
doing that. I want to, I want the mentality of the entire development community to be,
you branch first, you do stuff in branches, you merge it in when it's ready. And when we can get
people from Subversion over to any distributed version control system, that mentality changes.
And I, I did not have a hard time working in Mercurial, right? I mean, when I was doing it,
when I was writing the hg-git plugin, I did everything in Mercurial, and I thought it was fascinating, but it was not difficult to do.
It was not nearly as difficult as moving from Subversion to Git or from, you know, what was I
using before that? RCS, I guess, or, you know, from Subversion to Perforce or something that's
sort of really different, right? I think once everybody's in that mentality of what they expect their version control system tool set to be
and how they expect to work and the efficiencies that they expect to get out of developing
and how they expect to collaborate, right?
It doesn't really matter which of the three it is because that's your mentality
and you just have to remap that onto something slightly different.
A user interface is slightly different, right? Um, so that's,
that's been the push, is to not alienate people that are,
that are Mercurial users because I, you know,
I actually like Mercurial to a certain degree.
There's a lot of interesting, uh,
development decisions that I think were made smarter than Git and a lot of
ones that I think were not. And, and I actually, I love, you know,
drinking with people and talking about that for a really long time because I can, at least to a fair depth, because I've been doing both of these.
I actually had a lunch one time with Augie, who's the guy that has been maintaining the hg-git plugin, who's a Mercurial hacker and has always been a Mercurial guy, and me and another friend of ours. And the three of us basically spent the entire time drinking and
eating pizza and talking about the differences in the transport protocols between Git and Mercurial.
And I was like, this is a conversation that can only possibly be interesting to basically the
three of us on the planet. But I love it. I think it's really interesting. But the point is that
once you get into distributed version control, I think that that is the future of development. And the sooner that we get more developers over there, I mean,
the better it is for obviously GitHub, if people are using, you know, even if people are using
Mercurial, that's better than people using Subversion, right? They're closer to using GitHub,
or to being involved in an open source community that embraces that development style.
And that I feel like that's better for the open source community in general, right? Getting off of this. I mean, that's the other thing.
Sorry, I'm sort of ricocheting here. But that's the other interesting thing is that, you know,
I mean, our interests are more than just GitHub. Our interests are the entire open source community.
We want the open source community to be vibrant and to be interesting because that's who we all are. That's who basically everybody at GitHub came from,
right, and how we met each other. And so we want the open source community to thrive.
And I feel like distributed version control systems, it's much easier to thrive as an
open source community using that. When you're on Subversion, everybody has this sort of read-only
thing and you can read it and you can improve it locally, and if you want to go through everything, you can extract a patch and mail it to a thing and go through that whole thing, and it's really heavyweight.
And then if you do that enough times, maybe they'll give you a commit bit, and then you can actually push stuff into the repository.
You can actually commit something to the repository, and everything's so heavy, right?
It's so difficult to get involved. And in Git, I feel like, or even in Mercurial,
but in any of these distributed version control systems, because you can have these,
you know, these sites like GitHub, where you can create a fork and have your own write permissions
and share stuff without having to get the sort of blessing of the entire community and craft
stuff that's nice and send it back. And everything's very easy, right? I mean, that whole process is so much easier and you have
so much more power in it doing that. And if you don't get it back in, you can still keep it up
there, right? It's easier for everybody. So that's, I feel, a little bit less true in
Mercurial than in Git. I think that it's easier to do forks and stuff in GitHub than trying to do like patch queues or something in Mercurial.
But it's certainly easier than in Subversion, right?
So that's the other push,
is we want the open source community to be on one of them
so that it's easier for us to collaborate
and open source grows faster.
Absolutely.
I think the big thing that GitHub did
when they decided to build the system that you guys have is to take the projects and make the name, the namespace that everyone shares, your username rather than the project itself.
So there's no GitHub slash whatever the project name is.
It's username slash project name.
And that's what really enables people to be able to make it.
It turns it from being a technological problem to a social problem.
Correct?
And the other nice thing about that is that you don't have squatters, right?
Because you have your own namespace.
You don't need to, you know, try and squat a name that you want like you do on even like in the Ruby community, the way that you get gems out is with Gemcutter, right?
And how you used to do it was RubyForge.
And so if you want a gem name, if you want a project name, you have to sort of squat it while you're working on it, unless you sort of work on it in private and put it up there or put up something that isn't really quite ready yet or something.
But there's still a little bit of squatting, but it wasn't as bad as like SourceForge or something where, you know,
half of the projects are dead because just because they thought of a cool name,
they're like, you know, they're like,
backscatter, that sounds amazing, let's do a thing.
And then you put it up there, you know, this is what this will be.
And then like 90% of the time it never happens, right?
And in GitHub, maybe you create a project name,
but you don't have to really squat it.
You're not taking it from somebody else that could do something cool with it, right?
You know, that's so true.
On RubyGems, it cracks me up to see the 404 pages: page not found,
but then it says, it will be mine.
Oh, yes, it will be mine.
Sounds like me in domain name purchases.
Yeah, well, that's a whole other thing that gets me angry, too.
So the whole Git ecosystem right now,
is there anything that really gets you excited, like the development of libgit2 and other projects like that?
Yeah, well, I mean, the development of libgit2 certainly gets me excited because I'm sort of directly involved in it.
But it's something that the Git community has needed for a long time, is a linkable Git library.
Because the library, there is a libgit.a that is produced by building Git itself, but it's not reentrant. So
if you link to it and it gets to a certain point and it does this all over the place,
it was built as sort of a command line tool. So it'll just call die. And so your program,
whatever it is, will simply die if it gets to that point. And so you can't really use,
there's no stable defined API that won't change. Everything just sort of changes all over the place. It's, it's sort of a mess. Um, and the tool is great and there's a
ton of really smart people working on it, but, um, but there's no linkable library. So you can't
really build like a GUI on top of it, which is why they were slower to come. Um, and, uh, and so
libgit2, which is the linkable library that's reentrant and has a stable API and all that stuff, has been in the works for years.
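The contrast described here, between code written as a command-line tool that simply dies on error and a linkable library that a host program can recover from, can be sketched in a few lines. This is a minimal Python illustration, not Git's or libgit2's actual C code; the names `OBJECTS`, `cli_style_read`, `library_style_read`, and `host_application` are hypothetical.

```python
import sys


class ObjectNotFound(Exception):
    """An error the calling program can catch and recover from."""


# Hypothetical in-memory "object database", for illustration only.
OBJECTS = {"abc123": b"hello"}


def cli_style_read(object_id):
    """Command-line style: on failure, terminate the whole process."""
    if object_id not in OBJECTS:
        sys.exit(f"fatal: object {object_id} not found")
    return OBJECTS[object_id]


def library_style_read(object_id):
    """Library style: report the failure and let the caller decide."""
    if object_id not in OBJECTS:
        raise ObjectNotFound(object_id)
    return OBJECTS[object_id]


def host_application(object_id):
    """A GUI-like caller that has to keep running after a bad lookup."""
    try:
        return library_style_read(object_id)
    except ObjectNotFound:
        return None
```

If `host_application` called `cli_style_read` instead, a missing object would end the whole program, which is the problem described for anything linking against a tool-style library.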
I think it sort of started when I went to one of the GitTogethers we have every year after the Google Summer of Code conference.
A lot of the Git people are around, so we do a GitTogether where all the Git developers get together and talk about stuff. Um, and I was showing,
I was basically showing people all of these different implementations of Git that I had done in all these languages, like, you know, in Ruby, and I helped with some of the Python stuff, I think. And I did one in Erlang, or possibly two in Erlang. And I did one in ActionScript. I was showing all this stuff, and I was like, this is necessary because there is no linkable library, right?
Otherwise, we could be building wrappers and NIFs and stuff.
And so the project sort of started, but it never really went anywhere.
And then last year, for the Google Summer of Code, somebody put up a thing that they would be interested in working on it.
And I became the mentor sort of by default.
I wasn't really planning on doing it, but he and I worked together, and he got really, really far with it. I had a great
student, Vicent, and it became really usable. And
so GitHub decided to just keep paying him basically to keep working on it. So it's sort
of the indefinite Google Summer of Code where we took Google out and then replaced it with
GitHub.
And then, you know, he's still a student and we keep paying him to work on it.
And we've gotten a couple other people.
Jeff King from the Git community
is a really huge Git developer.
He's sort of, you know, partially working on it as well.
So now GitHub is sort of driving the development
of this libgit2, you know, library
where we can use it in stuff that we, you know, on our
backend and stuff, which would be really nice for us. We're doing a Ruby wrapper
for it as well. And we're getting contributions like a Python wrapper and a .NET
wrapper and an Objective-C wrapper and stuff. So you can use it from all these different languages,
which is, you know, sort of historically been another thing that's nice about Mercurial is
that you can write tools and stuff where it has this nice API
and you can sort of extend it. And now I think libgit2 is getting, it's almost far enough along
where, you know, we'll have wrappers where it's just as easy to write something in Python using
Git as it is using, you know, Mercurial, even though Mercurial is written in Python.
But then you could also do it in Ruby or in shell scripts or in Objective-C or
in whatever language you like, right? I mean, we have like Lua wrappers or something for it. So
that's what I really want to get to, where there's these nice APIs in almost every language on this
nice, fast, stable, reentrant, thread-safe library. So that's one of the things I'm really
interested in, is not just the
development, because I'm not great at C. I can't really do the code. In fact, the way that I was
doing this Google Summer of Code stuff was I would define what I wanted the API to look like in Ruby,
basically. He would write it all in C. I would look at the .h files, write the wrappers in Ruby,
and then write the unit tests in Ruby to see if the stuff that he wrote in C worked or not,
which is possibly not the best way to be doing that,
but it was a lot better than me actually trying to look at his C code.
And that's largely kind of how we still do stuff is I make sure the Rugged Ruby wrapper works for stuff
and that I can build the things I want to.
But then evangelizing that and saying, you know, when I go to companies or when I go to talks or something, saying, here's this cool library with all these bindings, write something cool with it because the backend is incredibly flexible, right?
It's basically just this key value store and this, you know, sort of linked list of snapshots of manifests of this file system.
And you can do whatever you want that, you know, that syncs well and easily and incrementally. And so you can do anything you can think of that,
that would use a structure like that you can do in Git. It doesn't just have to be version control.
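The backend described here, a key-value store plus a linked list of snapshots, can be sketched as a toy in Python. The object naming follows Git's scheme (SHA-1 over a `<kind> <size>` header, a NUL byte, then the content), but the manifest format and the `ToyStore` class are simplifications invented for illustration; they are not Git's real tree or commit formats.

```python
import hashlib


def object_id(kind, payload):
    """Name an object the way Git does: SHA-1 over a '<kind> <size>'
    header, a NUL byte, and then the raw payload bytes."""
    header = f"{kind} {len(payload)}".encode() + b"\0"
    return hashlib.sha1(header + payload).hexdigest()


class ToyStore:
    """Hypothetical content-addressed store with linked snapshots."""

    def __init__(self):
        self.objects = {}  # key: object id, value: (kind, payload)

    def put(self, kind, payload):
        oid = object_id(kind, payload)
        self.objects[oid] = (kind, payload)
        return oid

    def snapshot(self, files, parent=None):
        """Store each file as a blob, then store a manifest that lists
        the blob ids and points back at the parent snapshot, if any."""
        lines = [f"{self.put('blob', data)} {name}"
                 for name, data in sorted(files.items())]
        if parent is not None:
            lines.insert(0, f"parent {parent}")
        return self.put("snapshot", "\n".join(lines).encode())

    def history(self, oid):
        """Walk the linked list of snapshots from newest to oldest."""
        while oid is not None:
            yield oid
            _, payload = self.objects[oid]
            first_line = payload.decode().split("\n", 1)[0]
            if first_line.startswith("parent "):
                oid = first_line[len("parent "):]
            else:
                oid = None
```

Because every key is derived from the content it names, two stores that hold the same data agree on the same ids, which is what makes incremental syncing cheap.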
Um, and so that's what I'm really interested in for the next couple of years,
because we're going to have, you know, libgit2 and all these nice bindings. So everybody can
write all these cool scripts and stuff that do all this custom stuff. But then also EGit. You were asking about stuff I was excited about
in Git. EGit is the Eclipse Git plugin, and the Eclipse project has sort of embraced
Git as their next version control system, basically from CVS. They never really embraced
Subversion that well. And so they're all working on this EGit plugin for Eclipse that does,
you know, everything where you don't have to install Git. You can simply install this plugin.
It has a pure Java implementation of Git in it. And, and you can do everything in there. NetBeans
has a great plugin now as well for their editor. So, you know, all this stuff is coming online.
All these GUIs are starting to get written. Git Tower just went 1.0 yesterday, I think,
which is a nice professional paid-for Mac app, Git GUI,
that I've seen a lot of people using and liking a lot.
So yeah, anyways, as all that stuff happens,
as all the GUIs get developed,
and as these scripts get bindings that are fast and capable
and have this nice API to them,
I'm really excited to see what people are going to be doing with Git, right?
So my job now is not so much doing sort of the proof-of-concept stuff,
although I do do that a little bit with some things like large file support
and things like that, but mostly telling people what's out there
and then seeing what they do with it, you know?
So as GitHub has become more and more popular, a lot of your users aren't necessarily experienced
with source control systems, and I've found that a large number of the more, you know,
the beginners don't understand Git as a concept fully.
It's just a natural thing that happens.
Is there anything that you feel that the whole community really needs to take the time to learn, uh, in general that you can think would help them a lot,
like, you know, learning what a rebase actually is and things like that? Um, I'm, I'm sort of
split on that. I'm not really sure that, I mean, what I like to do is teach sort of basic concepts
of what Git is trying to do. Because a lot of people, especially from the
developer community and some designers and stuff as well, have come from this Subversion world where,
I mean, the interesting thing about version control is that most people for a long time
don't take it seriously. It's not taught in universities, really, which might be part of
the problem. You know, I was never really taught version control when I was at university,
and that was fairly recently. You know, I mean, I graduated in 2002. And I went to UCSD, and
they, they didn't really teach version control anywhere. And it certainly wasn't, you know,
it certainly wasn't sort of presented as a tool, right? They taught, they taught programming,
they taught languages, they taught assembly, they taught, you know, all this stuff, but not version
control that was not really considered and even like editors and stuff, right? But that version
control wasn't really considered a tool set that was important.
And I think that's sort of gone through a lot of the industry is a lot of people don't,
they see it as sort of a necessary evil, right? You have to have it so that you don't lose
everything. Not this is a tool that can make you better at your job, right? Or can make your life
easier as a developer. A lot of people don't see version
control that way, and it may be because it hasn't really been like that as much. Whereas I feel
Git, even though it's sort of complex, I mean, it is more complicated, you can do very complicated
things, but I think it's worth investing the time to learn it, to get a book, to read it. I mean, you
know, I have stuff that's free. I've been trying to do a lot of evangelization for Git itself,
but also, you know, writing stuff down so people can learn it as easy as possible. But I feel it's
worth, like people think that, you know, Subversion, they're just like, okay, here's the
eight commands you need, and that's it. And they don't really learn it in depth, right? And they
kind of want to approach Git the same way. And I feel like it's important to learn it, to say this is a tool set that is as important as learning an editor, right?
As learning vi or learning Emacs or learning, you know, Eclipse or, you know, everybody spends hours
and hours learning their editor. Nobody uses Notepad to do programming, right? And Subversion,
people use it like Notepad. They're just, okay, I'm at some point, I'm going to commit, and that's it.
And I feel like there should be more, people should take it seriously as a tool set that gives them power, right?
That gives them a lot of power.
Like learning Emacs as a power user or something gives you a lot of power.
Learning Git gives you a lot of power, and that should be a focus of places, and even of schools and stuff, to make sure that people see it as that tool and not as
a necessary annoyance, I guess. I mean, I can do Git in an hour, and I do that a lot, but I like
the ones where it's all day and I'm teaching a lot more stuff on how to think about version control
and how to use it as a tool that makes you better at what you're actually trying to accomplish, right?
Collaborating with people,
at looking through your history,
at figuring out what happened,
at peer reviewing code,
at doing merges and working independently
on different branches at the same time,
that sort of thing.
So, yeah, I mean, I think it's more of a mind shift
that people have to see the tool
as a different class of tool than people used to think about version control, I guess.
I'd like to switch gears for a moment and talk about another one of your projects, ShowOff.
What's the inspiration behind this and what's the state of ShowOff?
ShowOff is, I've been using it for almost all of my, it's a presentation tool.
So the idea behind ShowOff is you write your slides, because I do a lot of talks.
I do a lot of training.
I do a lot of conference talks.
And so I make a lot of slideshows, basically.
And, you know, I mean, a lot of people do.
It's, you know, one of the word processing, slideshows, Excel spreadsheets.
Like, those are sort of the big the big three that are in all the
Office formats because everybody uses them for stuff. I was using slideshows a lot. I used
Keynote for a really long time. It was not bad. It's fairly nice software, but there's a lot of
things I couldn't do with it. One of them is version control. As I'm telling people to take
version control more seriously, I try to make sure that all the stuff I'm doing is version controllable,
especially for like the training stuff. Because if you think about it, if you're doing training,
you know, every couple of weeks or something, and it's variations on a theme where you have a whole
bunch of different sections and some companies want some sections and some want others, and it changes slowly over time, or you have to customize stuff for certain
companies, it's very nice to be able to branch and merge your presentation, right? And you can't do
that with Keynote. It's just not really possible to do it and be able to
manage it properly. And the other thing about presentations, especially the way that I do them,
which is generally a couple of words on a
slide, you know, I have a sort of bare presentation style, is it's just text, right? I don't have a
ton of animations. Most of the stuff in word processors and in presentation software is 95%
of that stuff is never used by anybody, right? Even if they know it's there, just because I'll use it on one or two slides,
maybe for animation or something like that.
But generally you don't care.
It's just words.
It's, you know, examples, code, things like that.
And so what I wanted to do was
have everything in a basic text format.
So I chose Markdown.
So you write everything in Markdown
and then you run a thing and it creates HTML off of it.
And then it's an HTML-powered presentation, right?
But it's awesome because I can version control everything.
I can have all the different subsections and subdirectories and then move them from slides to slides, or I have a little showoff.json sort of index file where I can remove lines in and out as I do and don't want different sections. And it just makes it as easy to write
my presentations as it is to write code, right? And manage them and share them and have people
fork them and fix them and send me pull requests and all of that stuff works, right? If it's plain
text, anything that you can do in plain text, I like doing in plain text if possible, right?
And my presentations really didn't have that much that I couldn't do in plain text. It's just that
there was no real tools to be able to do it very easily.
There was like Slidy and S5 and stuff,
and they didn't really fill my criteria of being simple and fast to write slides.
And mine is just basic markdown, and it works great.
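The workflow described here, plain Markdown slides plus a small JSON index that says which sections to include, can be sketched roughly as follows. The `sections` key in the index and the `!SLIDE` separator line are assumptions based on how ShowOff decks are commonly laid out; the helper names are hypothetical and this is not ShowOff's own code.

```python
import json


def load_sections(index_text):
    """Return section directory names, in order, from a JSON index.
    Assumes a layout like {"sections": [{"section": "intro"}, ...]}."""
    index = json.loads(index_text)
    return [entry["section"] for entry in index["sections"]]


def split_slides(markdown_text):
    """Split one Markdown file into slides. Each slide is assumed to
    start with a line beginning with '!SLIDE' (optionally followed by
    style names such as 'code'), which is dropped from the output."""
    slides = []
    current = None
    for line in markdown_text.splitlines():
        if line.startswith("!SLIDE"):
            current = []
            slides.append(current)
        elif current is not None:
            current.append(line)
    return ["\n".join(lines).strip() for lines in slides]
```

Dropping a section from a talk is then a one-line change to the index file, which diffs, branches, and merges like any other text.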
And then the other cool thing is you can add custom JavaScript, custom CSS.
You can use tools that are used for web development to
do custom things in your slideshows, right? So I put in, I do a lot of, you know, git commands.
So I'll type a git command on the command line and then show the output. And it's very difficult
to do in Keynote, and in ShowOff, you know, I just use a jQuery plugin that does typing. So it
looks like I'm typing it at the time and then all output comes in after that. And I don't have to program any of that. I just have to put a style on the slide that says this
is a code example. And it'll just type it out for me as I'm hitting the button. So like that sort
of stuff. You can also do fun stuff. Like the other week, I've been playing with the Kinect,
you know, the Microsoft Kinect device. I got one of those, and there's open source
drivers for them on GitHub.
And so I was playing with that on the Mac.
And I made it so I could control
the presentation with
a Kinect. So I used
FireWatir.
You guys are familiar with FireWatir? It's like a
browser testing thing. So it'll
click buttons in your browser, basically,
for you. So I just hooked that up.
It's like Selenium. What's that?
It's like Selenium.
Yeah, it's Selenium, right.
And so I made a really simple wrapper that just took input from the camera,
cleaned it up, saw when I was doing left to right or right to left movements with my hand,
and then hit buttons in the browser basically that made it go back and forth in the thing.
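The gesture step described here (clean up the camera input, then decide whether the hand moved left to right or right to left) might look something like this sketch. It assumes the driver already yields a horizontal hand position per frame, normalized to the range 0.0 (left) to 1.0 (right); the function names, the smoothing window, the distance threshold, and the mapping of direction to "next" and "previous" are all illustrative choices, not the code used in the talk.

```python
def smooth(xs, window=3):
    """Moving average over the last `window` samples to reduce jitter."""
    smoothed = []
    for i in range(len(xs)):
        chunk = xs[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def detect_swipe(xs, min_distance=0.3):
    """Classify a run of horizontal hand positions.
    Returns "next" for a left-to-right swipe, "previous" for a
    right-to-left swipe, and None when the hand barely moved."""
    if len(xs) < 2:
        return None
    travel = xs[-1] - xs[0]
    if travel >= min_distance:
        return "next"
    if travel <= -min_distance:
        return "previous"
    return None
```

The returned value would then be handed to whatever drives the browser, for example a script that presses the arrow key for the matching direction.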
So that would be difficult to do in Keynote, like to try and send Keynote a signal to go to the next slide programmatically.
And it's very easy to do because I'm using a browser, right? I mean, all of this stuff that
already works for browsers, you can use with your presentation software. So that's what
ShowOff is. A lot of people are using it because it's fairly easy to do. It's easy to, you
know, get up and running and version control. And you can say stuff like showoff
heroku and it'll Heroku-ize it for you and you can push it to Heroku and then your presentation
is on Heroku. Or now you can say showoff github and it'll create a gh-pages branch. You can push
it to GitHub and we'll serve it statically off of GitHub Pages. So you can share the presentation
easily,
and you can share the source for the presentation easily,
which is really nice.
One of the first ShowOff presentations that I did,
somebody did a presentation of ShowOff in ShowOff,
and then I wanted to do one as well.
I had a little lightning talk that I wanted to do for it.
So instead of creating my own, I forked his,
changed it a little bit to fit the format of the new presentation,
and then I gave a presentation on ShowOff using ShowOff using a forked version of ShowOff.
So I was really happy with the metaness of basically that entire experience.
The other cool thing about that is that, so that's ShowOff that you're asking about,
but that's also an interest of mine is making tools for things that don't need huge, heavy, overdone GUIs for them.
That's what the entire industry uses, right?
So presentation software is an example.
Word processing is another example.
There's a lot of stuff that you need Word for, right, that does really complex stuff.
But there's a lot of stuff, most stuff, that you don't.
So I wrote a book for Apress, the Pro Git book I wrote for Apress.
And basically the process for that was you write everything in Word.
They give you a style sheet that the publishing tools know about.
So you have to stay within these like eight styles, right?
So already you have a lot of constraints on that, right?
So it's not like you can do anything in Word.
You can't really use Word as the full tool.
You can only use these 10 styles.
And so I felt that that was really dumb
because it's basically just a bunch of words,
inserted images,
and then everything's constrained within these 10 styles.
Why are we not using Markdown, right?
Or Showoff, or I mean, not Showoff,
Markdown or AsciiDoc
or some structured markup language
that is very simple and does this thing simply, right?
So I wrote my book in Markdown.
I had to export everything from Markdown to Word for the copy editing phase and then export everything from Word back into Markdown to publish the website at the end, which was one of the most horrible experiences of my life generally.
But what I'd really like to see is a tool chain for technical authors for writing books
about open source projects, for writing just normal tech books like Pro Git, like normal
technical books that all of us read a couple times a year probably.
I'd like that entire process to be much simpler.
It's because there's thousands of authors doing these books,
and there should be more.
There should be a small manual for every open source project, basically.
I think it would be really helpful to have that,
to have a Rails manual for every project.
And it's not really done because the authors have to come up with all
this stuff. They have to create a website for it. They have to create, you know, figure out how to
generate a PDF or a Mobi file and EPUB file or, um, you know, all of the different publishing
standards. But if you want to read it on your Kindle or your iPad or something, right. Um,
so that's one of the projects I'm working on right now is, is trying to do that for not just word
processing, but like writing, writing books or writing manuals or writing novels or anything that doesn't take, it's not a children's book,
right?
Anything, anything that, that has text, a couple of styles and, uh, maybe some code
examples or some math formulas and some images and that's it.
Right.
That's Git Scribe?
What's that?
Yeah.
So that's Git Scribe.
So I'm working on that right now. And I'm actually sort of in the process of possibly creating a guide to GitHub book for O'Reilly where I actually use this process to write the book. So sort of as a pilot project for it. So I'm doing that and the book simultaneously so that I can make sure that the process is good. But I mean, there's a lot of other things for writing technical books, handling translations, um, pegging versions of the book to
versions of the, of the application that you're trying to document. Um, you're taking errata,
you know, right, all that stuff. And, and every technical book publisher does not do this well,
basically. I mean, they have different variations of how they do this, but a lot of it is DocBook,
which is better in that you can,
you know, it's text and you can merge it and stuff like that, but not very easily.
Or most of it is Word documents, and that's just awful. I have to kill that.
My goal in life is to kill Word documents for technical publishing because it's not necessary.
It's so overkill and bad. You have to lock the chapter sort of one chapter at a time through like soft email locks and say the technical editor has this chapter now.
And that's just horrible, right?
There's no reason that shouldn't be mergeable and you shouldn't be able to get line by line changes.
Absolutely.
We're writing a book on Sass for Manning and Jason Williams is writing the RabbitMQ book.
Luckily, he's trailblazed a lot of this for me
where I'm writing in Markdown
as well, kind of like what
you were doing. But it's a crazy tool chain
with Haskell and some other tools in there. We just need
to find some sort of standard that not only for
open source books and e-books, but even all the publishers
because I've done
three different publishers and they all have a different workflow.
Yeah, I mean, eventually I'd like to see, I think AsciiDoc is a fairly good sort of
text standard for that, because it outputs to DocBook, and there's a lot of tool chains
that will take DocBook and give you nice looking PDFs and that sort of thing.
So that's what I'm concentrating on, is having some Rails type thing for writing books, where
you can say, git scribe init, and it gives you a layout for how to write the book.
Here's where to put images.
Here's what the AsciiDoc looks like.
Here's a cheat sheet for AsciiDoc.
You just commit there.
You push to GitHub.
We generate EPUB, MOBI, HTML, chunked HTML,
that sort of stuff for you.
And you don't have to worry about...
The authors don't have to worry about any of that.
The author's job should just be writing words and nothing else.
And there's no tool chain for that right now.
And everybody makes up their own.
So if you're an author and you go back and forth between different publishers, it's a whole new game of horrible, right?
All right, so one last question before we're running out of time here.
Who is your programming hero?
Um, everybody that works at GitHub is basically my programming hero. Um, it's actually really embarrassing because, um, you know, I, I've, I don't know, I've been, I've been working,
I've been doing computer programming for 10 or 12 years, I guess, 10 years, about 10 years,
probably. And, uh, you know, most of the places that I was at,
I kind of felt like a lot of these guys...
I had a lot to teach everybody,
especially at the beginning
when you're sort of the arrogant right-out-of-school guy.
You know, like, you guys are all idiots.
This is how we do it.
But now at GitHub, it's the first place
where I kind of feel like everybody that I work with is smarter than me.
Um, and, uh, it's, and I, I think a lot of the other guys kind of feel that as well.
So it's just a high quality place, but, um, I, I'm constantly looking at Chris's code
for examples of how to, how to do stuff.
Like if I say, I, you know, if I'm saying I'm writing some command line thing, I look
for something, you know, at, at rip or something that, that Chris has written, um, uh, or Tom or something as a command line tool and say, what are the tools
that they were using, you know, to, to do this. And, and because, you know, they're, they're all
really, really smart guys. So, um, and then everybody that we've been hiring after, I mean,
we were all sort of more generalists. Um, everybody that we've been hiring since then is,
are, are so, you know, laser focused.
I mean, Ryan Tomayko is one of the smartest guys that I, you know, I know. And so it's,
it's almost embarrassing to hire these guys because then they go through and look through
your code and you know, you just, you don't want that to happen. They're like, what were
you thinking? And I'm like, I don't remember. So, yeah, nowadays it's a lot of the newer guys that, you know, are really, really smart going through my code and telling me what I did wrong in the first place.
But, yeah, so I can basically learn from everybody at GitHub for a long time to come because they all have – they're all different in different ways, right?
Different – I mean, Ryan's, anyways, that's, yeah.
Awesome stuff.
Well, thanks for taking the time.
That was a horrible answer.
I'm so sorry.
No, it was perfect.
Thanks for taking the time today, Scott.
We surely appreciate it.
Yeah, absolutely.
This was fun. I found myself for the first time.