Distributed Version Control: A Review

This post is all about stuff that’s only interesting if you’re into programming. Read at your own risk!

Next year, as part of my degree, I’m working with a partner to create some software that’ll simulate cold, dense plasmas (the physics kind, not the blood-is-made-from kind). The thought of doing this by emailing files to each other seems utterly beyond tedious, so I’ve started investigating various types of source control, which should make it a lot easier to work together and keep in sync without rapidly getting into a horrible mess.

I quickly ruled out CVS and Subversion, because they’re big, clunky, and require a central server, which we don’t have.

This left the newer distributed source control systems, such as Git, Mercurial and Bazaar. Choosing between these three is stupendously hard, as they all have a pretty robust feature set, and they’re all used by major projects so you can be sure they’ll be supported. Git is used by the Linux kernel and Ruby, Mercurial by Mozilla and Python, and Bazaar by Ubuntu.

Honestly, I haven’t looked much at Bazaar. I can’t even tell you why, but it does rather seem to be the runt of the litter. Nobody much talks about it, which immediately raises concerns.

So on to Git and Mercurial, which I really have had to pick between. One comparison (courtesy of Matt) is that Mercurial is Bond and Git is MacGyver.

I can see where the guy is coming from. Git was born from the fires of the Linux kernel project, and that legacy is enduringly evident. It seems mostly to be chunks of C held together with string, tape and shell script, but somehow still manages to be fast as all hell. If you’re comfortable delving into the bowels of a codebase head-first, man pages open in one terminal window, furiously typing out shell scripts of ridiculous complexity in another, you’ll be right at home with Git. It’s like using Linux itself; powerful obtuseness is more or less seen as a virtue. Sure, it’s hard to figure out, with a whole ton of concepts you need to absorb, but you’ll be a goddamn ninja once you have. A very notable feature is that when you move chunks of code from file to file, Git follows the move and keeps the version history intact automatically. A code move (or file rename) is almost a no-op.
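As far as I understand it, Git doesn’t actually record a rename anywhere; it works it out after the fact by noticing that a file which vanished looks a lot like a file which appeared. Here’s a rough Python sketch of that idea. It is emphatically not Git’s real algorithm: the function name `detect_renames`, the use of `difflib`, and the 0.5 threshold are all my own assumptions for illustration.

```python
from difflib import SequenceMatcher


def detect_renames(old_files, new_files, threshold=0.5):
    """Guess which deleted files reappeared under a new name.

    old_files and new_files map path -> file content (str).
    Returns a list of (old_path, new_path, similarity) tuples.
    The threshold is an illustrative assumption, not a Git setting.
    """
    # Paths that disappeared, and paths that are brand new.
    deleted = [path for path in old_files if path not in new_files]
    added = [path for path in new_files if path not in old_files]

    renames = []
    for old_path in deleted:
        best_path, best_score = None, 0.0
        for new_path in added:
            # ratio() is 1.0 for identical text and near 0.0 for unrelated text.
            score = SequenceMatcher(
                None, old_files[old_path], new_files[new_path]
            ).ratio()
            if score > best_score:
                best_path, best_score = new_path, score
        if best_path is not None and best_score >= threshold:
            renames.append((old_path, best_path, best_score))
            # Each new file can be the target of at most one rename.
            added.remove(best_path)
    return renames
```

Feed it two snapshots of a project (before and after) and it pairs up files whose content survived a move, which is roughly why a rename costs you nothing: there is nothing extra to tell the tool.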

Because it’s tacked together with shell scripts, support on Windows is pretty much pants. There’s some promising work which provides a Bash prompt as well as GUI tools, but it’s still not great.

Mercurial, on the other hand, is written in Python, so it’s generally a lot cleaner. If you want to extend it, you write the extension in Python rather than a hacky shell script, which seems like a plus to me. The core functionality (minus the rename and move tracking) is completely comparable to Git. One excellent feature is that Mercurial (Hg for short) supports pushing, pulling and cloning code over HTTP or HTTPS, so no tedious mucking about with SSH is required. There’s even a very useful command (hg serve) which starts a tiny web server, making it easy to share code over a local trusted network on an impromptu basis.
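To give a feel for the impromptu sharing, here’s a small Python sketch that just builds the command lines involved. The helper names and the host name in the usage note are made up for illustration; the hg subcommands and options themselves (`serve --port`, `clone`, `pull --update`) are standard ones, and 8000 is the port hg serve uses unless told otherwise.

```python
import subprocess


def hg_serve_command(port=8000):
    """Command to run inside a repository to share it over HTTP."""
    return ["hg", "serve", "--port", str(port)]


def hg_clone_command(host, dest, port=8000):
    """Command a partner runs to take a first copy of the served repository."""
    return ["hg", "clone", f"http://{host}:{port}/", dest]


def hg_pull_command(host, port=8000):
    """Command a partner runs later, inside their clone, to fetch new changes."""
    return ["hg", "pull", "--update", f"http://{host}:{port}/"]


def run(command, cwd=None):
    """Actually execute one of the commands above (needs hg installed)."""
    subprocess.run(command, cwd=cwd, check=True)
```

So I would call `run(hg_serve_command(), cwd="plasma-sim")` on my machine, and my partner would call `run(hg_clone_command("my-laptop.local", "plasma-sim"))` on theirs, where both the directory and host name are hypothetical.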

Support on Windows is excellent; you run hg.exe straight from a normal Windows cmd.exe prompt, with no need for Bash as there is with Git. There is also a very complete set of GUI Explorer extensions in TortoiseHg; it feels very mature. I believe there’s also a Visual Studio plugin (which there might be for Git, too).

There is a project (Hg-Git) that forms a bridge between the two, and it’s getting more mature all the time, so your final decision isn’t entirely set in stone.

I realise I’m glossing over a lot, because this is a fairly complicated decision and it probably pays to do a lot of research. There are things I haven’t mentioned. For example, making named branches is cheap and easy in Git but a pain in Mercurial, which recommends you make clones of the repository instead. On the other hand, Mercurial is a lot less confusing when pulling in code from someone else: their changes just show up as an implicit local branch, whereas Git does something odd with remote named branches appearing in your repo.
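For the curious, the two branching styles look roughly like this when written out as command lines. This is only a sketch: the function names, the directory names and the `master` default are my own assumptions, though the git and hg commands themselves are ordinary ones.

```python
def git_branch_workflow(branch, main_branch="master"):
    """Branch, work, and merge back, all inside one Git repository."""
    return [
        ["git", "checkout", "-b", branch],   # create the branch and switch to it
        # ... commit work on the branch here ...
        ["git", "checkout", main_branch],    # switch back to the main line
        ["git", "merge", branch],            # fold the branch's commits in
    ]


def hg_clone_workflow(main_repo, feature_repo):
    """Mercurial's clone-as-branch style: the 'branch' is a second directory."""
    return [
        ["hg", "clone", main_repo, feature_repo],  # the clone acts as the branch
        # ... commit work inside feature_repo here ...
        # Pull the finished work back into the main repository and update to it.
        # If main_repo also gained commits meanwhile, an "hg merge" is needed too.
        ["hg", "-R", main_repo, "pull", "--update", feature_repo],
    ]
```

The Git version never leaves the one directory, while the Mercurial version ends up with two side by side, which is the trade-off I was getting at.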

All in all, this would be my recommendation:

  • If using Windows is a requirement, use Mercurial.
  • If you want huge amounts of power and flexibility, use Git.
  • If you just want to get on with writing your code, use Mercurial.
  • If you want to collaborate in a small team, use Mercurial.

Otherwise, I’m going to be unspecific, because I really don’t know which one scales up to huge projects better. Apparently, Mercurial has a great extension for patch queues, which gives it some of the patching flexibility that Git has without burdening the core with excessive complexity. I think that may have been wise.

One thought on “Distributed Version Control: A Review”

  1. We used Subversion for our group project last year and it was pretty faultless. Have to say I've never even heard of Bazaar, so I don't think I'd go near that. I'm glad you made a post about this, might help me too this year!
