
Modern source control using Mercurial

Johnny Chadda

Apr 14, 2010

Version control has historically used a client/server model. The server is the "master", and clients commit their updates to this central repository. The client itself holds very little information: generally just the base revision, kept around for easy diff and status checks.

The two main contenders have been CVS and, later, Subversion, which has taken over most of the market. While Subversion uses the same model as CVS, it is more reliable, has atomic commits, and is generally easier to work with.

Something new has been brewing over the last couple of years, though: distributed version control.

The main idea behind distributed version control is to use a decentralized versioning model (duh). There is no central server in the usual sense. Instead, the repository is distributed among everyone who is working on the project.

This part might be a bit hard to grasp at first, especially if you are a traditional Subversion user, but with the decentralized model the whole repository is right there at your fingertips and not somewhere remote. That does not mean that there cannot be a central place for the project. On the contrary, the central server serves as just another member of the mesh of clients. The server essentially becomes a client, but usually with special permissions that allow others to fetch and modify its contents.

Mercurial

This article focuses on Mercurial, which is surprisingly easy to use, but there are of course other version control systems that work in a decentralized fashion. The most notable are Git and Bazaar, which have attracted quite a following.

One reason for using Mercurial instead of one of the others is ease of use. The basic commands follow the de facto CVS-style standard, while the more advanced commands rely on more knowledge of Mercurial itself. This means that most people who have used any kind of version control before will feel right at home without having to relearn everything from scratch.

Workflow

The usual Subversion workflow is to check out from a server, make all the changes you want, and finish off with a commit. Mirroring this in Mercurial involves some not-so-subtle differences.

The first task is to create a local copy of a repository from a server, which is similar to a checkout in Subversion. The command is called clone, and that is exactly what it does. Clone makes a complete copy of the repository on the server, with its entire history, including all available branches and tags. This might at first seem like it uses a lot of extra disk space, but it typically takes up no more than a Subversion checkout. Mercurial stores the history in compressed form, while a Subversion checkout also keeps an extra copy of the base revision of every file.

After the cloning process has finished, you start working on your changes just like before. When you are finished, you do a commit, again just like before. This time, however, you are only committing the changes to your local copy of the repository (remember, you cloned it earlier on), and nothing is sent to the server.
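To make the difference concrete, here is a minimal sketch in Python of the idea that a clone carries the whole history and that a commit only touches the local copy. The Repo class and its methods are hypothetical names made up for this illustration. This is a toy model, not how Mercurial is implemented.

```python
class Repo:
    """Toy repository: nothing more than an ordered list of commit messages."""

    def __init__(self, history=None):
        self.history = list(history or [])

    def clone(self):
        # A clone copies the full history, not just the latest revision.
        return Repo(self.history)

    def commit(self, message):
        # A commit is recorded in this repository only.
        self.history.append(message)


server = Repo(["initial import", "add README"])
local = server.clone()
local.commit("fix typo")

print(len(server.history))  # 2: the server has not seen the new commit
print(len(local.history))   # 3: the full cloned history plus the local commit
```

The server only learns about the new commit once it is explicitly sent there.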

You may continue to work offline like this, changing and committing as you go along. When you reach a point where you want to share your changes with the other developers, you push them to the server. That is in fact the name of the command: hg push sends all your commits to the server. If someone else has pushed before you, though, the push is refused with a perhaps not so clear message about creating new remote heads on the server.

This means that there would be multiple heads on the server (essentially a branch) if you were to force the push. Normally you do the opposite and pull the latest changes from the server before you push your own updates. This is where you have to take care of merging other people's code into your own. Most of the time, Mercurial will resolve the merge for you. Remember that Mercurial has the complete history of the project, so it can find the common ancestor of the two lines of work and knows about files that have been renamed or moved. This makes merging far easier.

When all merges have been completed, you may push your changes to the server and others may pull your changes to their own local repositories.
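The push, pull, and merge cycle described above can be sketched with a toy history graph in Python. Everything here (the Repo class, the single-letter changeset ids, the error text) is a hypothetical illustration. It imitates the default behaviour of refusing a push that would add heads, and it is not Mercurial's actual code or exact message.

```python
class Repo:
    """Toy repository: a graph mapping changeset id -> tuple of parent ids."""

    def __init__(self, parents=None):
        self.parents = dict(parents or {})

    def clone(self):
        return Repo(self.parents)

    def commit(self, cid, *parent_ids):
        # Two parent ids make this a merge changeset.
        self.parents[cid] = parent_ids

    def heads(self):
        # A head is a changeset that no other changeset names as a parent.
        with_children = {p for ps in self.parents.values() for p in ps}
        return sorted(c for c in self.parents if c not in with_children)

    def pull(self, other):
        # Copy over every changeset the other repository has.
        self.parents.update(other.parents)

    def push(self, other):
        # Preview the result and refuse if the target would gain extra heads.
        preview = other.clone()
        preview.pull(self)
        if len(preview.heads()) > max(1, len(other.heads())):
            raise RuntimeError("push would create new remote heads")
        other.pull(self)


server = Repo({"a": ()})
alice, bob = server.clone(), server.clone()

alice.commit("b", "a")
alice.push(server)           # accepted: the server's only head moves from a to b

bob.commit("c", "a")         # bob worked from a and has not seen b
try:
    bob.push(server)
    rejected = False
except RuntimeError:
    rejected = True          # b and c would both be heads on the server

bob.pull(server)             # fetch alice's work first
two_heads = bob.heads()      # bob's repository now has two heads
bob.commit("m", "b", "c")    # the merge joins them into one head
bob.push(server)             # accepted this time

print(rejected, two_heads, server.heads())  # True ['b', 'c'] ['m']
```

The merge changeset is what makes the second push acceptable: it has both b and c as parents, so the server ends up with a single head again.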

This has been an introduction to the conceptual differences between classic client/server version control systems and the modern distributed kind. If you want to learn more, head over to Hg Init, where you can dive into the concepts and learn everything you need to know to get started.