Switched to Mercurial

Posted by fabriziogiudici on July 25, 2009 at 8:50 PM PDT

This month I've written very little new code, focusing instead on converting my projects from Subversion to Mercurial (and also converting almost everything from Ant to Maven, and working on a much improved build workflow with Hudson, but these are matters for another post).

So I'll blog a bit about Mercurial over the next month - these posts are aimed at people with little or no knowledge of Mercurial, who might be interested in why a conversion is worthwhile and how the tool can be used in a more advanced workflow.

There was no doubt from the beginning that Mercurial is a superior tool to Subversion, since - like similar tools such as Git - it's a next-generation Distributed SCM (DSCM). This means that there's no longer the concept of a "central" repository that everybody commits to and updates from, but a sort of peer-based mechanism where everybody owns a complete repository on their local disk, which can be kept in sync with the others by means of new operations (push, pull, merge ...). Of course, among all the peers there's still a single repository that you designate as the primary, if only because it's reliably managed and backed up (e.g. by a forge such as Kenai).

But being superior doesn't necessarily mean that it's the best tool for you. The best tool is always the smallest and simplest one that is enough for your needs, not necessarily the most complex. The advantages of a DSCM are obvious for very large projects such as OpenJDK, NetBeans or OpenSolaris, with a huge number of committers; my projects are far smaller, and I have just a handful of committers. So, until a short time ago, I still believed that Subversion was the best fit. Chris Adamson seemed to be in a similar position about one year ago, and Tom Ball gave some initial answers to that.

The first reason that started floating in my mind is exactly the last one given by Tom: asynchronous mode support. This might not be the most important thing, but I love the fact that Mercurial commits locally, so you can work without being connected. You should know that I love to be disconnected, if possible, and other times I can't be connected because I'm traveling (and with August coming I'm going to move to the countryside for a month). The requirement of synchronous operations by a centralized SCM such as Subversion is really annoying if you like the "commit often" approach, like me: if you are not connected, either you can't work, or you are forced to change commit style, which is no good. If you have a mobile connection, which is often slow and intermittent, "commit often" works but you lose lots of time while waiting for commits to complete. With Mercurial, commits are immediate, so you can work as usual. Then, a few times per day, you do a "push" to synchronize the primary repository - this might take a while, but you can run it while you're doing something else (another job, eating or having a bath).

A definitely more important reason is better support for branching. Ok, this is more controversial: of course, even CVS and Subversion supported branches. The point is that managing a branch wasn't easy at all: at a minimum it required a small amount of "bureaucracy" (I mean, coordination between developers), with the result that it was mostly used at "coarse grain", or not used at all (I've personally avoided it as much as possible). Indeed, if you think carefully, branching can be useful at fine grain: every change that you make, e.g. to fix a bug, might be seen as a micro-branch - usually a short-lived one; but in more complex cases, the branch might live longer. Mercurial promises to make things easier and I've decided to believe it - thus, I'll start putting more branches in my workflow, and I'll tell you how it goes. Just as a starter, let's talk about a long-lived branch I've just created in jrawio to start working on features for 2.0 - a pack of stuff that needs extensive changes even in the APIs. I started with a command to see the current status:

[Mistral:Projects/jrawio/src] fritz% hg tip
changeset:   289:8c9b3c199622
tag:         tip
user:        fabriziogiudici 
date:        Sat Jul 25 14:37:36 2009 +0200
summary:     SimpleLogFormatter dependency: 1.0.1-SNAPSHOT -> 1.0.2

The tip of a Mercurial repo is the most recent changeset added to the repo. In this case, it's a small change I made in the Maven pom. Then I created the branch with:

[Mistral:Projects/jrawio/src] fritz% hg branch 2.0
[Mistral:Projects/jrawio/src] fritz% hg commit -m "Created branch 2.0"
[Mistral:Projects/jrawio/src] fritz% hg tip
changeset:   290:6f5c11c29a31
branch:      2.0
tag:         tip
user:        fabriziogiudici 
date:        Sat Jul 25 22:42:28 2009 +0200
summary:     Created branch 2.0

Note how I didn't have to move into another working directory: everything stays in Projects/jrawio/src. I first changed the poms to reflect the new version - I can sum up the changes with hg diff:

[Mistral:Projects/jrawio/src] fritz% hg diff
diff -r 6f5c11c29a31 jrawio/pom.xml
--- a/jrawio/pom.xml    Sat Jul 25 22:42:28 2009 +0200
+++ b/jrawio/pom.xml    Sun Jul 26 13:59:31 2009 +0200
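@@ -6,7 +6,7 @@
     <parent>
         <groupId>it.tidalwave.imageio</groupId>
         <artifactId>jrawio-all</artifactId>
-        <version>1.5.1-SNAPSHOT</version>
+        <version>2.0.0.ALPHA-SNAPSHOT</version>
     </parent>

     <artifactId>jrawio</artifactId>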
diff -r 6f5c11c29a31 pom.xml
--- a/pom.xml    Sat Jul 25 22:42:28 2009 +0200
+++ b/pom.xml    Sun Jul 26 13:59:31 2009 +0200
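@@ -11,7 +11,7 @@
     <groupId>it.tidalwave.imageio</groupId>
     <artifactId>jrawio-all</artifactId>
     <packaging>pom</packaging>
-    <version>1.5.1-SNAPSHOT</version>
+    <version>2.0.0.ALPHA-SNAPSHOT</version>
     <name>jrawio-all</name>
     <url>http://jrawio.tidalwave.it</url>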
diff -r 6f5c11c29a31 reajent/pom.xml
--- a/reajent/pom.xml    Sat Jul 25 22:42:28 2009 +0200
+++ b/reajent/pom.xml    Sun Jul 26 13:59:31 2009 +0200
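@@ -6,7 +6,7 @@
     <parent>
         <groupId>it.tidalwave.imageio</groupId>
         <artifactId>jrawio-all</artifactId>
-        <version>1.5.1-SNAPSHOT</version>
+        <version>2.0.0.ALPHA-SNAPSHOT</version>
     </parent>

     <artifactId>reajent</artifactId>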

Then I committed:

[Mistral:Projects/jrawio/src] fritz% hg commit -m "Changed version in poms."
[Mistral:Projects/jrawio/src] fritz% grep "^    2.0.0.ALPHA-SNAPSHOT

Note that everything is still on my laptop. I can confirm that with this command:

[Mistral:Projects/jrawio/src] fritz% hg outgoing
comparing with https://kenai.com/hg/jrawio~src
searching for changes
changeset:   291:a9bd843b7d86
branch:      2.0
tag:         tip
user:        fabriziogiudici 
date:        Sun Jul 26 13:59:58 2009 +0200
summary:     Changed version in poms.

changeset:   290:6f5c11c29a31
branch:      2.0
user:        fabriziogiudici 
date:        Sat Jul 25 22:42:28 2009 +0200
summary:     Created branch 2.0

To sync the primary repo I ran:

[Mistral:Projects/jrawio/src] fritz% hg push
pushing to https://kenai.com/hg/jrawio~src
searching for changes
adding changesets
adding manifests
adding file changes
added 2 changesets with 3 changes to 3 files

[Mistral:Projects/jrawio/src] fritz% hg outgoing
comparing with https://kenai.com/hg/jrawio~src
searching for changes
no changes found

Now, I can switch back and forth between the new branch and the default one (it's called default in Mercurial jargon):

[Mistral:Projects/jrawio/src] fritz% hg update -c default
3 files updated, 0 files merged, 0 files removed, 0 files unresolved
[Mistral:Projects/jrawio/src] fritz% grep "^    1.5.1-SNAPSHOT
[Mistral:Projects/jrawio/src] fritz% hg update -c 2.0
3 files updated, 0 files merged, 0 files removed, 0 files unresolved
[Mistral:Projects/jrawio/src] fritz% grep "^    2.0.0.ALPHA-SNAPSHOT

Last but not least, Mercurial gives you a very high level of control over the repository. For instance, sometimes you have to do a large refactoring, so complex that you split it into parts because you want to check every step - and make a "savepoint" out of each one by committing frequently. The pitfall of this approach is that you get multiple commits for a single logical operation; furthermore, it's possible that at the intermediate commits the code doesn't compile at all. Taking advantage of the fact that nothing goes to the primary repository before a "hg push", you can even tweak the local repository so that all the commits are merged into a bigger one that includes them all. In the history of the project, thus, your refactoring will appear as a single, atomic operation. Martin Fowler described this technique a few days ago - note how, among other things, the capability of easily creating clones of the local repository makes it possible to experiment and try things before applying changes to the real stuff.

Comments

@claudio, thanks, I'm using ScribeFire, which is a WYSIWYG editor able to publish to a number of blog servers; the problem is that "pre" sections are rendered differently once on Java.Net.

@fabriziogiudici: There is a Firefox plugin which you can use to edit HTML in a textarea. If you use Firefox, of course. https://addons.mozilla.org/en-US/firefox/addon/1449 @nbw: Mercurial supports hooks, so you can create a notification hook. There is an example in the Mercurial book: http://hgbook.red-bean.com/read/handling-repository-events-with-hooks.ht... Hope it helps!

PS BTW, it's possible to use Mercurial with a Subversion repository. That is, you can configure things so that commits are local and a push translates to a Subversion commit to the Subversion repository. Google for hgimportsvn & hgsvnpull and you'll find something. It's an approach I rejected, since two SCMs at the same time sounds like too much entropy for me, but it could be a way to start trying Mercurial with a current repository without having to convert it.

Noah, of course the email is not generated by commits since they are local; but a message is generated when you perform a "push".

Thanks for your real world observations. I haven't tried hg or Git for anything real yet. My experience has been with SCCS/RCS, CVS and, for the past 6 years, Subversion. The point about detached commits is interesting. One possible drawback I see with this strategy is that other team members do not get commit alerts when you are detached. For example, with my current Subversion project, when a developer commits all team members are alerted via e-mail. Would you lose this capability with the detached commit paradigm? Also, as a JetBrains IDEA IDE user, this paradigm seems similar to their support of multiple VCS (CVS, Subversion, Perforce, hg etc.) implementations, which they do with an abstraction on top of the particular implementation you choose. This abstraction includes the notion of a change set even if the underlying implementation (i.e. CVS) does not. It also supports the notion of local versioning, which lets you have local history and version control of your files even if your VCS implementation doesn't support it. So, for example, with my Subversion project, when I make changes to files under source control I can see a local history and back these changes out etc. prior to committing to the central repository. It also lets me see what 'incoming' changes are present in the central repository that I will get when I do an update but _before_ I take the update. In fact I can diff and merge these incoming changes before taking the update if I like. All these features are independent of the underlying VCS implementation, which is nice. Thanks again and I'm looking forward to hearing about your experiences with going from Ant to Maven. -Noah

Forgive me for the poor formatting, but I've edited this post four times, and every time I fix a part, another gets screwed up.

There's tailor which handles some importing from other repos (like cvs) to mercurial, although I have found it a bit tricky to automate. Currently I'm migrating some non os projects over to hg and have both cvs and mercurial in use in the same repository. As long as you commit into cvs (up to the point where the handover happens), then a simple hudson job then merges the changes into the mercurial side. It's a kludge but it works pretty well.