Shrink your HG repository
When I used Subversion and Ant for my projects, I had the habit of committing the required libraries together with the sources. I think that solution still makes sense with those two tools, as you can check out a certain version of a project and have everything you need to compile it on the local disk. Things change with Mercurial, since you clone the whole history of the project, that is, every version of every library that has been used in the past, and a Mercurial repo can quickly grow huge in this circumstance. For instance, when I converted the blueMarine repos from Subversion to Mercurial, still using Ant, I ended up with repositories several hundred megabytes in size. This is an annoyance for people who want to quickly clone the repo and try compiling the application. With Maven, of course, the repository is smaller because it doesn't contain the libraries; they are downloaded as artifacts, but only the specific versions needed for the current version of the project, not for the whole history.
One of the extra advantages of Mercurial (and, generally speaking, I think the concept applies to distributed SCMs) is that you get administration utilities for the repository as first-class tools. For instance, Mercurial has a command named 'convert' (shipped as a bundled extension that has to be enabled) that lets you convert an existing local repository from Subversion, Git, Bazaar and others... and from Mercurial itself.
What's the point in converting from Mercurial to Mercurial? It's that you can process files along the way, for instance dropping or renaming them. In my case, the libraries were committed to the lib directory (and let's also consider a tool directory containing all the extra Ant build tools that I needed). Yesterday I created a configuration file named filemap:
and then performed the command:
hg convert blueMarine-core/ blueMarine-core-cleanedup --filemap filemap
This created a new repository from which all the files in the specified directories have been stripped; it shrank the repo from several hundred megabytes to just fifty. Then I went to Kenai, deleted and recreated the existing repo, and performed a fresh push, which replaced the original contents.
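The contents of the filemap are not reproduced in this post. As a minimal sketch, assuming the file held nothing more than one exclusion for each of the two directories named above (lib and tool), it could be created like this:

```shell
# Hypothetical filemap: the original file is not shown in the post, so
# these two exclusions are an assumption based on the directories it names.
# Each "exclude" line drops that path from every converted changeset.
cat > filemap <<'EOF'
exclude lib
exclude tool
EOF

# Print the file to check what was written.
cat filemap
```

A filemap can also contain include and rename directives, which is how files get renamed rather than dropped during a conversion.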
Of course, there are two prices to pay:
- All the changeset ids have changed. In Mercurial an id is a hash computed from the changeset's contents and its ancestry, so removing files from the history changes every id. Ids that were written down anywhere before yesterday no longer match anything in the new repository. Tags still work, though; and in any case, if you have to rebuild an untagged past version, you can use the commit date.
- It is now impossible to build the old versions of the project that used Ant from the new repository, since the required libraries have been stripped. For this purpose, I created a binary bundle of the old repository with the command
hg bundle --all blueMarine-core-repo-archive-20100221_1450.hg
and uploaded the file to the Kenai download area. Anyone who wants to reconstruct a version older than yesterday just needs to run the following inside a repository (an empty one created with hg init will do):
hg unbundle http://<url-of-the-archived-repo>
hg update -C <changesetId>
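For an untagged version, the date-based route mentioned above can be sketched as a small shell function. This is only an illustration: the function name restore_by_date and its three arguments are hypothetical, and it assumes the bundle has already been downloaded or is reachable at the given location.

```shell
# Sketch: restore an old version from the archived bundle by commit date.
# Arguments (all supplied by the caller, all hypothetical names):
#   $1  path or URL of the archived bundle
#   $2  a date understood by Mercurial, e.g. 2010-01-15
#   $3  directory in which to create the scratch repository
restore_by_date() {
    # unbundle needs an existing repository, so start from an empty one
    hg init "$3" &&
    # import every changeset stored in the bundle
    hg -R "$3" unbundle "$1" &&
    # check out the tipmost revision matching the date, discarding local changes
    hg -R "$3" update -C --date "$2"
}
```

It would be called along the lines of restore_by_date blueMarine-core-repo-archive-20100221_1450.hg 2010-01-15 blueMarine-core-old, where the date and the target directory are made-up values.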
In the end, it's a reasonable trade-off, as it's still possible to reconstruct arbitrarily old versions, and you get a much smaller footprint.