Instantly turning your Hudson cluster into a Hadoop cluster
At work, I take care of a 30-40 node Hudson cluster for our group. That is probably on the larger side, but I know people out there set up Hudson clusters of various sizes.
The Hudson cluster is used for builds, obviously, but I've been thinking it would be nice if it could become multi-purpose. There is a lot we could do better if we had plenty of computing resources readily accessible, and setting up a separate cluster for each framework is tedious.
So over the past two weekends, I've worked on a hobby project that lets you turn your Hudson cluster into a Hadoop cluster.
The idea is simple: Hudson knows the shape of its cluster, so why don't we let it start Hadoop JVMs on all the nodes and hook them all together? Hudson could also install the Hadoop binaries on all the nodes as necessary, making this a truly turn-key solution.
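To make the "hook them all together" part a bit more concrete, here is a rough sketch in Java. It is not the plugin's actual code, just one plausible layout under my own assumptions: the Hudson master hosts the NameNode and JobTracker, every slave runs a DataNode and a TaskTracker, and all of them share two settings that point back at the master. The class name, method names, host names, and port numbers are made up for illustration; `fs.default.name` and `mapred.job.tracker` are the classic Hadoop configuration property names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: plans which Hadoop daemons each Hudson node would run.
class HadoopClusterPlan {
    // Illustrative port choices, not values taken from the plugin.
    static final int NAME_NODE_PORT = 9000;
    static final int JOB_TRACKER_PORT = 9001;

    /** Maps each host name to the list of Hadoop daemons it would run. */
    static Map<String, List<String>> assignRoles(String master, List<String> slaves) {
        Map<String, List<String>> roles = new LinkedHashMap<>();
        // Assumption: the Hudson master doubles as the Hadoop master.
        roles.put(master, new ArrayList<>(Arrays.asList("NameNode", "JobTracker")));
        for (String slave : slaves) {
            // Every slave stores HDFS blocks and runs map/reduce tasks.
            roles.computeIfAbsent(slave, k -> new ArrayList<>())
                 .addAll(Arrays.asList("DataNode", "TaskTracker"));
        }
        return roles;
    }

    /** Settings every node needs so that its daemons can find the master. */
    static Map<String, String> sharedSettings(String master) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("fs.default.name", "hdfs://" + master + ":" + NAME_NODE_PORT);
        conf.put("mapred.job.tracker", master + ":" + JOB_TRACKER_PORT);
        return conf;
    }

    public static void main(String[] args) {
        // Made-up host names standing in for what Hudson already knows.
        String master = "hudson-master";
        List<String> slaves = Arrays.asList("slave1", "slave2", "slave3");

        assignRoles(master, slaves)
            .forEach((host, daemons) -> System.out.println(host + " runs " + daemons));
        sharedSettings(master)
            .forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

The point is that Hudson already holds the list of nodes, so everything else in the plan can be derived from it without anyone editing configuration files by hand.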
This simplifies your Hadoop installation drastically. All you need to do is (1) go to the Hudson plugin update center, (2) install the Hadoop plugin, and (3) restart Hudson. When Hudson comes back, you have a Hadoop cluster.
My initial motivation was to use Hadoop for analyzing the access logs of java.net, but eventually, I think I could use this for Hudson itself, too. Imagine persisting lots and lots of test results and running data analysis on them at massive scale. How about using Hadoop to store old artifacts, so that you can use the combined storage of the cluster? Or how about an extension of JUnit that distributes tests across a Hadoop cluster?