Ohloh's open source project statistics - WTF?

Posted by timboudreau on May 26, 2009 at 8:56 PM PDT

Ohloh is a neat service. It does some basic statistical analysis of open source projects, and tries to come up with useful information. But it sure comes up with some wacky statistics.

Take, for example, the Wizard project. Now, this is something that I initially whipped up in two afternoons in September '05. It's a small library - it's attracted six contributors since then, who've all made valuable contributions. I tend to do work on it sporadically, for a couple of days every 4-6 months. At most I've put 2 weeks' work into it. With contributors, the pattern is that it's people who are using the library and want a feature that isn't there. They implement it, and get on with doing what they wanted to use the library for, which is as it should be. If somebody wanted to take on an ongoing role, that would be great, but I'm not holding my breath.

According to Ohloh, this weekend project represents (!)

  • 8 person-years of work
  • More than 1 million dollars at something resembling my current salary (deliberate obfuscation) - but even more interestingly, only half that if you leave out the documentation :-)

Now, I look at my profile, and my work on NetBeans - I'll have been working on NetBeans for 10 years of my life in another month (eek!).
What does Ohloh say about that? I have

  • Made 3 (!!) commits since 1999
  • My primary language is HTML (!!!)

God help me if I were to apply for a job and someone actually believed that all I managed to do in ten years was write 3 html pages!

Separately, I take issue with the notion that "Decreasing year-over-year development activity" deserves a warning icon - for example, it is displayed on Ohloh's page for my Color Chooser project.

I mean, think about it: Especially with small projects (IMO in a perfect world, most library projects should be small and tackle one thing well), a positive goal is for the project to reach a steady state where it does not need a great deal of, or ideally any, maintenance. If software rotted like fruit, there wouldn't be any progress in the software industry. There is such a thing as a library that can be finished!

What was that quote about lies, damned lies and statistics?

David: I figured the switch from CVS to Hg had something to do with it - it's not surprising. But a site that posts data about people's behavior and work history needs to be a little more responsible: even if it does not claim to be authoritative, it is obviously going to be read and considered, and could have a profound effect on the careers of the people whose work is represented there. I would have more trust in a site that says "We try really hard to tell the truth here. For this project, our data is probably wrong. We just don't have enough info now. Come back in a few days and we'll have it fixed" than in a site that posts verifiably dead-wrong info as if it were authoritative. Aside from the fact that in the U.S., it's a lawsuit waiting to happen, it's just a really short-sighted way to do business. I do hope they get their Hg s**t together, because I don't forever want to be the guy who spent ten years writing 3 html pages, and the idea that there is a service that helps you evaluate potential employees that says that is appalling. Too many job candidates are evaluated by people who have no idea what the job actually is (like looking for a Java developer with 10 years' experience in 1997 when Java had existed for +/- 3 years). Without truth-in-advertising, this sort of thing just makes our industry more unproductive.

ROFL, my primary language is XML !!!

The statistics for NetBeans have been screwed up on Ohloh since I removed the CVS listings and put the hg repos there. They haven't been able to parse the hg log yet for that project. Please note that they list a negative number of lines of code for NetBeans.