Java. Quality. Metrics (part 4)
How are you doing, everyone? I hope you're OK.
Lately, I was busy organizing and attending an interesting event called
Java Days at Saint-Petersburg University. For those who are unaware - it's
not about Saint-Petersburg, FL, USA - you all know that people in
Florida can't count at all, don't you :-)? It's about Saint-Petersburg,
Russia (http://www.spb.ru) - the former capital of the country and the
most beautiful city I have ever seen in my life. Check out
this, or this one, or that picture some time - you'll believe me. BTW, they were created by one very lovely and very creative young lady - my daughter Dasha (I know she would hate me for that reference :-).
Anyway, let me get back on track. As for the event - I liked it. I had a
chance to meet all these young and very bright folks from their
Department of Computer Science and Software Engineering; it was an
opportunity to be challenged by their questions and all that jazz; I
also had a chance to reunite with my first professor - Mr. Terekhov,
who's leading his own research Institute of Information Technologies
(http://www.iti.spbu.ru/eng/default.asp). It's so cool to see how the IT
sector in Russia is moving forward today. Big thanks for that to all the
folks who are teaching new scientists and programmers up there.
Ah, well - that would be a topic for another flame some other day. I
have to write about something you all come here for - software quality.
So, we can use a few methods to ensure our product's quality and
guarantee that we're at least digging in the right direction.
Another one, which I was about to mention, is code coverage. Being
quite a simple thing, it can give you a rough understanding of where
you are at this moment in product and quality development. To make a
long story short: you can somehow instrument your code to report on
every execution of its methods (you might want finer granularity, but
let's not talk about this now). Instrumentation of native code might be
tricky sometimes, and it requires building new binaries for code
coverage measurements. Instrumenting Java code isn't that hard and can
be accomplished "on the fly". Depending on the framework, the
instrumented code's output might be as simple as follows:
Method getListHeadPtr() is began
Method getListHeadPtr() is completed
or as weird as this:
CLASS: foo/bar/Shift$Stuff 
#kind start end count
1 62464 0 2
3 62464 0 2
1 62464 0 2
3 62464 0 2
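To make the "report on any execution of a method" idea a bit more concrete, here is a minimal sketch. It is not how any particular coverage framework works: real tools usually rewrite bytecode, while this one merely wraps an object in a java.lang.reflect.Proxy. The names ListHolder, SimpleListHolder and TracingDemo are made up for illustration, and the trace wording is my own.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface and implementation, used only for illustration.
interface ListHolder {
    String getListHeadPtr();
}

class SimpleListHolder implements ListHolder {
    public String getListHeadPtr() {
        return "head";
    }
}

class TracingDemo {
    // Collected trace lines; a real framework would write them to a file instead.
    static final List<String> TRACE = new ArrayList<>();

    // Wraps target so that every call made through the interface is reported.
    @SuppressWarnings("unchecked")
    static <T> T trace(T target, Class<T> iface) {
        InvocationHandler handler = (proxy, method, args) -> {
            TRACE.add("Method " + method.getName() + "() started");
            try {
                return method.invoke(target, args);
            } finally {
                TRACE.add("Method " + method.getName() + "() completed");
            }
        };
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler);
    }

    public static void main(String[] args) {
        ListHolder holder = trace(new SimpleListHolder(), ListHolder.class);
        holder.getListHeadPtr();
        TRACE.forEach(System.out::println);
    }
}
```

The proxy only sees calls that go through the interface, so this is far coarser than real instrumentation, but it shows where the "started"/"completed" style of report comes from.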
You can then process the instrumented code's output in some manner and
create a variety of reports out of this data. The only reason to have
them is to estimate how well the source code is covered by your tests.
In other words, you can roughly estimate the amount of exercise you're
giving to your code. There's no special magic behind it. It's one of the
primitive quantitative measures of quality control. Well, it's not as
primitive as counting kilo-lines of code or similar, but
nevertheless. However, different organizations have different
standards on this matter. Perhaps, say, 70% would sound reasonable from
some "common sense" point of view, wouldn't it? I'd say yes, it is a
really sound figure. Would you expect to see 90 or even 100%? It
sounds really cool, right? Well, it depends. Just one interesting
observation: after about 65-75% of code coverage, the difficulty of
growing it any further starts rising almost asymptotically. I mean to
say that one can't really get 100% of their source code covered by
tests. Or at least not within a reasonable time/money frame. In other
words, you can do it with an ROI close to zero, or perhaps even negative.
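For what it's worth, the arithmetic behind a method-level coverage figure is nothing more than a ratio. Below is a rough sketch, assuming you already have the set of all methods and the set of methods reported as executed; the class MethodCoverage and the method names in main are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class MethodCoverage {
    // Percentage of known methods that were reported as executed at least once.
    static double percentCovered(Set<String> allMethods, Set<String> executed) {
        if (allMethods.isEmpty()) {
            return 0.0; // nothing to cover; treated as 0% by convention in this sketch
        }
        Set<String> hit = new HashSet<>(allMethods);
        hit.retainAll(executed); // ignore reports for methods we do not know about
        return 100.0 * hit.size() / allMethods.size();
    }

    public static void main(String[] args) {
        Set<String> all = new HashSet<>(Arrays.asList(
                "getListHeadPtr", "append", "remove", "size"));
        Set<String> executed = new HashSet<>(Arrays.asList(
                "getListHeadPtr", "append", "size"));
        // Three of the four methods were exercised, i.e. 75%.
        System.out.printf("Method coverage: %.1f%%%n", percentCovered(all, executed));
    }
}
```

Note that the number says nothing about how well a method was exercised, only that it was entered at least once.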
And using the principle of reasonable quality
(http://weblogs.java.net/blog/cos/archive/2005/10/java_quality_me_2.html),
you might not want to aim for marks that high. A level of
80% or even 70% might be quite sufficient. But how would you know?
Don't ask me, because the answer is somewhere beyond that technology's boundaries. I will talk about this next time.
Yet another approach to making improvements in this field is static
analysis. This once-again popular thing has lately been mixed up with pattern
analysis (like many modern IDEs or the standalone FindBugs application do). But I'm talking about the old-fashioned kind - dealing with
control and data flow analysis. Here's a list of such tools:
http://www.testingfaqs.org/t-static.html. However, I'm not suggesting that you use them or anything. I just googled this stuff.
So, using this complicated yet powerful technique one can do many interesting things. E.g. you can find dead spots in your code. Good one, right? It helps to keep the code free of methods made "...just for future use..." and never used at all. Thus, it increases
the purity of the code and leaves fewer spots for bugs to hide.
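Finding dead spots boils down to a reachability question over a call graph. Here's a toy sketch, assuming the call graph has already been extracted somehow and is handed over as a plain map from caller to callees; DeadMethodFinder and all the method names are invented for the example.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class DeadMethodFinder {
    // Returns the methods that cannot be reached from any of the entry points.
    static Set<String> findDead(Map<String, List<String>> callGraph,
                                List<String> entryPoints) {
        Set<String> reachable = new TreeSet<>();
        Deque<String> work = new ArrayDeque<>(entryPoints);
        while (!work.isEmpty()) {
            String method = work.pop();
            if (reachable.add(method)) { // first visit: queue its callees
                work.addAll(callGraph.getOrDefault(method, Collections.emptyList()));
            }
        }
        Set<String> dead = new TreeSet<>(callGraph.keySet());
        dead.removeAll(reachable);
        return dead;
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = new HashMap<>();
        graph.put("main", Arrays.asList("parse", "report"));
        graph.put("parse", Arrays.asList("readToken"));
        graph.put("readToken", Collections.emptyList());
        graph.put("report", Collections.emptyList());
        graph.put("exportToXml", Arrays.asList("report")); // made "just for future use"
        System.out.println(findDead(graph, Arrays.asList("main"))); // prints [exportToXml]
    }
}
```

A real analyzer also has to worry about reflection, callbacks and public API methods called from outside, so treat the result as a list of suspects rather than a verdict.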
Or you can analyze which parts of your code are the most heavily used, because a lot of other places in the code are using/calling them. Naturally, you might
want to pay more attention to these spots merely to ensure they are bug-free.
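Spotting those heavily used parts can be sketched as a simple count of distinct callers per method (often called fan-in). Again, this assumes a call graph already given as a map from caller to callees; FanInCounter and the method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class FanInCounter {
    // For every callee, counts how many distinct methods call it.
    static Map<String, Integer> fanIn(Map<String, List<String>> callGraph) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : callGraph.entrySet()) {
            // A caller counts once per callee, however many call sites it has.
            for (String callee : new HashSet<>(entry.getValue())) {
                counts.merge(callee, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = new HashMap<>();
        graph.put("main", Arrays.asList("parse", "log"));
        graph.put("parse", Arrays.asList("readToken", "log"));
        graph.put("report", Arrays.asList("log", "log"));
        // "log" is called from three different methods, so it deserves a closer look.
        System.out.println(fanIn(graph));
    }
}
```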
And obviously you can't do much without some special tools. Just a generic control flow graph generator won't help you - what would you do with a graph of 40+ thousand nodes? Print it all out and wrap it around your office? Neat, isn't it? But you might want to do something a little bit more useful. Like analyzing the parity of memory allocation/deallocation operations, opening/closing of streams, etc. Unfortunately, I wasn't able to find any free tools to suit such needs. And I don't want to do any commercials here. So, be my guest and find one if needed (please also drop me a note, will you?). I believe that notable software companies have to invest in their internal tools no matter what. It's simply not possible to buy some stuff off the shelf.
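Just to illustrate what an open/close parity check over a control flow graph might look like, here is a toy sketch. It assumes a tiny acyclic graph given as a map from node to successors, plus a map saying which nodes open (+1) or close (-1) a resource; ParityChecker and the node names are made up, and a real tool would have to deal with loops and far bigger graphs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ParityChecker {
    // Returns every entry-to-exit path on which opens and closes do not balance.
    // Assumes the graph is acyclic; a node with no successors is an exit.
    static List<List<String>> unbalancedPaths(Map<String, List<String>> cfg,
                                              Map<String, Integer> delta,
                                              String entry) {
        List<List<String>> result = new ArrayList<>();
        walk(cfg, delta, entry, 0, new ArrayList<>(), result);
        return result;
    }

    private static void walk(Map<String, List<String>> cfg, Map<String, Integer> delta,
                             String node, int balance, List<String> path,
                             List<List<String>> result) {
        path.add(node);
        balance += delta.getOrDefault(node, 0);
        List<String> next = cfg.getOrDefault(node, Collections.emptyList());
        if (next.isEmpty()) {
            if (balance != 0) {
                result.add(new ArrayList<>(path)); // something left open or closed twice
            }
        } else {
            for (String successor : next) {
                walk(cfg, delta, successor, balance, path, result);
            }
        }
        path.remove(path.size() - 1); // backtrack
    }

    public static void main(String[] args) {
        Map<String, List<String>> cfg = new HashMap<>();
        cfg.put("entry", Arrays.asList("open"));
        cfg.put("open", Arrays.asList("check"));
        cfg.put("check", Arrays.asList("close", "fail"));
        cfg.put("close", Arrays.asList("exit"));
        cfg.put("fail", Arrays.asList("exit")); // error branch forgets to close
        Map<String, Integer> delta = new HashMap<>();
        delta.put("open", 1);
        delta.put("close", -1);
        System.out.println(unbalancedPaths(cfg, delta, "entry"));
    }
}
```

In this made-up graph, only the path that goes through "fail" is reported, because the stream opened earlier is never closed on it.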
Kinda hint again: I'll continue on this topic next time. Stay tuned...