


Konstantin I. Boudnik

Konstantin I. Boudnik's Blog

Java. Quality. Metrics (part 4)

Posted by cos on November 16, 2005 at 01:26 PM | Comments (4)

How is everyone doing? I hope you're OK.

Lately, I was organizing and attending an interesting event called Java Days at Saint-Petersburg University. For those who are unaware - it's not about Saint-Petersburg, FL, USA - you all know that people in Florida can't count at all, don't you :-)? It's about Saint-Petersburg, Russia - the former capital of the country and the most beautiful city I have ever seen in my life. Check out this or this, or this one, or that picture some time - you'll believe me. BTW, they were created by one very lovely and very creative young lady - my daughter Dasha (I know she would hate me for that reference :-).

Anyway, let me get back on track. As for this event - I liked it. I had a chance to meet all these young and very bright folks from their Department of Computer Science and Software Engineering; it was an opportunity to be challenged by their questions and all that jazz; I had a chance to reunite with my first professor - Mr. Terekhov, who's leading his own research Institute of Information Technologies. It's so cool to see how the IT sector in Russia is moving forward today. Big thanks for that to all these folks who are teaching new scientists and programmers up there.
Ah, well - that would be a topic for another flame some other day. I have to write about what you all come here for - software quality.

So, we can use a few methods to ensure our product quality and guarantee that we're digging at least in the right direction.

Another one I was about to mention is code coverage. Being quite a simple thing, it can give you a rough understanding of where you are at this moment of product and quality development. To make the story shorter: you can instrument your code to report on every execution of the code's methods (you might want finer granularity, but let's not talk about this now). Instrumentation of native code can be tricky sometimes, and it requires building new binaries for code coverage measurements. Instrumenting Java code isn't that hard and can be accomplished "on the fly". Depending on the framework, the instrumented code's output might be as simple as follows:
Method getListHeadPtr() is began
Method getListHeadPtr() is completed
or as weird as this:
CLASS: foo/bar/Shift$Stuff []
SRCFILE: Shiftstuff.java
TIMESTAMP: 1131040951709
DATA: C
#kind	start	end		count
METHOD: ()V [private]
1	62464	0		2
3	62464	0		2
METHOD: (Lfoo/bar/Shiftstuff$1;)V []
1	62464	0		2
3	62464	0		2
Then you can process this output in some manner and create a variety of reports out of this data. The only reason to have them is to estimate how well the source code is covered by your tests. In other words, you can roughly estimate the amount of exercise you're giving your code. There's no special magic behind it. It's one of the primitive quantitative measures of quality control. Well, it's not as primitive as the count of kilo-lines of code or similar, but nevertheless. However, different organizations have different standards on this matter. Perhaps, say, 70% would sound reasonable from some "common sense" point of view, wouldn't it? I'd say yes, it is a really sound figure. Would you expect to see 90 or even 100%? It sounds really cool, right? Well, it depends. Just one interesting observation: after about 65-75% of code coverage, the effort of raising it any further starts growing almost asymptotically. I mean to say that one can't really get 100% of their source code covered by tests. Or at least not within a reasonable time/money frame. In other words, you can do it, but with an ROI close to zero or perhaps even negative.
And following the principle of reasonable quality, you might not want to aim for marks that high. A level of 80% or even 70% might be more than sufficient. But how would you know? Don't ask me, because the answer lies somewhere beyond the boundaries of this technology. I will talk about this next time.
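To make the method-level idea a bit more tangible, here is a minimal sketch of "on the fly" instrumentation using a plain JDK dynamic proxy. It is not how any particular coverage framework works (real tools usually rewrite bytecode), and the names ListOps, SimpleList and CoverageRecorder are made up for illustration. It only covers calls that go through an interface, and it counts a method as covered after a single call.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical interface and implementation, used only as something to measure.
interface ListOps {
    Object getListHeadPtr();
    void append(Object item);
    int size();
}

final class SimpleList implements ListOps {
    private final List<Object> items = new ArrayList<>();
    public Object getListHeadPtr() { return items.isEmpty() ? null : items.get(0); }
    public void append(Object item) { items.add(item); }
    public int size() { return items.size(); }
}

// Wraps a target object, logs every call and remembers which methods were hit.
final class CoverageRecorder<T> implements InvocationHandler {
    private final Class<T> iface;
    private final T target;
    private final Set<Method> hit = new HashSet<>();

    CoverageRecorder(Class<T> iface, T target) {
        this.iface = iface;
        this.target = target;
    }

    // Creates the instrumented view of the target at run time.
    T proxy() {
        return iface.cast(Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, this));
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        // Ignore toString/hashCode/equals, which are not part of the interface.
        if (method.getDeclaringClass() != Object.class) {
            hit.add(method);
        }
        System.out.println("Method " + method.getName() + "() started");
        try {
            return method.invoke(target, args);
        } catch (InvocationTargetException e) {
            throw e.getCause(); // rethrow what the real method threw
        } finally {
            System.out.println("Method " + method.getName() + "() completed");
        }
    }

    // Share of the interface's methods that were called at least once.
    double methodCoveragePercent() {
        int total = iface.getMethods().length;
        return total == 0 ? 100.0 : 100.0 * hit.size() / total;
    }
}

final class CoverageDemo {
    public static void main(String[] args) {
        CoverageRecorder<ListOps> recorder =
                new CoverageRecorder<>(ListOps.class, new SimpleList());
        ListOps list = recorder.proxy();
        list.append("a");
        list.getListHeadPtr();
        // size() is never called, so two of three methods are covered.
        System.out.println("Method coverage: " + recorder.methodCoveragePercent() + "%");
    }
}
```

The "tests" here are just the two calls in main; a real run would drive the proxy from your actual test suite and turn the recorded hits into a report.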

Yet another approach to making improvements in this field is static analysis. This again-popular thing has lately been mixed up with pattern analysis (like many modern IDEs or the standalone FindBugs application do). But I'm talking about the old-fashioned kind - dealing with control and data flow analysis. Here's a list of such tools. However, I'm not suggesting that you use them or anything. I just googled this stuff.

So, using this complicated yet powerful technique one can do many interesting things. E.g. you can find dead spots in your code. Good one, right? It helps to keep the code free of methods made "...just for future use..." and never used at all. Thus, it increases the purity of the code and leaves fewer spots for bugs to hide.
Or you can analyze which parts of your code deserve the most attention because a lot of other places in the code are using/calling them. You might want to look at these spots more closely, simply to ensure they are bug-free.
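Both ideas above boil down to simple questions about a call graph. Here is a rough sketch, assuming somebody has already extracted the graph for you as a map from caller to callees (that extraction is the hard part and is not shown). The class name CallGraphSketch and the method names in the sample graph are invented for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class CallGraphSketch {

    // Methods that can never be reached from the given entry points ("dead spots").
    static Set<String> unreachable(Map<String, Set<String>> calls, Set<String> entryPoints) {
        Set<String> all = new TreeSet<>(calls.keySet());
        calls.values().forEach(all::addAll);

        Set<String> seen = new HashSet<>();
        Deque<String> work = new ArrayDeque<>(entryPoints);
        while (!work.isEmpty()) {
            String method = work.pop();
            if (seen.add(method)) {
                work.addAll(calls.getOrDefault(method, Set.of()));
            }
        }
        all.removeAll(seen);
        return all;
    }

    // How many distinct callers each method has; a high number marks a hot spot.
    static Map<String, Integer> fanIn(Map<String, Set<String>> calls) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Set<String> callees : calls.values()) {
            for (String callee : callees) {
                counts.merge(callee, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical call graph: caller -> callees.
        Map<String, Set<String>> calls = Map.of(
                "main", Set.of("parse", "render"),
                "parse", Set.of("log"),
                "render", Set.of("log"),
                "futureUse", Set.of("log"));

        System.out.println(unreachable(calls, Set.of("main"))); // [futureUse]
        System.out.println(fanIn(calls)); // {log=3, parse=1, render=1}
    }
}
```

Note that the fan-in count still includes calls coming from dead code, so you may want to drop the unreachable methods first. Reflection and dynamic dispatch are ignored entirely in this sketch, and those are exactly where real tools earn their keep.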

And obviously you can't do much without some special tools. Just a generic control flow graph generator won't help you - what would you do with a graph of 40+ thousand nodes? Print it all out and wrap it around your office? Neat, isn't it? But you might want to do something a little more useful. Like analyzing the parity of memory allocation/deallocation operations, opening/closing of streams, etc. Unfortunately, I wasn't able to find any free tools to suit such needs. And I don't want to do any commercials here. So, be my guest and find one if needed (please also drop me a note, will you?). I believe that notable software companies have to invest in their internal tools no matter what. It's simply not possible to buy some stuff off the shelf.
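Just to show what a parity check over a control flow graph could look like, here is a toy sketch. It assumes a tiny, acyclic graph that is already built, where every node is tagged as an open, a close or something else; the ParityCheckSketch class and its Node type are invented for this example. Loops, exceptions and aliasing are what make the real problem hard, and none of them are handled here.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class ParityCheckSketch {
    enum Op { OPEN, CLOSE, OTHER }

    // Hypothetical CFG node: one operation plus the ids of its successors.
    static final class Node {
        final Op op;
        final int[] next;
        Node(Op op, int... next) {
            this.op = op;
            this.next = next;
        }
    }

    // Returns every path from entry on which opens and closes do not pair up.
    // Assumes the graph has no cycles.
    static List<List<Integer>> unbalancedPaths(Node[] cfg, int entry) {
        List<List<Integer>> bad = new ArrayList<>();
        walk(cfg, entry, 0, new ArrayDeque<>(), bad);
        return bad;
    }

    private static void walk(Node[] cfg, int id, int openCount,
                             Deque<Integer> path, List<List<Integer>> bad) {
        Node node = cfg[id];
        path.addLast(id);
        if (node.op == Op.OPEN) openCount++;
        if (node.op == Op.CLOSE) openCount--;

        boolean closedTooOften = openCount < 0;
        boolean leakedAtExit = node.next.length == 0 && openCount != 0;
        if (closedTooOften || leakedAtExit) {
            bad.add(new ArrayList<>(path));
        } else {
            for (int successor : node.next) {
                walk(cfg, successor, openCount, path, bad);
            }
        }
        path.removeLast();
    }

    public static void main(String[] args) {
        Node[] cfg = {
            new Node(Op.OPEN, 1),      // 0: open a stream
            new Node(Op.OTHER, 2, 3),  // 1: branch
            new Node(Op.CLOSE, 4),     // 2: normal path closes it
            new Node(Op.OTHER),        // 3: early return, stream left open
            new Node(Op.OTHER)         // 4: normal exit
        };
        System.out.println(unbalancedPaths(cfg, 0)); // [[0, 1, 3]]
    }
}
```

Enumerating paths like this blows up quickly on anything of real size, which is one more reason why a serious tool for this job is not a weekend project.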

Kinda hint again: I'll continue on this topic next time. Stay tuned...

Cos

Comments
Comments are listed in date ascending order (oldest first)

  • Interestingly, a few days ago I started a discussion about the achievable coverage level on the junit mailing list. I still have a feeling that you can get 100% coverage (decision based), but most likely it will require changes in the application code to make it easier to test. Then you can use a junit-based test suite to fail on uncovered code. See http://groups.yahoo.com/group/junit/message/15430

    By the way, FindBugs actually does data flow analysis and can track down unclosed streams, jdbc connections, etc. If I am not mistaken, even across methods.

    Posted by: euxx on November 16, 2005 at 01:57 PM

  • Hey Eugene.

    Nice to hear from you again. As for your point about 100% coverage - why not. My concern is whether you really need it.
    And unit testing has some limitations, so it makes the solution even harder in some way.

    As for FindBugs - I might be wrong. My impression is based on the state of the application I got some time ago - it might not be up to date now. So, good for them :-)

    Posted by: cos on November 16, 2005 at 02:04 PM

  • Hi cos, see ODR and OS bug detectors at http://findbugs.sourceforge.net/bugDescriptions.html

    It's been pointed out that it actually depends on the definition of 100% coverage. For instance, "too simple" code does not need to be covered (e.g. getters, setters and similar constructs).

    However, my major concern was about potentially buggy new code that could be added to a bug-free codebase and not immediately covered by a test case. This is even more risky if the new additions introduce new decision branches in the code. A failed coverage test would immediately raise a red flag in this case.

    PS: say hi to the ex gang from me when you next visit St.Petersburg. :-)

    Posted by: euxx on November 16, 2005 at 04:16 PM

  • I agree with that new code thing - it should work.

    Sure, I will pass your regards on to them.
    Email me some time at kboudnik@gmail.com - we'll chat.

    Posted by: cos on November 16, 2005 at 04:28 PM




