Java. Quality. Metrics (part 6)

Posted by cos on March 15, 2006 at 8:20 PM EST
Hi there.

In this short article I'll try to summarize what I have been discussing over the last couple of months.

So, let's briefly list the key factors that are likely to affect our judgment of software quality:
  • our code quality expectations (good enough quality, remember?)
  • coverage isn't everything
  • code complexity and the frequency of changes
  • number of bugs filed against source code modules/files
  • testing methodologies

Although the last one doesn't sound like a beast, it can reduce the defect discovery rate a lot. The obvious example is the choice of approach to test failure analysis. A bulky one with a weak algorithm for detecting false positives pisses off engineers, and they begin ignoring warnings and notifications that may well be important.

Anyways, I want to talk about a combination of the first three bullets above.

About a year ago, a few of Sun's fellas were chatting about simpler ways of delivering better code. Static analysis and a variety of testing approaches were among the things on the table. At some point, the bright idea of mixing the two and adding some other flavors came up. Afterwards, we arrived at what was called Buggy Spots Prediction (BSP).

The idea itself is as follows:
  • we create a static call graph (CFG) for any given source code, using commercial or home-grown tools
  • having this, we can calculate a few things about the graph, e.g. the frequency of calls to any particular method; the frequency of calls from a method to other methods within the code; basic-block based complexity of the code, etc. (currently, we calculate somewhere between five and seven of them, e.g. coverage per function, basic blocks per function, et cetera)
  • when executing tests against the instrumented build, we can prepare a code coverage metric for it
  • combine these two lists of modules - from CFG and from code coverage runs - by module names
  • sort the resulting list ascending by in-call frequency and descending by coverage scores
  • let's assume that the most frequently called functions are, perhaps, the most important from the quality standpoint. Well, their code is called more often, so any problem in it will immediately affect top-level or at least quite important functionality.
  • if such methods have low coverage numbers and high complexity or a high number of reported bugs, then it is a good indication that the code has to be targeted by quality engineers and/or developers.
So, all that gives you a way of quickly selecting possibly buggy spots, i.e. the pieces of the source code which are likely to become a root of defects found by your customers. Why? Well, simply because the coverage is low in these areas, so the existing tests can't guarantee an acceptable level of quality there.
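
To make the merge-and-sort steps a bit more concrete, here is a minimal Java sketch. It is not the actual BSP code: the class, method and field names are made up for illustration, the inputs are assumed to be plain maps keyed by method name, and the thresholds in `hotSpots` are arbitrary assumptions. Complexity and bug counts are left out to keep it short.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from the real BSP tooling.
class BuggySpots {

    /** One row of the merged list: call-graph data joined with coverage by method name. */
    static final class Spot {
        final String method;
        final int inCalls;      // static in-call count taken from the call graph
        final double coverage;  // fraction of the method covered by tests, 0.0 to 1.0

        Spot(String method, int inCalls, double coverage) {
            this.method = method;
            this.inCalls = inCalls;
            this.coverage = coverage;
        }
    }

    /** Joins the two inputs by method name. Assumption: a method absent from the coverage data is uncovered. */
    static List<Spot> merge(Map<String, Integer> inCalls, Map<String, Double> coverage) {
        List<Spot> merged = new ArrayList<>();
        for (Map.Entry<String, Integer> e : inCalls.entrySet()) {
            double cov = coverage.getOrDefault(e.getKey(), 0.0);
            merged.add(new Spot(e.getKey(), e.getValue(), cov));
        }
        return merged;
    }

    /** Sorts ascending by in-call count, then descending by coverage, so hot spots sink to the tail. */
    static List<Spot> rank(List<Spot> spots) {
        List<Spot> sorted = new ArrayList<>(spots);
        sorted.sort(Comparator.comparingInt((Spot s) -> s.inCalls)
                .thenComparing(Comparator.comparingDouble((Spot s) -> s.coverage).reversed()));
        return sorted;
    }

    /** Keeps only frequently called, poorly covered methods. Both thresholds are caller-chosen assumptions. */
    static List<Spot> hotSpots(List<Spot> spots, int minInCalls, double maxCoverage) {
        List<Spot> hot = new ArrayList<>();
        for (Spot s : spots) {
            if (s.inCalls >= minInCalls && s.coverage <= maxCoverage) {
                hot.add(s);
            }
        }
        return hot;
    }
}
```

With this ordering the candidates worth a look sit at the end of the ranked list; flipping both comparators would put them at the top instead, which is purely a matter of taste.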

Like any heuristic approach, this one might produce incorrect results. However, our preliminary predictions are quite consistent with the fact that most of the externally reported defects were found in poorly covered but frequently called methods.

In an organization with limited QE resources, a manager might want to address such hot spots first. This helps to reach a good-enough quality level before concentrating on less important issues.

Yet another benefit is that the technique is language-independent. Once you build a universal representation for CFG and code coverage information, you can use the same engine to measure Java, C++, and programs written in other languages.
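
Just to illustrate what such a universal representation could look like, here is a small Java sketch. The semicolon-separated line format, the field set and all the names are my assumptions for the example, not the format used by the real tools. The point is only that once every language-specific front end emits the same kind of record, the engine behind it doesn't care where the numbers came from.

```java
// Hypothetical sketch of a language-neutral metrics record; the format is an assumption.
class NeutralMetrics {

    /** Per-function metrics, independent of the source language that produced them. */
    static final class FunctionMetrics {
        final String language;   // e.g. "java" or "c++", informational only
        final String name;       // fully qualified function or method name
        final int inCalls;       // static in-call count from the call graph
        final int basicBlocks;   // basic blocks in the function
        final double coverage;   // fraction covered by tests, 0.0 to 1.0

        FunctionMetrics(String language, String name, int inCalls, int basicBlocks, double coverage) {
            this.language = language;
            this.name = name;
            this.inCalls = inCalls;
            this.basicBlocks = basicBlocks;
            this.coverage = coverage;
        }
    }

    /** Parses one line of the assumed format: language;name;inCalls;basicBlocks;coverage */
    static FunctionMetrics parseLine(String line) {
        String[] f = line.split(";");
        if (f.length != 5) {
            throw new IllegalArgumentException("expected 5 fields: " + line);
        }
        return new FunctionMetrics(
                f[0].trim(),
                f[1].trim(),
                Integer.parseInt(f[2].trim()),
                Integer.parseInt(f[3].trim()),
                Double.parseDouble(f[4].trim()));
    }
}
```

A Java front end and a C++ front end would then both write lines such as `java;com.example.Parser.parse;40;12;0.15` or `c++;Parser::parse;40;12;0.15`, and the ranking engine reads them the same way.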

And, of course, our methodology doesn't replace human knowledge of the importance of product features. It gives engineers a useful projection of the static-to-runtime boundaries and helps them focus on some aspects of that complex matter.

And I just want to remind you about the Project Mustang (Java 6) Regressions Challenge. Please check here for more details.

CU,
Cos