
Java. Quality. Metrics (part 6)

Posted by cos on March 15, 2006 at 5:20 PM PST

Hi there.

In this short article I'll try to summarize what I've been discussing
over the last couple of months.

So, let's briefly list key factors that are likely to affect our
judgment of software quality.

- our code quality expectations (good enough quality, remember?)
- coverage isn't everything
- code complexity and the frequency of changes
- number of bugs filed against source code modules/files
- testing methodologies

Although the last one might not sound like much of a beast, it can
greatly reduce the effectiveness of your defect discovery rate. It
comes down to the choice of approach to test-failure analysis: a bulky
process with a weak false-positive detection algorithm annoys
engineers, and they begin ignoring even the most important warnings.

Anyways, I want to talk about a combination of the first three bullets.

About a year ago, a few of Sun's fellas were chatting about simpler
ways of delivering better code. Static analysis and a variety of
testing approaches were among the things on the table. At some point,
the bright idea appeared of mixing the two and adding some other
flavors. Afterwards, we came up with what was called Buggy Spots
Prediction (BSP).

The idea itself is as follows:

  • we create a static call graph (CFG) for any given source code,
    using commercial or home-grown tools
  • having this, we can calculate a few things about the graph,
    e.g. the frequency of calls to any particular method; the
    frequency of calls from a method to other methods within the
    code; the basic-block-based complexity of the code, etc.
    (currently, we calculate five to seven such metrics, e.g. coverage
    per function, basic blocks per function, et cetera)
  • when executing tests against an instrumented build, we can
    compute a code coverage metric for it
  • we combine these two lists of modules - from the CFG and from the
    code coverage runs - by module name
  • we sort the resulting list descending by in-call frequency and
    ascending by coverage score, so that the most frequently called,
    least covered methods come first
  • let's assume that the most frequently called methods are, perhaps,
    the most important from a quality standpoint. Their code runs
    more often, so any problem will immediately affect top-level, or
    at least quite important, functionality.
  • if such methods have low coverage numbers plus high complexity or
    a high number of reported bugs, that is a good indication that the
    code should be targeted by quality engineers and/or developers.
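The ranking step in the bullets above can be sketched in a few lines of Java. This is purely an illustration, not Sun's actual BSP implementation: the class and field names (`MethodStats`, `inCallFrequency`, `coverage`) are invented, and the ordering simply puts frequently called, poorly covered methods first, as the later bullets suggest is the intent.

```java
import java.util.*;

/** Hypothetical sketch of the BSP ranking step; all names are invented. */
public class BuggySpotRanker {

    /** One method's merged metrics from the call graph and coverage runs. */
    static final class MethodStats {
        final String name;
        final int inCallFrequency; // calls into this method, from the static call graph
        final double coverage;     // fraction of basic blocks exercised by tests

        MethodStats(String name, int inCallFrequency, double coverage) {
            this.name = name;
            this.inCallFrequency = inCallFrequency;
            this.coverage = coverage;
        }
    }

    /** Rank so that frequently called, poorly covered methods come first. */
    static List<MethodStats> rank(List<MethodStats> stats) {
        List<MethodStats> sorted = new ArrayList<>(stats);
        sorted.sort(Comparator
                .comparingInt((MethodStats m) -> m.inCallFrequency).reversed() // descending frequency
                .thenComparingDouble(m -> m.coverage));                        // ascending coverage
        return sorted;
    }

    public static void main(String[] args) {
        // Parser.parse is called often but poorly covered: the top "buggy spot".
        for (MethodStats m : rank(List.of(
                new MethodStats("Parser.parse", 120, 0.35),
                new MethodStats("Util.pad", 15, 0.90),
                new MethodStats("IO.readBlock", 120, 0.80)))) {
            System.out.println(m.name);
        }
    }
}
```

A real deployment would, of course, also fold in the complexity and bug-count signals mentioned above, e.g. as extra tie-breakers or as a weighted score.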

So, all that gives you a way of quickly selecting possibly buggy
spots, i.e. the pieces of the source code which are likely to become
the root of defects found by your customers. Why? Simply because
coverage is low in these areas, so the existing tests don't guarantee
an acceptable level of quality.

As with any heuristic approach, this one might produce incorrect
results. However, our preliminary predictions are quite consistent
with the fact that most externally reported defects were found in
poorly covered but frequently called methods.

In an organization with limited QE resources, a manager might want to
address such hot spots first. This will help achieve a good-enough
quality level before concentrating on less important issues.

Yet another benefit is that the technique is language
independent. Once you build a universal representation for the CFG and
the code coverage information, you can use the same engine to measure
Java, C++, and programs written in other languages.
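One way to picture that universal representation is a plain name-to-number mapping per tool, joined by fully qualified method name; since nothing in the merge depends on the source language, the same engine works for any front end. The sketch below is an assumption about how such a merge could look, not a description of the actual tooling.

```java
import java.util.*;

/** Hypothetical language-neutral merge of call-graph and coverage data. */
public class MetricsMerger {

    /**
     * Inner join of in-call frequencies and coverage scores by method name.
     * Returns name -> {inCallFrequency, coverage}; only methods present in
     * both inputs survive, matching the "combine by module names" step.
     */
    static Map<String, double[]> merge(Map<String, Integer> inCalls,
                                       Map<String, Double> coverage) {
        Map<String, double[]> merged = new TreeMap<>();
        for (Map.Entry<String, Integer> e : inCalls.entrySet()) {
            Double cov = coverage.get(e.getKey());
            if (cov != null) {
                merged.put(e.getKey(), new double[] { e.getValue(), cov });
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, double[]> m = merge(
                Map.of("Parser.parse", 120, "Util.pad", 15),
                Map.of("Parser.parse", 0.35, "IO.readBlock", 0.80));
        System.out.println(m.keySet()); // only the method seen by both tools
    }
}
```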

And, of course, our methodology doesn't replace human knowledge of
the importance of product features. It helps engineers see a valuable
projection across the static-to-runtime boundary and helps them focus
on some aspects of that complex matter.

And I just want to remind you about the Project Mustang (Java 6)
Regressions Challenge. Please check here for more details.


