JSR 292 Goodness: Fast code coverage tool in less than 10k

Posted by forax on February 12, 2011 at 5:51 AM PST

JSR 292 introduces a new bytecode instruction, invokedynamic, but also several new kinds of constant pool entries. This means that most of the tools that parse bytecode, like ASM, BCEL, FindBugs or EMMA, will need to be updated to be Java 7 compatible.
EMMA is a code coverage tool, a tool that helps developers know whether their tests cover all the code of the application. While it's not the only code coverage tool available for Java, it's the most popular in my personal experience.
In this blog entry, I would like to show how to write a simple code coverage tool, indycov, that uses the JSR 292 API to get a runtime overhead close to zero.

How does a code coverage tool work?

A code coverage tool records all the paths taken while the application runs and checks at the end whether every line of code was recorded.
For example, if I run the code below with no argument, it will print "foo" and "bar", and the code coverage tool will report that the else branch that prints "baz" is not covered.

  public static void main(String[] args) {
    System.out.println("foo");
    if (args.length == 0) {
      System.out.println("bar");
    } else {
      System.out.println("baz");
    }
  }

To record whether an instruction was executed, code coverage tools add a probe, a small amount of bytecode that calls the runtime of the tool to say: "I have visited this instruction".
In fact, tools only add probes where necessary, at the beginning of each basic block of the control flow. A basic block is a sequence of instructions without any jump (return, throw, if, break, etc.).
For example, the code above has 4 basic blocks: the one printing "foo", the one printing "bar", the one printing "baz" and the one containing the return at the end of the method.

So a code coverage tool is a tool that finds basic blocks and adds a probe at the beginning of each one.
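
To make the idea of a probe concrete, here is a minimal source-level sketch of what a conventional tool does to the example method. The names ClassicProbes, VISITED and hit() are hypothetical and only serve the illustration; real tools work on bytecode and are not necessarily implemented this way.

```java
import java.util.BitSet;

// Hypothetical names: ClassicProbes, VISITED and hit() are illustrative only;
// they are not taken from EMMA or from the indycov prototype.
final class ClassicProbes {
  // One bit per basic block, numbered 1 to 4.
  static final BitSet VISITED = new BitSet();

  // The probe: runs every time control enters a basic block.
  static void hit(int block) {
    VISITED.set(block);
  }

  // Source-level view of the example method once probes are added.
  static void run(String[] args) {
    hit(1);
    System.out.println("foo");
    if (args.length == 0) {
      hit(2);
      System.out.println("bar");
    } else {
      hit(3);
      System.out.println("baz");
    }
    hit(4);
  }

  public static void main(String[] args) {
    run(args);
    // Any block whose bit is still clear was never executed.
    for (int block = 1; block <= 4; block++) {
      if (!VISITED.get(block)) {
        System.out.println("basic block " + block + " not covered");
      }
    }
  }
}
```

Note that hit() runs on every pass through a block, even though the first call already recorded everything there is to know.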

Using the JSR 292 API to implement a code coverage tool

Finding basic blocks is easy with bytecode that comes from a 1.6 or 1.7 compiler because the compiler is required to add stack maps
to the bytecode flow. Stack maps are used to verify the bytecode in linear time and are inserted at the join points of the control flow.
So finding basic blocks in 1.7-compatible bytecode can be done in one pass thanks to the stack map information inserted by the compiler.
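
As a rough sketch of that single pass, the toy model below marks an instruction as the start of a basic block when it is the first instruction, when it follows a jump, or when a stack map frame sits in front of it. This is not ASM code: BlockLeaders, Insn and the two boolean flags are hypothetical simplifications, and the instruction list is the example main method written out by hand.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model, not ASM: each instruction only says whether a stack map frame
// precedes it and whether it ends a block (IFxx, GOTO, RETURN, ATHROW, ...).
final class BlockLeaders {
  static final class Insn {
    final String text;
    final boolean hasFrame;   // a stack map frame sits right before this instruction
    final boolean endsBlock;  // a jump, return or throw

    Insn(String text, boolean hasFrame, boolean endsBlock) {
      this.text = text;
      this.hasFrame = hasFrame;
      this.endsBlock = endsBlock;
    }
  }

  // One forward pass: returns the indexes of the instructions that start a basic block.
  static List<Integer> leaders(List<Insn> code) {
    List<Integer> result = new ArrayList<>();
    boolean startsBlock = true;  // the first instruction always starts a block
    for (int i = 0; i < code.size(); i++) {
      Insn insn = code.get(i);
      if (startsBlock || insn.hasFrame) {
        result.add(i);
      }
      startsBlock = insn.endsBlock;  // whatever follows a jump starts a new block
    }
    return result;
  }

  // The example main method before instrumentation, written out by hand.
  static List<Insn> example() {
    return Arrays.asList(
        new Insn("GETSTATIC System.out", false, false),
        new Insn("LDC \"foo\"", false, false),
        new Insn("INVOKEVIRTUAL println", false, false),
        new Insn("ALOAD 0", false, false),
        new Insn("ARRAYLENGTH", false, false),
        new Insn("IFNE L0", false, true),
        new Insn("GETSTATIC System.out", false, false),
        new Insn("LDC \"bar\"", false, false),
        new Insn("INVOKEVIRTUAL println", false, false),
        new Insn("GOTO L1", false, true),
        new Insn("GETSTATIC System.out", true, false),   // L0, FRAME SAME
        new Insn("LDC \"baz\"", false, false),
        new Insn("INVOKEVIRTUAL println", false, false),
        new Insn("RETURN", true, true));                 // L1, FRAME SAME
  }

  public static void main(String[] args) {
    // Prints the four positions where a probe has to be inserted.
    System.out.println(leaders(example()));
  }
}
```

The four indexes returned for the example correspond to the four basic blocks counted earlier, so a probe goes in front of each of them.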

All existing code coverage tools have an impact on the performance of the application because the code of the probe is executed each time a basic block is entered, even though it only needs to be executed once.
If you are a regular reader of this blog, you already know how to create a probe that will be executed only once. The trick is to use invokedynamic, to record the visit in the bootstrap method and
to use a target method handle that is equivalent to a no-op. So subsequent calls will not execute any code.

main([Ljava/lang/String;)V
    INVOKEDYNAMIC probe ()V [fr/umlv/indycov/RT#bsm, 1]
    GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
    LDC "foo"
    INVOKEVIRTUAL java/io/PrintStream.println (Ljava/lang/String;)V
    ALOAD 0
    ARRAYLENGTH
    IFNE L0
    INVOKEDYNAMIC probe ()V [fr/umlv/indycov/RT#bsm, 2]
    GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
    LDC "bar"
    INVOKEVIRTUAL java/io/PrintStream.println (Ljava/lang/String;)V
    GOTO L1
    L0
    FRAME SAME
    INVOKEDYNAMIC probe ()V [fr/umlv/indycov/RT#bsm, 3]
    GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
    LDC "baz"
    INVOKEVIRTUAL java/io/PrintStream.println (Ljava/lang/String;)V
    L1
    FRAME SAME
    INVOKEDYNAMIC probe ()V [fr/umlv/indycov/RT#bsm, 4]
    RETURN

A no-op is a method handle that takes nothing and returns void. This method handle can be retrieved with MethodHandles.identity(void.class).
So the bootstrap method is the following. Its first line records that the basic block with number 'index' has been visited.

  public static CallSite bsm(Lookup lookup, String name, MethodType type, Object index) {
    classValue.get(lookup.lookupClass()).cover((Integer)index);
    return new ConstantCallSite(MethodHandles.identity(void.class));
  }
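
To see the "executed once" behaviour without generating bytecode, the sketch below links a call site by hand, which is what the JVM does the first time it runs an invokedynamic instruction, and then calls the target many times. Everything here is an assumption made for the illustration: OnceProbeDemo, COVERAGE, bootstrapCalls and linkAndCall are hypothetical names, a BitSet stands in for the coverage data of the prototype, and the no-op target is an empty private method found with findStatic, because as far as I know the released java.lang.invoke API rejects void.class in MethodHandles.identity.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.BitSet;

final class OnceProbeDemo {
  // Hypothetical stand-in for the per-class coverage data of the prototype.
  static final ClassValue<BitSet> COVERAGE = new ClassValue<BitSet>() {
    @Override
    protected BitSet computeValue(Class<?> type) {
      return new BitSet();
    }
  };

  // Counts how many times the bootstrap method has run.
  static int bootstrapCalls;

  // Target of every linked probe: does nothing.
  private static void noop() { }

  // Same shape as the bootstrap method above: record the visit, return a no-op.
  static CallSite bsm(MethodHandles.Lookup lookup, String name, MethodType type, Object index)
      throws ReflectiveOperationException {
    bootstrapCalls++;
    COVERAGE.get(lookup.lookupClass()).set((Integer) index);
    MethodHandle target = MethodHandles.lookup().findStatic(OnceProbeDemo.class, "noop", type);
    return new ConstantCallSite(target);
  }

  // Links one probe by hand, then calls it 'times' times.
  static void linkAndCall(int index, int times) {
    try {
      MethodType type = MethodType.methodType(void.class);
      CallSite site = bsm(MethodHandles.lookup(), "probe", type, index);
      MethodHandle probe = site.dynamicInvoker();
      for (int i = 0; i < times; i++) {
        probe.invokeExact();  // only runs noop(), never bsm()
      }
    } catch (Throwable t) {
      throw new IllegalStateException(t);
    }
  }

  public static void main(String[] args) {
    linkAndCall(2, 1000);
    System.out.println("bootstrap calls: " + bootstrapCalls);
    System.out.println("covered blocks: " + COVERAGE.get(OnceProbeDemo.class));
  }
}
```

The bootstrap method runs once per call site however often the probe is invoked afterwards, and since the target of a ConstantCallSite never changes, the JIT is free to inline the empty method away.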

The code of the prototype is freely available (as an attachment to this blog entry) and works as a Java agent.
It relies on ASM 4 (still in beta) to do the bytecode transformation.

Side note: this prototype doesn't handle runtime exceptions correctly. For example, if an NPE is thrown, it will escape from the basic block without finishing it.
How to modify the prototype to take care of exceptions is left to interested readers.

Running the code with one argument "foo"

  java -XX:+UnlockExperimentalVMOptions -XX:+EnableInvokeDynamic -javaagent:lib/indycov.jar -cp test-classes/ TestCoverage foo

will print

  foo
  baz
  TestCoverage: no coverage for line(s) 2 to 2
  TestCoverage: no coverage for line(s) 5 to 6

Line 2 is the declaration of the class; it is reported because javac adds a default constructor, which is never called.
Lines 5 to 6 are the ones that print "bar".

If you want to play with it, don't forget to compile your sources with the debug flag on. Otherwise, the generated bytecode will not contain the mapping between opcodes and line numbers.

Cheers,
Rémi

Comments

Hello Rémi,

This is some really interesting stuff. Do you know about the JaCoCo project? It is the successor of EMMA and is currently under development. It is already very stable and we use it here for our code coverage statistics. I do not know how JaCoCo works internally, but they might be interested in your ideas to optimize it for Java 7.

Thanks for your great blog posts,

- Bernd Rosstauscher

Rémi,
This is great stuff, thank you. I like this post for a few reasons:
1. I've been a happy Emma user for years, but it seemed a bit crufty as a project and I worried that it might crash on the rocks of JDK 7. No more worries!
2. This is a great example of what is possible with INVOKEDYNAMIC, complete with source code.
One request. There doesn't seem to be any licensing statement in the source code. Can you give permission for others to copy and use your code? It's implicit in what you write, and I would have no problem using it now on a personal project. However, if someone wanted to use this code to start a project, or to include it as part of an existing project, they would probably appreciate a statement of what, if any, conditions there are on using your code.
Thanks again!
regards,
-Frank Hardisty
http://www.personal.psu.edu/fah109/