The Source for Java Technology Collaboration
User: Password:



Eamonn McManus

Eamonn McManus's Blog

A helper class for performance statistics

Posted by emcmanus on November 23, 2006 at 03:31 AM | Comments (4)

I recently wanted to add some performance measurements to an application. To avoid duplicating code everywhere I needed to make measurements, I coded up a small helper class.

What I wanted to measure was the amount of time taken by various operations within the application. For each operation, I wanted to know various statistics about it: how many times it was called, the average time it took, and also the minimum, maximum, and standard deviation.

The basic skeleton for timing code is this:

    long startTime = System.nanoTime();
    doTheOperation();
    long elapsedTime = System.nanoTime() - startTime;

The System.nanoTime() method was introduced in the J2SE 5.0 release. Its intended use is exactly for timing measurements such as this. As the name implies, the result is in nanoseconds, though it will not usually be accurate to the nearest nanosecond.

If we want to accumulate statistics about doTheOperation(), then we'll need more code than that. It'll be more like:

    long startTime = System.nanoTime();
    doTheOperation();
    long elapsedTime = System.nanoTime() - startTime;
    synchronized (something) {
        nCalls++;
        sumTime += elapsedTime;
    	minTime = Math.min(minTime, elapsedTime);
        maxTime = Math.max(maxTime, elapsedTime);
    }

Not only does this pollute the doTheOperation() call with lots of distracting code, but we'll also be obliged to duplicate code around every other operation we want to measure. So obviously we'll want to pull this code out into a class. We'd like the measurement code to look something like this:

public class ClassContainingTheOperationToBeMeasured {
    private final CallStats operationStats = new CallStats();

    ...
        CallStats.Call call = operationStats.newCall();
        doTheOperation();
        call.end();
    ...
}

For each collection of timing statistics, we'll need an instance of CallStats to record the statistics. The use of this instance is just the line before and the line after doTheOperation(). We probably can't simplify more than that.

So here's the class I coded for this:

import java.beans.ConstructorProperties;
import java.io.Serializable;

public class CallStats implements Serializable {
    private int nCalls;
    private long minTime;
    private long maxTime;
    private long sumTime;
    private double sumSquaredTime;
    
    /** Creates a new instance of CallStats */
    public CallStats() {
    }
    
    @ConstructorProperties({"nCalls", "minTime", "maxTime", "sumTime",
    	    	    	    "sumSquaredTime"})
    public CallStats(int nCalls, long minTime, long maxTime, long sumTime,
    	    	     double sumSquaredTime) {
        this.nCalls = nCalls;
        this.minTime = minTime;
        this.maxTime = maxTime;
        this.sumTime = sumTime;
        this.sumSquaredTime = sumSquaredTime;
    }
    
    public int getNCalls() {
        return nCalls;
    }
    
    public long getMinTime() {
        return Math.max(0, minTime);
    }
    
    public long getMaxTime() {
        return maxTime;
    }
    
    public long getSumTime() {
        return sumTime;
    }
    
    public double getSumSquaredTime() {
        return sumSquaredTime;
    }
    
    public long getMeanTime() {
        if (nCalls == 0)
            return 0;
        else
            return sumTime / nCalls;
    }
    
    public long getStdDevTime() {
        if (nCalls == 0)
            return 0;
        double mean = getMeanTime();
        double meanSquared = mean * mean;
        double meanOfSquares = sumSquaredTime / nCalls;
        double variance = meanOfSquares - meanSquared;
        if (variance < 0)
            return 0;  // might happen due to rounding error
        return (long) Math.sqrt(variance);
    }
    
    public String toString() {
        return "nCalls = " + nCalls + ";\n" +
                "sumTime = " + getSumTime() + ";\n" +
                "minTime = " + minTime + ";\n" +
                "maxTime = " + maxTime + ";\n" +
                "meanTime = " + getMeanTime() + ";\n" +
                "stdDevTime = " + getStdDevTime();
    }
    
    public Call newCall() {
        return new Call();
    }
    
    public class Call {
        private final long startTime = System.nanoTime();
        private Call() {}
        
        public void end() {
            long elapsed = System.nanoTime() - startTime;
            synchronized (CallStats.this) {
                nCalls++;
                if (elapsed < minTime || nCalls == 1)
                    minTime = elapsed;
                if (elapsed > maxTime)
                    maxTime = elapsed;
                sumTime += elapsed;
                double elapsedFP = elapsed;
                sumSquaredTime += elapsedFP * elapsedFP;
            }
        }
    }
}

A few remarks are in order. First, the @ConstructorProperties annotation is new in the Java SE 6 release, so if you're using an earlier release you'll need to comment out that line. The purpose of the annotation is to allow us to use CallStats directly as the type of an attribute in an MXBean, like this:

public interface MeasurementMXBean {
    public CallStats getFooStats();
    public CallStats getBarStats();
}

The presence of the @ConstructorProperties annotation allows us to reconstruct a CallStats instance for an MXBean proxy, like this:

    MeasurementMXBean proxy =
    	JMX.newMXBeanProxy(mbeanServer, name, MeasureMXBean.class);
    CallStats stats = proxy.getFooStats();

Second, there are a lot of nanoseconds in a second. So we have to be on the lookout for overflow. A long is enough to hold 292 years' worth of nanoseconds without overflowing, so you won't usually have problems if you use long everywhere. But we're also accumulating squared nanoseconds so that we can compute the standard deviation. We would get overflow if we squared any value over 3 seconds' worth of nanoseconds as a long. So instead we square the values as doubles and also sum them as doubles. This means the standard deviation might not be as accurate as if we had used BigInteger, but I'm not sure it matters. I think there's a small possibility that rounding errors would lead us to compute a negative variance when the actual variance is very close to zero, so I've guarded against that case.

Finally, you might have noticed an asymmetry between the minTime and maxTime statistics. minTime needs a special check for the first call, because otherwise the initial default value of 0 would never change. maxTime doesn't, because the first measurement will be greater than the initial zero value. We could correct the asymmetry by initializing minTime to Long.MAX_VALUE, but at the expense of reporting a very odd minTime for a CallStats that hasn't recorded any measurements yet.


Bookmark blog post: del.icio.us del.icio.us Digg Digg DZone DZone Furl Furl Reddit Reddit
Comments
Comments are listed in date ascending order (oldest first) | Post Comment

  • Considered JAMon?

    Posted by: diathesis on November 23, 2006 at 04:41 PM

  • +1 on JAMon

    Kind regards,
    Kirk Pepperdine

    Posted by: kcpeppe on November 23, 2006 at 11:01 PM

  • +1 on JAMoN.
    This approach is the basis of my own Java EE monitoring utility: MessAdmin, the difference being that I don't measure method invocation times but more "macro" statistics, and that I am totally non-intrusive! Feel free to have a look: it is fully open-source (BSD license), and there is a live demo on the project web site.
    Cédrik

    Posted by: applebanana8 on November 28, 2006 at 06:48 AM

  • It should be noted that none of the tools promoted by Kirk and CedriK would be considered useful within the context of professional Software Performance Engineering (SPE) as the do not address many aspects in determining a performance model that could be useful across various phases of the application life cycle. These tools have very little contextual information (resource consumption, JVM events (GC), thread monitor contention, transactional context, (dist) trace context, call stacks, system/component runtime state, metrics,...) that would be useful in resolving real-world performance problems that could not be easily solved by merely looking at the code itself.

    Posted by: wlouth on December 21, 2006 at 03:21 AM



Only logged in users may post comments. Login Here.


Powered by
Movable Type 3.01D
 Feed java.net RSS Feeds