Java. Quality. Metrics
It is perhaps no secret that software test development and quality are like a snowball rolling down a hill: the further down the slope it gets, the harder it is to stop... and think. Think about what was done right, what was missed, and how I could do this better if only I had another chance.
We have all heard (or know from real experience) about a number of testing types: functional, unit, white box and black box, and so on, and so on. But what really drives most test development efforts? Ok, ok, I know - everyone wants to find that ugly bug sitting next to the last one :-) But how can we know if that bug is uglier than the others? What are the criteria for this? What tools have to be brought into the process? What additional testing techniques do we need to introduce? And last, but not least - where should the efforts of test development engineers be directed to reduce the cost of quality, e.g. by finding more bugs at earlier stages of development?
Now, multiply most of the above by the number of platforms your product runs on, and you will start to see a picture similar to the one we see ourselves in the Java Standard Edition Quality Organization.
In the following series of posts I plan to talk about these and some other issues of software testing and quality measurement. I will share our static analysis practices (not to be confused with the FindBugs application; the static analysis I have in mind is somewhat different from that technique); the stages of test development and test execution we have in the JavaSE production cycle; and the tools and techniques we use to increase ROI and free our engineers to do something more sophisticated and exciting than merely executing tests manually.
I have to admit that I really like Test Driven Development (TDD). It sounds so cool to develop all your tests first and then simply make your software pass them. I wish all software quality problems were that well defined and could be solved that cleanly. Unfortunately, they aren't. There are a lot of issues that can't be foreseen and framed as tests in advance, and the more complex a system becomes, the less efficient this approach is. As time and development go on, a quality department has to introduce more complicated methodologies and trickier techniques.
To be more specific, let me quickly illustrate my point: in Java Virtual Machine (VM) testing alone we run more than a million test cases across several testing cycles, e.g. nightly, pre-integration, weekly, regression, et cetera. We do separate stress testing and BigApps (real-world applications) testing. We support about two dozen (sic!) platforms. Did I mention that we walk dogs too?
Now, let me get to the point and talk about real things. No wonder that to control such a crowd you have to be really creative. And we are. We use several distributed execution environments. One of them is the home-grown Distributed Tasks Framework (DTF; patents pending), a 100% pure Java application based on Jini. Another is Grid Engine from Sun Microsystems. The two applications are quite similar in functionality. However, Grid Engine is an officially supported product, and I gave up on further DTF maintenance. Now we use it just for scheduling and executing tasks on the Wintel platform.
Another thing I'd like to mention here is the multi-platform test harness Tonga (patents pending), which supports distributed testing scenarios and can do a lot of other things. I hope to see this great product in open source some day. You can read more about test frameworks and test harnesses at my colleague David Herron's blog.
It is obviously also important to have an efficient solution for test results analysis. An ideal system would have a low level of false positives; automated detection of regressions and known bugs; run-to-run comparison to find trends; and many other qualities. Of course, a system like this relies on a bug (or issue) tracking system of some kind. For a generic approach you can pick one of the well-known systems like Bugzilla. However, larger companies often go with their own products. Some day I will talk more about the result analysis applications we use in our process.
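To make the run-to-run comparison idea a bit more concrete, here is a minimal sketch in Java. It is not the tool we actually use; the class name RunComparison, the verdict names and the idea of passing in a set of test names that already have a bug filed are all assumptions made for illustration. It simply compares two runs and separates regressions from failures that are already tracked.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch: classify the failures of the current run against the previous one.
final class RunComparison {
    enum Status { PASSED, FAILED }

    enum Verdict {
        REGRESSION, // passed in the previous run, fails now, no bug filed
        KNOWN_BUG,  // fails now and a bug is already filed for it
        UNTRIAGED,  // fails now, no bug filed, and no passing previous result
        FIXED       // failed in the previous run, passes now
    }

    static Map<String, Verdict> compare(Map<String, Status> previous,
                                        Map<String, Status> current,
                                        Set<String> testsWithKnownBugs) {
        Map<String, Verdict> report = new TreeMap<>(); // sorted by test name
        for (Map.Entry<String, Status> entry : current.entrySet()) {
            String test = entry.getKey();
            Status now = entry.getValue();
            Status before = previous.get(test); // null when the test is new
            if (now == Status.FAILED) {
                if (testsWithKnownBugs.contains(test)) {
                    report.put(test, Verdict.KNOWN_BUG);
                } else if (before == Status.PASSED) {
                    report.put(test, Verdict.REGRESSION);
                } else {
                    report.put(test, Verdict.UNTRIAGED);
                }
            } else if (before == Status.FAILED) {
                report.put(test, Verdict.FIXED);
            }
            // Tests that passed in both runs (or are new and pass) are not reported.
        }
        return report;
    }

    public static void main(String[] args) {
        // Made-up test names, for illustration only.
        Map<String, Status> nightly1 = Map.of("gc/Alloc", Status.PASSED, "jit/Inline", Status.FAILED);
        Map<String, Status> nightly2 = Map.of("gc/Alloc", Status.FAILED, "jit/Inline", Status.PASSED);
        System.out.println(compare(nightly1, nightly2, Set.of()));
    }
}
```

In a real setup the set of known failures would come from the bug tracking system, and the comparison would span more than two runs to show trends.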
Measuring test effectiveness becomes a high priority at a certain point in a product's development, and test coverage is one reasonable way to measure it. Indeed, it is cool to see that your test coverage has increased by 12% since the last release. It makes you proud and confident in the quality of your product. However, relying on this metric alone might not be enough. For example, say a product's latest release includes 27 new features, the new source code amounts to 270K lines, and test coverage has increased by 12% to reach 68% overall. Hmm, I wonder if this is good or bad? Or exactly how good is it? Shall we celebrate such an achievement or do something about it? So test coverage isn't the only metric and shouldn't be treated as a panacea of software quality measurement. The situation just gets more complex when a product consists of both native and Java source code.
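A quick back-of-the-envelope calculation shows why the overall number alone says so little. The sketch below is only an illustration under stated assumptions: it reads "12%" as 12 percentage points (56% before, 68% after), it assumes the coverage of the old code did not change, and the old code sizes are invented, since the example does not give one. The class and method names are hypothetical too.

```java
// Hypothetical sketch: what coverage must the NEW code have for the overall
// figure to come out as reported, if coverage of the old code stayed the same?
final class CoverageSketch {
    static double impliedNewCodeCoverage(long oldLines, long newLines,
                                         double oldCoverage, double overallCoverage) {
        double coveredTotal = overallCoverage * (oldLines + newLines); // covered lines after the release
        double coveredOld = oldCoverage * oldLines;                    // covered lines in the old code
        return (coveredTotal - coveredOld) / newLines;                 // share of new lines that must be covered
    }

    public static void main(String[] args) {
        long newLines = 270_000; // from the example in the text
        double before = 0.56;    // assumption: 68% minus 12 percentage points
        double after = 0.68;     // from the example in the text
        long[] assumedOldSizes = {250_000, 500_000, 1_000_000}; // invented for illustration
        for (long oldLines : assumedOldSizes) {
            double implied = impliedNewCodeCoverage(oldLines, newLines, before, after);
            System.out.printf("old code %d lines: implied new-code coverage %.0f%%%n",
                    oldLines, implied * 100);
        }
    }
}
```

Depending on the assumed size of the old code, the same "68% overall" implies roughly 79% or 90% coverage of the new code, or a value above 100%, which can only mean that coverage of the old code must have improved as well. The headline number cannot tell these cases apart.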
On this optimistic note I will close today's post, so that something is left for further discussion.
A sort of disclaimer, though: this is my very first blogging experience ever, so bear with my over-enthusiasm and leave me a comment or suggestions of topics you would like to hear about or discuss further. See you soon!