
Testing Command-Line Applications Using the Golden-File Approach

Posted by alexeyp on February 17, 2007 at 12:39 PM PST

This article is addressed to those interested in the JavaTest(TM)
harness and its open source version, the JT harness.
Its goal is to serve as an example of using the JavaTest harness outside
the world of productized test suites.

The story describes our experience of using the JavaTest-style test
format, and JavaTest itself as a test harness, for functional and
regression tests of a command-line application.


The specific command-line application being tested is ApiCover, one of
the components provided to JCP members in the JCTT package. Briefly, it
takes a description of the Java(TM) API to test and a pointer to the
test classes, and calculates an estimate of the test coverage. The tool
supports multiple report formats.

The most frequently observed approach to functional test development in
such simple cases is to have a set of shell scripts placed in a
directory hierarchy. Each of these scripts performs one simple check and
reports its result to the console. Tests are run by a trivial test
harness, usually another shell script, that iterates over the tests and
reports their joint status. Test specifications usually come separately
from the tests.
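The naive harness described above can be sketched in a few lines of shell. This is an illustration of the pattern, not any actual harness; the directory layout and naming convention are assumptions:

```shell
# Trivial harness sketch: walk a directory tree of test scripts, run
# each one, and report a joint pass/fail summary on the console.
run_all_tests() {
    passed=0 failed=0
    for t in "$1"/*/*.sh; do
        [ -f "$t" ] || continue          # no scripts matched the pattern
        if sh "$t" > /dev/null 2>&1; then
            echo "PASS: $t"; passed=$((passed + 1))
        else
            echo "FAIL: $t"; failed=$((failed + 1))
        fi
    done
    echo "total: $((passed + failed)), passed: $passed, failed: $failed"
    [ "$failed" -eq 0 ]                  # overall status: 0 only if all passed
}
```

Such a harness is easy to write, but everything beyond iteration and a pass/fail count (filtering, reporting, multiple environments) has to be bolted on by hand.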

How it was done here

In the case of ApiCover the test suite was JavaTest-based. The choice
was quite natural, given the long history of using JavaTest for testing
command-line tools such as the Java compiler or the CLDC preverifier.

Every individual
test description in this test suite contains information on how to
launch the
tool and how its output should be verified. All tests in this test
suite are written using the "golden-file" approach. The golden data
consists of:

  • the application output streams, both stdout and stderr
  • the exit code
  • the set of output files

The test suite is executed using the JavaTest harness and a set of
trivial scripts responsible for launching the application under test,
capturing and storing its output, and comparing the results with the
stored data. The main idea is to make it simple to launch ApiCover in
multiple modes and verify the result. Result verification may be a
textual diff or XML validation; storing the data is optional.
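The capture-and-compare step can be sketched in shell. The helper below is hypothetical (not the actual JCTT scripts); the directory layout, the SETUP_MODE switch, and the use of `echo` as a stand-in application are all assumptions for illustration:

```shell
# Minimal golden-file check: run a command, capture stdout, stderr, and
# the exit code, then either record them as golden data (setup mode) or
# diff them against previously stored golden data.
run_golden_test() {
    name=$1; shift                      # test id, e.g. report000
    golden_dir=golden/$name
    out_dir=work/$name
    mkdir -p "$out_dir"

    # Launch the application under test, capturing all three channels.
    "$@" > "$out_dir/stdout" 2> "$out_dir/stderr"
    echo $? > "$out_dir/exitcode"

    if [ "$SETUP_MODE" = "yes" ]; then
        # 'setup' mode: store the captured output as the new golden data.
        mkdir -p "$golden_dir"
        cp "$out_dir/stdout" "$out_dir/stderr" "$out_dir/exitcode" "$golden_dir"
        echo "$name: golden data recorded"
        return 0
    fi

    # Normal mode: textual diff of each channel against the golden data.
    for f in stdout stderr exitcode; do
        diff "$golden_dir/$f" "$out_dir/$f" || { echo "$name: FAILED ($f)"; return 1; }
    done
    echo "$name: PASSED"
}

# Example run, with a trivial command standing in for the tool under test.
SETUP_MODE=yes; run_golden_test demo echo hello
SETUP_MODE=no;  run_golden_test demo echo hello
```

In the real suite this launching, storing, and comparing is driven by the JavaTest execution script rather than invoked by hand.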

The development scenario for a new test is as follows:

  • Create the test description, including:
    • the test case specification (textual, optional)
    • the parameters and the comparison procedure to use
  • Run the test in 'setup' mode
  • Validate the output and mark it as the 'golden file'

When the application changes and a test starts to fail, validate
the new output and mark it as the 'golden file' if appropriate.

Test Description example:


This test verifies that the tool includes all public/protected class
members in the basic report when the -detail option is specified with
the value '4' and the -format value is 'xml'.

Test Descriptions

Test cases included:


title         report000
executeArgs   -apiinclude testapi -tck tcks/all/classes -api -detail 4 -format xml
keywords      runtime
executeClass  diff

What it means:

  • Description
    - Contains the verbal Test Case Specification
  • The table
    describes the actual test cases that will be executed. The
    description is written in the language of JavaTest's HTMLTestFinder.
    Below you can see how these specific fields are interpreted by the
    execution Script used for this test suite:

    title         Unique ID of the test case
    executeArgs   Command-line arguments to pass to the application
                  under test
    keywords      Used by the test filtering engine of the JavaTest
                  harness; not used in this test suite
    executeClass  Which of the predefined golden-file comparators to use
                  for this test
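As a sketch of how the executeClass value might select a comparator, here is a hypothetical dispatcher in shell. The comparator names (`diff`, `xmlvalid`) and the checks they perform are illustrative assumptions; the real suite's predefined comparators may differ:

```shell
# Hypothetical comparator dispatcher: the 'executeClass' value from the
# test description picks how the captured output is checked.
compare_result() {
    comparator=$1
    golden=$2
    actual=$3
    case $comparator in
        diff)
            # Plain textual comparison against the golden file.
            diff "$golden" "$actual"
            ;;
        xmlvalid)
            # Structure-only check: the output must be well-formed XML;
            # no golden data is consulted by this comparator.
            xmllint --noout "$actual"
            ;;
        *)
            echo "unknown comparator: $comparator" >&2
            return 2
            ;;
    esac
}
```

Keeping the comparator set small and predefined is what keeps test development fast: a new test only names a comparator, it never reimplements one.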

Pros and Cons 


  • All the traditional benefits of using a specialized test harness
    apply: test execution and test result management; test
    specification, source, and test result browsing; parallel test
    execution; test exclusion and filtering; support for multiple
    environments (JDK versions, for example); and so on.
  • We got a highly automated and maintainable test suite and
    preserved high consistency across all tests
  • Test development is fast and simple
  • The application can be tested on multiple platforms by design.
    When testing Java applications that need to be verified in multiple
    environments, it is critical to avoid a dependency on a specific
    scripting language used for the tests
  • We avoided information duplication. The test code is linked to
    the test specification and the test report


  • A good test harness is a complex tool that takes time to
    study, and the infrastructure may need to be adjusted for it.
  • The "golden file" approach is inherently limited and cannot
    be enough to cover everything
  • When the output is unstable, this approach may not work or may
    require sophisticated comparators. The most obvious example is in
    processing

As usual, almost all of the pros have drawbacks in some situations;
one size never fits all.
