


Kohsuke Kawaguchi

Kohsuke Kawaguchi's Blog

Parsing command line options in JDK 5.0 style: args4j

Posted by kohsuke on May 11, 2005 at 12:20 AM | Comments (25)

Parsing command line options in your program has always been boring work: you loop through a String[] and write a whole bunch of arg.equals("-foo") and arg.equals("-bar") checks. There are some libraries that attempt to solve this, such as Apache Commons CLI. I tried many of them, but I didn't quite like any. I felt that I could write a better one by taking advantage of JDK 5.0 features. That eventually became args4j.

With args4j, you first write a Java class that represents all the options that you are going to define. I call this class an option bean (although it doesn't have to be a Java Bean), and it can look like this:

public class MyOptions {
    
    private boolean recursive;

    private File out;

    private String str = "(default value)";

    private int num;
}

You then put args4j annotations on this class. The annotations tell args4j which option maps to which field. You can also specify the human-readable description for the option, which args4j will use to generate the usage screen.

I also added another field that receives arguments (the inputs on the command line other than options):

public class MyOptions {
    
    @Option(name="-r",usage="recursively run something")
    private boolean recursive;

    @Option(name="-o",usage="output to this file")
    private File out;

    @Option(name="-str")        // no usage
    private String str = "(default value)";

    @Option(name="-n",usage="usage can have new lines in it\n and also it can be long")
    private int num;

    // receives other command line parameters than options
    @Argument
    private List<String> arguments = new ArrayList<String>();
}

In the above example I annotated fields, but I can also annotate a setter method with the same annotation. That will cause args4j to invoke the setter instead of accessing the field directly. That allows you to perform additional semantic checks on the parameter, or define a set of options that interact with each other in some application-specific fashion.

Given all those annotations, I can parse arguments just like this:

public static void main(String[] args) throws IOException, CmdLineException {
    MyOptions bean = new MyOptions();
    CmdLineParser parser = new CmdLineParser(bean);
    parser.parseArgument(args);
    // ... the bean is now populated and ready to use ...
}

This will parse the string array as parameters, and args4j will set the values to fields or invoke setters appropriately.
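To make the mechanism concrete, here is a minimal sketch of how an annotation-driven parser could set fields and invoke setters through reflection. It is not args4j's implementation: the `Opt` annotation, `SketchOptions`, and `MiniParser` are hypothetical names invented for this illustration, and the sketch only understands boolean flag fields and setters that take a single int.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical stand-in for an option annotation; NOT the real args4j @Option.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.FIELD, ElementType.METHOD})
@interface Opt {
    String name();
}

// Hypothetical option bean: one annotated field and one annotated setter.
class SketchOptions {
    @Opt(name = "-r")
    boolean recursive;

    int num;

    @Opt(name = "-n")
    void setNum(int n) {
        // A setter can reject values that a plain field would silently accept.
        if (n < 0) throw new IllegalArgumentException("-n must not be negative");
        this.num = n;
    }
}

// Hypothetical mini parser: boolean flag fields and int setters only.
class MiniParser {
    static void parse(Object bean, String[] args) {
        try {
            for (int i = 0; i < args.length; i++) {
                String name = args[i];
                Field flag = findFlag(bean, name);
                Method setter = findSetter(bean, name);
                if (flag != null) {
                    flag.setAccessible(true);
                    flag.setBoolean(bean, true);      // presence of the flag means true
                } else if (setter != null) {
                    if (i + 1 >= args.length)
                        throw new IllegalArgumentException(name + " needs an operand");
                    setter.setAccessible(true);
                    setter.invoke(bean, Integer.parseInt(args[++i]));
                } else {
                    throw new IllegalArgumentException("unknown option: " + name);
                }
            }
        } catch (InvocationTargetException e) {
            // The setter itself rejected the value; surface its message.
            throw new IllegalArgumentException(e.getCause().getMessage(), e.getCause());
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    private static Field findFlag(Object bean, String name) {
        for (Field f : bean.getClass().getDeclaredFields()) {
            Opt o = f.getAnnotation(Opt.class);
            if (o != null && o.name().equals(name) && f.getType() == boolean.class) return f;
        }
        return null;
    }

    private static Method findSetter(Object bean, String name) {
        for (Method m : bean.getClass().getDeclaredMethods()) {
            Opt o = m.getAnnotation(Opt.class);
            if (o != null && o.name().equals(name)) return m;
        }
        return null;
    }
}
```

Calling `MiniParser.parse(new SketchOptions(), new String[] {"-r", "-n", "5"})` sets the flag directly on the field and routes the number through the setter, so the range check runs before the value is stored.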

What happens if the user types a wrong option name? Just surround the parseArgument method with a try-catch block like this:

try {
    parser.parseArgument(args);
} catch( CmdLineException e ) {
    System.err.println(e.getMessage());
    System.err.println("java -jar myprogram.jar [options...] arguments...");
    parser.printUsage(System.err);
    return;
}

CmdLineException contains a human-readable error message that you can just print out. Then you can also use args4j to generate a list of options. With args4j, you don't need to maintain a separate list of options just for showing the usage screen.
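The idea that the usage screen falls out of the same annotations can be sketched as follows. This is only an illustration of the technique, not how args4j formats its output: the `Flag` annotation, `UsageDemoOptions`, and `UsagePrinter` are hypothetical, and the "name : usage" layout is an arbitrary choice.

```java
import java.io.File;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical annotation carrying an option name and an optional description.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Flag {
    String name();
    String usage() default "";
}

// Hypothetical option bean mirroring the fields used in this post.
class UsageDemoOptions {
    @Flag(name = "-r", usage = "recursively run something")
    boolean recursive;

    @Flag(name = "-o", usage = "output to this file")
    File out;

    @Flag(name = "-str") // no usage text
    String str;
}

class UsagePrinter {
    // Builds one line per annotated field, sorted by option name.
    static String usageOf(Class<?> beanType) {
        Map<String, String> byName = new TreeMap<String, String>();
        for (Field f : beanType.getDeclaredFields()) {
            Flag flag = f.getAnnotation(Flag.class);
            if (flag != null) byName.put(flag.name(), flag.usage());
        }
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : byName.entrySet()) {
            sb.append(e.getKey());
            if (!e.getValue().isEmpty()) sb.append(" : ").append(e.getValue());
            sb.append('\n');
        }
        return sb.toString();
    }
}
```

Because the option list is read from the bean class itself, adding a new annotated field is enough to make it show up in the usage text.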

The benefit of using annotations is that you can generate the list of options not only at runtime but also at development time. args4j comes with a tool that lets you generate an HTML/XML list of all options. This is ideal for keeping the documentation of your tool in sync with the code. This can be done by running the following command:

$ java -jar args4j-tools.jar path/to/MyOptions.java

Both the usage screen generation and the XML/HTML generation support internationalization adequately.
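One common way to localize usage text is to treat each usage string as a key into a standard java.util.ResourceBundle and fall back to the literal text when no translation exists. The sketch below shows that pattern only; whether args4j resolves keys exactly this way is an assumption, and `UsageLocalizer` and `GermanUsage` are hypothetical names.

```java
import java.util.ListResourceBundle;
import java.util.MissingResourceException;
import java.util.ResourceBundle;

class UsageLocalizer {
    // Looks the usage text up as a bundle key; returns the text itself if there is no entry.
    static String localize(String usage, ResourceBundle bundle) {
        if (bundle == null) return usage;
        try {
            return bundle.getString(usage);
        } catch (MissingResourceException e) {
            return usage;
        }
    }
}

// Hypothetical in-memory bundle; a real application would more likely ship .properties files.
class GermanUsage extends ListResourceBundle {
    @Override
    protected Object[][] getContents() {
        return new Object[][] {
            {"output to this file", "Ausgabe in diese Datei"},
        };
    }
}
```

With this in place, `UsageLocalizer.localize("output to this file", new GermanUsage())` yields the translated text, while an untranslated string passes through unchanged.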

If you are interested in args4j, visit the project home page and play with it. Let me know what you think of it. In the future, I'm thinking about using these annotations to parse an Ant task into the same option bean, so that you can have a single piece of code for both the CLI and the Ant task interface.


Comments
Comments are listed in date ascending order (oldest first)

  • Kohsuke-,
    This rocks!

    Hristo

    Posted by: hr_stoyanov on May 11, 2005 at 11:03 AM

  • This project is quite useful, and also demonstrates effective use of annotations. I hope, for ease of use, the CmdLineException is an unchecked exception instead of being a checked one.

    Posted by: inder on May 11, 2005 at 11:42 AM

  • I think CmdLineException should be a checked exception, because it can happen even if your application is written correctly, and you can usually do a meaningful recovery (like printing the usage screen.)

    Posted by: kohsuke on May 11, 2005 at 11:50 AM

  • Then the parseArgument method should return some sort of status, instead of forcing every user to block off one line of code in a try/catch statement.

    Or the printing of usage screen should be handled somehow through the CmdLineException.

    That's one thing I've hated about the design of several java-based command line parsers.

    Take a look at the Python argument parsing package. That's how you do it! :)

    Posted by: jaylogan on May 11, 2005 at 01:43 PM

  • Correction...
    Or the printing of usage screen should be handled somehow through the CmdLineParser class.

    Posted by: jaylogan on May 11, 2005 at 01:46 PM

  • So you are saying this should be


    try {
    parser.parseArgument(args);
    } catch( CmdLineException e ) {
    System.err.println(e.getMessage());
    System.err.println("java -jar myprogram.jar [options...] arguments...");
    parser.printUsage(System.err);
    return;
    }


    modified to this?


    if(!parser.parseArgument(args)) {
    System.err.println(e.getMessage());
    System.err.println("java -jar myprogram.jar [options...] arguments...");
    parser.printUsage(System.err);
    return;
    }


    ... and where do I get the error message? Or are you saying that we should return CmdLineException from the parseArgument method? By the time you are assigning the return value to a variable and so on, there's really no saving in terms of typing. So I'm not sure what is it that you don't like.
    Besides, this is so C.


    I'm curious about the Python package. Could you show me a pointer so that I can study it?

    Posted by: kohsuke on May 11, 2005 at 01:52 PM

  • The printUsage method is on the CmdLineParser class already. Am I missing your point?

    Posted by: kohsuke on May 11, 2005 at 01:53 PM

  • Do a google search for the following:

    optparse python

    The non-catch version allows for more flexible code.
    I think Effective Java mentions something about this.


    Putting something in an if statement is not necessarily "C"...just as forcing the user to wrap every single line of code in a try/catch is not necessarily Java.


    Think of the method Set.add(Object obj) What if every time you added something you had to put it in a try statement to see if the operation was successful?


    And since the CmdLineParser deliberately has a method to print usage, it stands to reason that you've deemed it normal operation that this could go wrong (i.e. non-exceptional).


    Just my opinion...but taking it out of try/catch makes for much more pleasant and usable code.

    Posted by: jaylogan on May 11, 2005 at 02:14 PM

  • It's not a matter of saving typing. It's a matter of flexibility in code writing.

    Posted by: jaylogan on May 11, 2005 at 02:16 PM


  • Thanks for the pointer to the optparse package. I'll look into it.


    I think the difference between Set.add and CmdLineParser.parseArgument is that adding a value to a set that already contains a value is not an error for many applications. I often write the code that ignores the return value myself. So it makes sense not to throw an exception. In the case of the parseArgument method, however, failing to parse the input almost always requires an application to abort the normal processing. Assuming that this is correct, I believe it makes sense to throw an exception.


    I think another reason for this being an exception is that often you want to define additional checks for the option operand in your setter method, and when you find a problem, it's natural to signal that by throwing an exception.

    That said, I guess whether something should be checked or not, whether to throw an exception or not, etc are a matter of taste more or less. It's interesting to hear other perspectives.


    I agree that writing a catch block only to say "// never happen" is a pain (which happens a lot in JAXP, for example), but I don't think this applies to args4j. Finally, I'm curious, what is the flexibility of code writing that you get by returning a CmdLineException (or equivalent) as a return value, instead of throwing it as an exception?

    Posted by: kohsuke on May 11, 2005 at 02:36 PM

  • And please don't take what I'm saying in a bad way.
    I think your project is awesome. I'm just pointing out something that could improve the interface more.

    Posted by: jaylogan on May 11, 2005 at 02:38 PM

  • Yep, I appreciate your feedback.

    Posted by: kohsuke on May 11, 2005 at 02:39 PM

  • // An example...I have 2 different interfaces to a command line application.
    // One legacy interface that we are forced to support with different parameter types
    // Another interface that is more flexible.
    // I have 2 ArgParsers to support the 2 different interfaces.

    // Try #1 w/ catch...I'm sure I messed something up because I went too fast, but
    // I'll try anyway.
    ArgParser p = ....;
    ArgParser p2 = ....; // another type of interface.
    boolean weCouldParse = false;
    boolean defaultPossible;

    try
    {
        p.parseArgs();
        weCouldParse = true;
    }
    catch (CantParseException ex)
    {
        // We might be able to recover still.
        // Try another route.
        try
        {
            p2.parseArgs();
            weCouldParse = true;
        }
        catch (CantParseException ex2)
        {
            // We definitely can't parse this.
            // Provide a default.

            if (defaultPossible)
            {
                // do the default.
            }
            else
            {
                // exit the application with a message.
                // but do some cleanup first.
            }
        }
    }
    finally
    {
        // do some cleanup...
        // but not the cleanup that would occur if the default was possible.

        if (!defaultPossible)
        {
            // do cleanup
        }

    }

    // We're out of the big catch...how do we know what happened?
    if (weCouldParse)
    {
        // yay
    }
    .
    .
    .

    // Try #2 w/o catch...
    // pretend parseArgs() returned true if it worked, false if it didn't

    ArgParser p = ....;
    ArgParser p2 = ....; // another type of interface.
    boolean defaultPossible;

    if (p.parseArgs() || p2.parseArgs())
    {
        // yay
    }
    else
    {
        if (defaultPossible)
        {
            // do the default.
        }
        else
        {
            // exit the application but do some clean up first.
        }
    }


    Posted by: jaylogan on May 11, 2005 at 03:59 PM

  • kohsuke asked:

    ... and where do I get the error message?


    Well, the CmdLineParser is the thing that failed, so why can't it store the error message for retrieval later in the program?

    Posted by: jaylogan on May 11, 2005 at 08:25 PM
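The suggestion above, a parser that reports failure as a status and keeps the error message for later retrieval, can be layered on top of an exception-throwing parser with a small adapter. The following is a sketch under assumptions: `ThrowingParser`, `ParseFailure`, and `StatusParser` are hypothetical types invented for this illustration and are not part of args4j.

```java
// Hypothetical checked exception, playing the role of a parse error.
class ParseFailure extends Exception {
    ParseFailure(String message) {
        super(message);
    }
}

// Hypothetical exception-throwing parser interface.
interface ThrowingParser {
    void parse(String[] args) throws ParseFailure;
}

// Adapter: turns the exception into a boolean status and remembers the message.
class StatusParser {
    private final ThrowingParser delegate;
    private String lastError;

    StatusParser(ThrowingParser delegate) {
        this.delegate = delegate;
    }

    // Returns true on success; on failure, stores the message and returns false.
    boolean tryParse(String[] args) {
        try {
            delegate.parse(args);
            lastError = null;
            return true;
        } catch (ParseFailure e) {
            lastError = e.getMessage();
            return false;
        }
    }

    // Message from the most recent failed parse, or null if the last parse succeeded.
    String lastError() {
        return lastError;
    }
}
```

With two such adapters, the fallback chain from the earlier comment reads `if (p.tryParse(args) || p2.tryParse(args))`, and the failure text stays available through `lastError()`.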

  • I see that the program has "graduated". It's unfortunate that you kept the requirement that the programmer must deal with exceptions, but at least you've made it so that we can adapt the software and make it more useful.

    Thank you!

    Posted by: jaylogan on May 28, 2005 at 09:25 PM

  • Hi Kohsuke,

    Just came across your post and was wondering if you'd looked at my object-oriented command-line parsing library. It doesn't use the JDK 1.5 annotations, but I think from a usability point of view, it seems like there's less boilerplate code.


    One of the things I'm not quite sure how you would address with yours is localization of the help text messages. Maybe there's a way to do this with your approach, but I haven't yet looked at options in depth.


    Something I was really trying to address was to get rid of having gargantuan mainlines that checked options before doing something. Also, managing complicated option interactions can be quite a pain in large systems, so that's why I tried to implement this in the parser.


    If you didn't look at te-common, I'd be very interested in what you think. Basically, it just shows that there's about a zillion ways to solve the same problem. :)

    Posted by: atownley on June 20, 2005 at 04:29 AM

  • I think using annotations would achieve less boilerplate code (such as not needing to add options manually to a parser, or not needing to use any special wrapper class for each option), but I guess it's a subjective issue, because you seem to think differently.


    args4j does handle localization. You can pass in the resource bundle to the method that produces a help screen. Right now the only documentation about this is the javadoc.


    I never thought about dependency between options (probably because I personally never designed such options.) Annotations can't point to fields or methods, so it would be difficult to do nicely in args4j.


    Another thing that I'm trying to do with args4j is the unification of ant task and CLI, so that you only need to write one entry-point. For me this has been a bigger headache.


    Anyway, thank you for the pointer.

    Posted by: kohsuke on June 20, 2005 at 08:04 AM

  • Just tried out args4j. Looks nice. I wonder if it also would allow for specifying shortcuts for options. Example:

    My application accepts a command line option called "-check".
    It should be possible to specify "-c", "-ch", "-che", "-chec", "-check" and the parser should accept all the given variants (as long as there is no other option specified which starts with the characters "-c").

    Thanks a lot for the nice code.

    Posted by: adalbert on May 09, 2006 at 10:18 AM
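The shortcut rule described in this comment (accept any prefix as long as it identifies exactly one option) can be sketched independently of any parser. `PrefixResolver` is a hypothetical helper for illustration; nothing here is taken from args4j.

```java
import java.util.List;

class PrefixResolver {
    // Returns the option identified by 'typed', or null if there is no match
    // or if the prefix is ambiguous. An exact match always wins.
    static String resolve(String typed, List<String> options) {
        if (options.contains(typed)) return typed;
        String found = null;
        for (String option : options) {
            if (option.startsWith(typed)) {
                if (found != null) return null; // two candidates: ambiguous
                found = option;
            }
        }
        return found;
    }
}
```

Given the options "-check" and "-out", every prefix from "-c" to "-check" resolves to "-check"; once a second option such as "-copy" is added, "-c" becomes ambiguous and only "-ch" or longer is accepted.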

  • Hello, I wrote a similar class, but it allows method overloading and strict type signatures for the parameters. The big work is done by the regular expression engine already present in Java.

    I also used Java Reflection and the Java 5.0 features, but I preferred to avoid meta-code like annotations, for two reasons: one has to know the names of the meta-variables in annotations, and even if the method is overloaded, it must still generate only ONE entry in the help. This condition can't be enforced by the syntax of annotations.

    I would be glad if you give it a look or a try, and send me your opinion.

    clajr.sourceforge.net

    Posted by: lindoro on May 28, 2006 at 03:06 AM

  • Hi Kohsuke,
    I just stumbled across this while starting a new project requiring some complex command-line arguments.
    This is great!
    I'm going to make use of it right away.

    Mike

    Posted by: mikesullivan on November 04, 2006 at 06:40 AM

  • On your main page you state
    "Fully supports localization"
    Are there any special methods available for localization, or does args4j just generically support localization?

    BTW, great library.

    Posted by: javamann on July 09, 2007 at 10:00 AM

  • The localization support refers to the ability to l10n usage messages by using the resource bundle.

    Posted by: kohsuke on July 09, 2007 at 12:55 PM

  • Hi again, thanks for the quick reply. Could you please post some examples on how to use localization because I have not been able to. It has been quite a while since I have played around with localization.

    Thanks.

    Posted by: javamann on July 10, 2007 at 12:19 AM

  • Sorry, I RTFM and found out how to do it.

    Thanks and this is a really cool product.

    -Pete

    Posted by: javamann on July 10, 2007 at 12:39 AM

  • Me again,
    Is there a way to print the word
    'Usage:'
    when you use the parser.printUsage method?
    Better yet, using the Resource Bundle?

    Thanks again

    -Pete

    Posted by: javamann on July 16, 2007 at 11:15 PM




