
Objects and Strings and the Wrangling Thereof (Part 2)

Posted by ljnelson on August 31, 2007 at 8:07 AM PDT

In the prior entry, we learned that Java ships with several tools to standardize the conversion between Strings and Objects. We covered the text conversion methods of java.beans.PropertyEditor. We will most definitely come back to PropertyEditors, because they're about a heck of a lot more than just converting text, but let's take a detour into the dank recesses of the java.text package.

Do you smell that? That's all the dust left over from when this package was deposited by IBM and Taligent into the official Java runtime platform 'round about 1997. java.text was put in, as I understand it, primarily for internationalization (I18N) purposes, and one of the bits that landed there was the java.text.Format class, which, although described as a way to format Locale-specific information, has nothing whatsoever to do with Locales or internationalization at all. Go figure. It does, however, have lots to do with converting Objects into Strings and vice versa.

Format

The Format class is responsible for formatting Objects into Strings, and taking Strings and parsing them back into Objects. Unlike PropertyEditor, whose API seems to have more plumbing and rebar and wiring than my house, a Format is a relatively simple thing. Want to parse a String into an Object? Call the stateless parseObject(String) method. Want to do the reverse? Call the format(Object) method. Easy and simple. And stateless.
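To make the two convenience methods concrete, here is a minimal sketch of a round trip through a Format that already ships with the JDK. The class name FormatRoundTrip and its helper methods are made up for illustration, and Locale.US is pinned only so the grouping separator is predictable.

```java
import java.text.Format;
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

final class FormatRoundTrip {

  // Object -> String via the convenience method format(Object).
  static String toText(final Format format, final Object value) {
    return format.format(value);
  }

  // String -> Object via the convenience method parseObject(String).
  static Object toObject(final Format format, final String text) {
    try {
      return format.parseObject(text);
    } catch (final ParseException e) {
      throw new IllegalArgumentException("Could not parse: " + text, e);
    }
  }

  public static void main(final String[] args) {
    final Format format = NumberFormat.getIntegerInstance(Locale.US);
    final String text = toText(format, Integer.valueOf(1234));
    final Object value = toObject(format, text);
    // Prints the text form, then the parsed value and its runtime type.
    System.out.println(text + " -> " + value + " (" + value.getClass().getSimpleName() + ")");
  }
}
```

Note that what comes back from parseObject(String) is whatever type the Format decides on (here a Long, not the Integer that went in), which is one reason a purpose-built Format like the one below can be worth the trouble.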

java.text.Format also supports lots of substring parsing and formatting, via the ParsePosition and FieldPosition classes. In general, I've found the use cases for these to be pretty thin, except where Dates are concerned, and for that someone has already written the SimpleDateFormat class, so we'll pretty much deal with them only where necessary.
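For the curious, substring parsing with a ParsePosition looks roughly like the sketch below. It uses the JDK's own NumberFormat rather than anything from this entry; the class ParsePositionDemo and the method endOfNumber are hypothetical names chosen for the example.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

final class ParsePositionDemo {

  // Tries to parse an integer embedded in text, starting at the given index.
  // Returns the index just past the parsed number, or -1 if nothing parsed.
  static int endOfNumber(final String text, final int start) {
    final ParsePosition position = new ParsePosition(start);
    final Object parsed =
        NumberFormat.getIntegerInstance(Locale.US).parseObject(text, position);
    return parsed == null ? -1 : position.getIndex();
  }

  public static void main(final String[] args) {
    final String text = "width: 42px";
    System.out.println(endOfNumber(text, 7)); // the digits start at index 7
    System.out.println(endOfNumber(text, 0)); // 'w' is not a number
  }
}
```

The caller owns the ParsePosition: it says where to start, and the Format reports back through it where parsing stopped or where it went wrong.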

Let's pick up on our previous example and write a URIFormat class:

import java.net.URI;
import java.text.FieldPosition;
import java.text.Format;

public class URIFormat extends Format {
  public URIFormat() {
    super();
  }

  // 1
  public StringBuffer format(final Object object, final StringBuffer buffer, final FieldPosition fieldPosition) {
    // 2
    if (buffer == null) {
      throw new NullPointerException("buffer == null");
    }

    // 3
    if (fieldPosition == null) {
      throw new NullPointerException("fieldPosition == null");
    }

    // 4
    if (object == null) {
      return buffer;
    }

    // 5
    if (!(object instanceof URI)) {
      throw new IllegalArgumentException(object + " is not a URI instance");
    }

    // 6
    return buffer.append(object.toString());
  }

  // parseObject(String, ParsePosition), which Format also requires, is shown below.
}

This method is relatively simple to implement properly and fully defensively. Here are the details (once again, numbers below correspond to numbers in the code comments above):

  1. java.text.Format's format(Object) method calls through to this one, which subclasses must implement. The general idea here is to sanity check our arguments, then investigate the FieldPosition to see what portion of the incoming object we should format, attempt the convert-it-to-a-String operation, and return the result. For our purposes we can skip the FieldPosition argument entirely (it's conceivable that you might want to implement a URIFormat that formats, for example, only the "authority" field of a given URI but we're not going to do that here).
  2. Note that the contract of this method requires us to throw a NullPointerException if buffer is null. I don't like this, but that's our contract. I'd prefer to simply return null, since it's obvious that what you put into this method comes out the back of it, but I didn't write the contract. Why not just let the NullPointerException happen naturally instead of proactively throwing it? Because there is no message that gets output by automatic NullPointerExceptions at all, and that is just evil.
  3. Continuing through the forest of NullPointerExceptions, we are required to throw one as well if the fieldPosition is null, even though in the vast majority of cases you won't ever use this parameter.
  4. Now, what does the contract say about a null Object parameter? Fortunately, and correctly, it says nothing. That could be valid input for you. Since the contract is open here, I choose to practice leave-no-trace programming, and I simply return the buffer that was handed to me. That is, hey, I chose to "format" the null URI by doing exactly nothing. You could choose to do something different here. (If you choose to follow my example here, you'll reap the benefits later on, because we will be able to pass this value straight into a PropertyEditor implementation and honor its contract as a convenient side effect.)
  5. Next, we need to handle the case where invalid input is handed to us (null is not invalid input; there are many cases where you want to handle the formatting of null values; that's why we dealt with that as a separate case). So here the contract tells us, effectively, to reject bad input with an IllegalArgumentException. But, of course, it's up to you to decide what constitutes bad input. I've chosen to err on the conservative side and say, look, this Format I'm writing works only on URI instances. Anything else that it's asked to format won't work. You could also simply do nothing and return the incoming buffer as is.
  6. Now all we have to do is (again) use the well-documented and consistent toString() method from java.net.URI and append it to the buffer parameter. Since append() returns the resulting StringBuffer, we simply return that.
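The three interesting outcomes (a URI, a null, and something that is neither) can be exercised with a sketch like the following. UriFormatSketch is a condensed stand-in for the URIFormat above, not a replacement for it; its parseObject is deliberately stubbed out, and the rejects helper is a made-up name for the demo.

```java
import java.net.URI;
import java.text.FieldPosition;
import java.text.Format;
import java.text.ParsePosition;

final class UriFormatSketch extends Format {
  private static final long serialVersionUID = 1L;

  @Override
  public StringBuffer format(final Object object, final StringBuffer buffer, final FieldPosition fieldPosition) {
    if (buffer == null) {
      throw new NullPointerException("buffer == null");
    }
    if (fieldPosition == null) {
      throw new NullPointerException("fieldPosition == null");
    }
    if (object == null) {
      return buffer; // leave no trace
    }
    if (!(object instanceof URI)) {
      throw new IllegalArgumentException(object + " is not a URI instance");
    }
    return buffer.append(object.toString());
  }

  @Override
  public Object parseObject(final String text, final ParsePosition parsePosition) {
    // Parsing is out of scope for this sketch.
    throw new UnsupportedOperationException("parsing is not part of this sketch");
  }

  // Reports whether format(Object) refuses the given value.
  static boolean rejects(final Object value) {
    try {
      new UriFormatSketch().format(value);
      return false;
    } catch (final IllegalArgumentException expected) {
      return true;
    }
  }

  public static void main(final String[] args) {
    final Format format = new UriFormatSketch();
    System.out.println(format.format(URI.create("http://example.com/a")));
    System.out.println("[" + format.format(null) + "]"); // null formats to nothing
    System.out.println(rejects("not a URI"));
  }
}
```

Because format(Object) hands our method a fresh StringBuffer, returning the buffer untouched for null means the caller simply sees an empty String.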

Again, the thought process behind this really simple method is what's important. You want to think hard about what garbage is going to be coming in and what the caller will expect out.

The parseObject method is basically just the reverse, but I'll have a little more to say about it in comparison to the setAsText() method of PropertyEditor. It also has some of the weirdest error handling conventions I've ever seen. Basically, if there is an error, you return null, and use the incoming ParsePosition object to store where the error occurred during parsing. And in the success case, you are obligated to update the supplied parsePosition's regular index property to just past where you finished parsing.

So what if you have to return null, so you can, for example, capture and parse user input that should be converted to the null reference? It would appear, although the contract is not at all specific, that you return null, but make sure that the ParsePosition's errorIndex property is set to -1 to indicate that there is no error. Additionally, there's some intent we can read from the source code.

Diving into the Format source code, we see that the convenience method, parseObject(String), actually evaluates the outcome of this method by more or less ignoring the contract entirely! It hands this method a fresh ParsePosition whose index is 0 and then checks whether that index is still 0 afterwards. If it is, then, it would appear, an error has occurred (since hey, parsing couldn't even take place), and, it turns out, regardless of this method's return value, a ParseException is thrown. So much for contracts, at least where Taligent was concerned.
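If you'd rather see that behavior than take my word for it, a small experiment along these lines should do. StubFormat is a purely hypothetical Format that returns a fixed result and either moves the index or leaves it alone; nothing like it exists in the JDK.

```java
import java.text.FieldPosition;
import java.text.Format;
import java.text.ParseException;
import java.text.ParsePosition;

final class IndexCheckDemo {

  // Hypothetical stub: returns a fixed result and optionally advances the index.
  static final class StubFormat extends Format {
    private static final long serialVersionUID = 1L;
    private final String result;
    private final boolean advance;

    StubFormat(final String result, final boolean advance) {
      this.result = result;
      this.advance = advance;
    }

    @Override
    public StringBuffer format(final Object object, final StringBuffer buffer, final FieldPosition fieldPosition) {
      return buffer.append(object);
    }

    @Override
    public Object parseObject(final String text, final ParsePosition parsePosition) {
      if (advance) {
        parsePosition.setIndex(text.length());
      }
      return result;
    }
  }

  // Reports whether the convenience method parseObject(String) throws.
  static boolean throwsParseException(final Format format, final String text) {
    try {
      format.parseObject(text);
      return false;
    } catch (final ParseException e) {
      return true;
    }
  }

  public static void main(final String[] args) {
    // A perfectly good return value, but the index never moved: exception.
    System.out.println(throwsParseException(new StubFormat("value", false), "abc"));
    // A null return value, but the index moved: no exception.
    System.out.println(throwsParseException(new StubFormat(null, true), "abc"));
  }
}
```

In other words, the return value never enters into it; only the index does.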

OK, then; here's how we do it:

// 1
public Object parseObject(final String text, final ParsePosition parsePosition) {

  // 2
  if (parsePosition == null) {
    throw new NullPointerException("parsePosition == null");
  }

  // 3
  if (parsePosition.getErrorIndex() != -1) {
    return null;
  }

  // 4
  final int startIndex = parsePosition.getIndex();
  if (startIndex < 0) {
    parsePosition.setErrorIndex(startIndex);
    return null;
  }

  // 5
  if (text == null) {
    // This is not an error; this is a valid input value.  We want to return null.
    // So we take liberties with this soft and squishy area of the contract,
    // and we set the parsePosition's beginning to a value that is different
    // from what it initially was.  No one is going to try to parse a null
    // text twice, anyhow, and this actually prevents it.
    parsePosition.setIndex(-1);
    parsePosition.setErrorIndex(-1);
    return null;
  }

  // 6
  final int textLength = text.length();
  if (textLength <= 0 || text.trim().length() <= 0) {
    parsePosition.setIndex(-1);
    parsePosition.setErrorIndex(-1);
    return null;
  }

  // 7
  if (startIndex >= textLength) {
    parsePosition.setErrorIndex(startIndex);
    return null;
  }

  // 8
  try {
    final URI uri = new URI(text.substring(startIndex));
    parsePosition.setIndex(textLength);
    parsePosition.setErrorIndex(-1);
    return uri;
  } catch (final URISyntaxException kaboom) {
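    // The exception's index is relative to the substring (and may be -1), so offset it.
    parsePosition.setErrorIndex(startIndex + Math.max(0, kaboom.getIndex()));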
    return null;
  }

}

Once again, the numbers below correspond to the numbers in the code comments above:

  1. This method is the one that all work is delegated to from the simpler parseObject(String) version. You have to implement it.
  2. The contract again, annoyingly, forces us to throw a NullPointerException if there's no ParsePosition.
  3. Here I check to see if the error index on the supplied ParsePosition has already been set. If it has, then it's not clear to me what I should do (the contract is silent; thanks, IBM!). I err on the side of being conservative and, since the parameter indicates that there's an error in play already, I return null per that part of the contract.
  4. I can't parse any text starting at a position less than zero, so we set the errorIndex of the parsePosition here and return null as instructed.
  5. If the supplied text is null, then I interpret this to mean that the user would like a null URI in return—i.e. he's trying, with text as his only weapon, to clear an object reference. But the contract to Format tells me I can't return null except in error conditions! But then the contract also tells me that ParsePosition has an errorIndex property to indicate where an error occurred. And the Format source code tells me that really the indicator of an error is whether the index has changed value or not. If it has not changed value, then some of the Format class' innards assume that since parsing didn't happen, an error must have occurred. (Pause for Buddhist-like contemplation of this leap of logic and faith.) So, we ensure that the parsePosition's index is set to a value that it could not have been set to by this point, but we also set the error index to -1 to indicate that the return value is to be considered valid. Whew.
  6. Zero-length Strings are interpreted as instructions to return a non-error-case null URI. So we jump through the hoops here to do that.
  7. If we were told to parse starting after the supplied String, then do what we need to do to indicate pilot error.
  8. Finally, parse the supplied String, starting where we're supposed to, and, if we succeed, remember to move the parsePosition's index! If we fail, then fortunately URISyntaxException actually has an index property (glory be!), so we set that and return null.
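Here is how those rules play out from the caller's side, as a rough sketch. UriParseSketch condenses the checks above (it folds the null and blank cases together and skips the pre-existing error check), its format method is a throwaway, and the describe helper is a made-up name. It also assumes the error index should be offset by the start index, since the URI is parsed from a substring.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.text.FieldPosition;
import java.text.Format;
import java.text.ParseException;
import java.text.ParsePosition;

final class UriParseSketch extends Format {
  private static final long serialVersionUID = 1L;

  @Override
  public StringBuffer format(final Object object, final StringBuffer buffer, final FieldPosition fieldPosition) {
    // Formatting is not the point here.
    return object == null ? buffer : buffer.append(object);
  }

  @Override
  public Object parseObject(final String text, final ParsePosition parsePosition) {
    final int startIndex = parsePosition.getIndex();
    if (text == null || text.trim().isEmpty()) {
      // A "valid" null: move the index off its starting value, flag no error.
      parsePosition.setIndex(-1);
      parsePosition.setErrorIndex(-1);
      return null;
    }
    if (startIndex < 0 || startIndex >= text.length()) {
      parsePosition.setErrorIndex(startIndex);
      return null;
    }
    try {
      final URI uri = new URI(text.substring(startIndex));
      parsePosition.setIndex(text.length());
      return uri;
    } catch (final URISyntaxException kaboom) {
      // Assumption: report the error relative to the whole text.
      parsePosition.setErrorIndex(startIndex + Math.max(0, kaboom.getIndex()));
      return null;
    }
  }

  // Summarizes what the convenience method parseObject(String) does with text.
  static String describe(final String text) {
    try {
      final Object parsed = new UriParseSketch().parseObject(text);
      return parsed == null ? "null, no error" : "parsed " + parsed;
    } catch (final ParseException e) {
      return "ParseException";
    }
  }

  public static void main(final String[] args) {
    System.out.println(describe("http://example.com/"));
    System.out.println(describe(""));        // blank text clears the reference
    System.out.println(describe(null));      // so does null text
    System.out.println(describe("bad uri")); // a space is not legal in a URI
  }
}
```

The blank and null cases get through parseObject(String) without an exception only because the index was nudged to -1; leave it at 0 and the caller would see a ParseException instead of a null.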

I hope that, if nothing else, this little exercise convinces you not to replicate any object conversion logic in a java.text.Format. They are gross, disgusting, and, despite the general-purpose aura that hangs about them, more or less unsuitable for use outside the java.text package.

But.

They can be quite useful for bridging gaps between PropertyEditors and JFormattedTextFields, as we'll see in the next entry. The gist is that we'll create a Format that delegates to a PropertyEditor implementation. From there it's a drop-in to get your PropertyEditor to be used by JFormattedTextFields.

Thanks for reading.
