August 2007 Archives
Objects and Strings and the Wrangling Thereof (Part 2)
Posted by ljnelson on August 31, 2007 at 08:07 AM | Permalink
| Comments (0)
In the prior entry, we learned that Java ships with several tools to standardize the conversion between Strings and Objects. We covered the text conversion methods of java.beans.PropertyEditor. We will most definitely come back to PropertyEditors, because they're about a heck of a lot more than just converting text, but let's take a detour into the dank recesses of the java.text package.
Do you smell that? That's all the dust left over from when this package was deposited by IBM and Taligent into the official Java runtime platform 'round about 1997. java.text was put in, as I understand it, primarily for internationalization (I18N) purposes, and one of the bits that landed there was the java.text.Format class, which, although described as a way to format Locale-specific information, has nothing whatsoever to do with Locales or internationalization at all. Go figure. It does, however, have lots to do with converting Objects into Strings and vice versa.
Format
The Format class is responsible for formatting Objects into Strings, and taking Strings and parsing them back into Objects. Unlike PropertyEditor, whose API seems to have more plumbing and rebar and wiring than my house, a Format is a relatively simple thing. Want to parse a String into an Object? Call the stateless parseObject(String) method. Want to do the reverse? Call the format(Object) method. Easy and simple. And stateless.
java.text.Format also supports lots of substring parsing and formatting, via the ParsePosition and FieldPosition classes. In general, I've found the use cases for these to be pretty thin, except where Dates are concerned, and for that someone has already written the SimpleDateFormat class, so we'll pretty much deal with them only where necessary.
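To make the ParsePosition idea concrete, here is a minimal sketch of substring parsing with the JDK's own SimpleDateFormat. The class name ParsePositionDemo, the sample text, and the date pattern are assumptions chosen for illustration; only the java.text calls come from the platform.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

final class ParsePositionDemo {

    /** Parses a yyyy-MM-dd date that starts at the position's current index. */
    static Date parseDateAt(final String text, final ParsePosition position) {
        final SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);
        format.setLenient(false);
        // On success the position's index moves past the parsed text;
        // on failure null comes back and the index is left where it started.
        return format.parse(text, position);
    }

    public static void main(final String[] args) {
        final String text = "due: 2007-08-31 (Friday)";

        // Start parsing just after "due: ".
        final ParsePosition position = new ParsePosition(5);
        final Date date = parseDateAt(text, position);
        System.out.println(date + "; index is now " + position.getIndex());

        // Starting at the letter 'd' cannot work, so we get null back.
        final ParsePosition bad = new ParsePosition(0);
        System.out.println(parseDateAt(text, bad) + "; error index is " + bad.getErrorIndex());
    }
}
```

The caller owns the ParsePosition, so it is the only channel through which the parser can report how far it got.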
Let's pick up on our previous example and write a URIFormat class:

    import java.net.URI;
    import java.text.FieldPosition;
    import java.text.Format;

    public class URIFormat extends Format {

        public URIFormat() {
            super();
        }

        // 1
        public StringBuffer format(final Object object, final StringBuffer buffer, final FieldPosition fieldPosition) {
            // 2
            if (buffer == null) {
                throw new NullPointerException("buffer == null");
            }
            // 3
            if (fieldPosition == null) {
                throw new NullPointerException("fieldPosition == null");
            }
            // 4
            if (object == null) {
                return buffer;
            }
            // 5
            if (!(object instanceof URI)) {
                throw new IllegalArgumentException(object + " is not a URI instance");
            }
            // 6
            return buffer.append(object.toString());
        }

    }

This method is relatively simple to implement properly and fully defensively. Here are the details (once again, numbers below correspond to numbers in the code comments above):
- java.text.Format's format(Object) method calls through to this one, which subclasses must implement. The general idea here is to sanity check our arguments, then investigate the FieldPosition to see what portion of the incoming object we should format, attempt the convert-it-to-a-String operation, and return the result. For our purposes we can skip the FieldPosition argument entirely (it's conceivable that you might want to implement a URIFormat that formats, for example, only the "authority" field of a given URI but we're not going to do that here).
- Note that the contract of this method requires us to throw a NullPointerException if buffer is null. I don't like this, but that's our contract. I'd prefer to simply return null, since it's obvious that what you put into this method comes out the back of it, but I didn't write the contract. Why not just let the NullPointerException happen naturally instead of proactively throwing it? Because there is no message that gets output by automatic NullPointerExceptions at all, and that is just evil.
- Continuing through the forest of NullPointerExceptions, we are required to throw one as well if the fieldPosition is null. Even though in the vast majority of cases you won't ever use this parameter.
- Now, what does the contract say about a null Object parameter? Fortunately, and correctly, it says nothing. That could be valid input for you. Since the contract is open here, I choose to practice leave-no-trace programming, and I simply return the buffer that was handed to me. That is, hey, I chose to "format" the null URI by doing exactly nothing. You could choose to do something different here. (If you choose to follow my example here, you'll reap the benefits later on, because we will be able to pass this value straight into a PropertyEditor implementation and honor its contract as a convenient side effect.)
- Next, we need to handle the case where invalid input is handed to us (null is not invalid input; there are many cases where you want to handle the formatting of null values; that's why we dealt with that as a separate case). So here the contract tells us, effectively, to reject bad input with an IllegalArgumentException. But, of course, it's up to you to decide what constitutes bad input. I've chosen to err on the conservative side and say, look, this Format I'm writing works only on URI instances. Anything else that it's asked to format won't work. You could also simply do nothing and return the incoming buffer as is.
- Now all we have to do is (again) use the well-documented and consistent toString() method from java.net.URI and append it to the buffer parameter. Since append() returns the resulting StringBuffer, we simply return that.
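To see how the one-argument format(Object) call reaches the three-argument method, here is a small sketch. It uses a condensed stand-in for the URIFormat above (the names FormatCallThroughDemo and newUriOnlyFormat are made up for this example), and parsing is deliberately left unimplemented.

```java
import java.net.URI;
import java.text.FieldPosition;
import java.text.Format;
import java.text.ParsePosition;

final class FormatCallThroughDemo {

    /** Condensed stand-in for the URIFormat discussed above; formatting only. */
    static Format newUriOnlyFormat() {
        return new Format() {
            @Override
            public StringBuffer format(final Object object, final StringBuffer buffer, final FieldPosition fieldPosition) {
                if (buffer == null || fieldPosition == null) {
                    throw new NullPointerException("buffer or fieldPosition is null");
                }
                if (object == null) {
                    return buffer; // "format" null by appending nothing
                }
                if (!(object instanceof URI)) {
                    throw new IllegalArgumentException(object + " is not a URI instance");
                }
                return buffer.append(object);
            }

            @Override
            public Object parseObject(final String text, final ParsePosition parsePosition) {
                throw new UnsupportedOperationException("parsing is not part of this sketch");
            }
        };
    }

    public static void main(final String[] args) {
        final Format format = newUriOnlyFormat();
        // Format.format(Object) supplies a fresh StringBuffer and FieldPosition for us.
        System.out.println("[" + format.format(URI.create("http://example.com/a")) + "]");
        System.out.println("[" + format.format(null) + "]");
        try {
            format.format("not a URI");
        } catch (final IllegalArgumentException expected) {
            System.out.println("rejected: " + expected.getMessage());
        }
    }
}
```

Because the convenience method always passes a non-null buffer and field position, the two null checks only matter to callers who invoke the three-argument method directly.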
Again, the thought process behind this really simple method is what's important. You want to think hard about what garbage is going to be coming in and what the caller will expect out.
The parseObject method is basically just the reverse, but I'll have a little more to say about it in comparison to the setAsText() method of PropertyEditor. It also has some of the weirdest error handling conventions I've ever seen. Basically, if there is an error, you return null, and use the incoming ParsePosition object to store where the error occurred during parsing. And in the success case, you are obligated to update the supplied parsePosition's regular index property to just past where you finished parsing.
So what if you have to return null, so you can, for example, capture and parse user input that should be converted to the null reference? It would appear, although the contract is not at all specific, that you return null, but make sure that the ParsePosition's errorIndex property is set to -1 to indicate that there is no error. Additionally, there's some intent we can read from the source code.
Diving into the Format source code, we see that the convenience method, parseObject(String), actually evaluates the return value of this method by more or less ignoring the contract entirely! It checks to see if this method has modified the parsePosition's index property. If it hasn't, then, it would appear, an error has occurred (since hey, parsing couldn't even take place), and, it turns out, regardless of this method's return value, an exception is thrown. So much for contracts, at least where Taligent was concerned.
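The index check described above can be observed from the outside. The sketch below builds two throwaway Format instances (IndexCheckDemo, formatThat and outcome are hypothetical names): one returns a perfectly good value but never moves the index, and the other returns null but does move it.

```java
import java.text.FieldPosition;
import java.text.Format;
import java.text.ParseException;
import java.text.ParsePosition;

final class IndexCheckDemo {

    /** Builds a Format whose parser returns the given result and optionally moves the index. */
    static Format formatThat(final Object result, final boolean movesIndex) {
        return new Format() {
            @Override
            public StringBuffer format(final Object object, final StringBuffer buffer, final FieldPosition fieldPosition) {
                return buffer.append(object);
            }

            @Override
            public Object parseObject(final String text, final ParsePosition parsePosition) {
                if (movesIndex) {
                    parsePosition.setIndex(1);
                }
                return result;
            }
        };
    }

    /** Reports what the convenience method parseObject(String) did. */
    static String outcome(final Format format) {
        try {
            return "returned " + format.parseObject("x");
        } catch (final ParseException kaboom) {
            return "threw ParseException";
        }
    }

    public static void main(final String[] args) {
        // A non-null result is discarded because the index stayed at zero.
        System.out.println(outcome(formatThat("value", false)));
        // A null result is passed through because the index moved.
        System.out.println(outcome(formatThat(null, true)));
    }
}
```

So, as far as the convenience method is concerned, the index is the only error signal; neither the return value nor the error index is consulted when deciding whether to throw.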
OK, then; here's how we do it:

    // Needs java.net.URISyntaxException and java.text.ParsePosition imported as well.
    // 1
    public Object parseObject(final String text, final ParsePosition parsePosition) {
        // 2
        if (parsePosition == null) {
            throw new NullPointerException("parsePosition == null");
        }
        // 3
        if (parsePosition.getErrorIndex() != -1) {
            return null;
        }
        // 4
        final int startIndex = parsePosition.getIndex();
        if (startIndex < 0) {
            parsePosition.setErrorIndex(startIndex);
            return null;
        }
        // 5
        if (text == null) {
            // This is not an error; this is a valid input value. We want to return null.
            // So we take liberties with this soft and squishy area of the contract,
            // and we set the parsePosition's beginning to a value that is different
            // from what it initially was. No one is going to try to parse a null
            // text twice, anyhow, and this actually prevents it.
            parsePosition.setIndex(-1);
            parsePosition.setErrorIndex(-1);
            return null;
        }
        // 6
        final int textLength = text.length();
        if (textLength <= 0 || text.trim().length() <= 0) {
            parsePosition.setIndex(-1);
            parsePosition.setErrorIndex(-1);
            return null;
        }
        // 7
        if (startIndex >= textLength) {
            parsePosition.setErrorIndex(startIndex);
            return null;
        }
        // 8
        try {
            final URI uri = new URI(text.substring(startIndex));
            parsePosition.setIndex(textLength);
            parsePosition.setErrorIndex(-1);
            return uri;
        } catch (final URISyntaxException kaboom) {
            // getIndex() is relative to the substring and may be -1 if unknown,
            // so translate it back into a position within the full text.
            parsePosition.setErrorIndex(startIndex + Math.max(kaboom.getIndex(), 0));
            return null;
        }
    }

- This method is the one that all work is delegated to from the simpler parseObject(String) version. You have to implement it.
- The contract again, annoyingly, forces us to throw a NullPointerException if there's no ParsePosition.
- Here I check to see if the error index on the supplied ParsePosition has already been set. If it has, then it's not clear to me what I should do (the contract is silent; thanks, IBM!). I err on the side of being conservative and, since the parameter indicates that there's an error in play already, I return null per that part of the contract.
- I can't parse any text starting at a position less than zero, so we set the errorIndex of the parsePosition here and return null as instructed.
- If the supplied text is null, then I interpret this to mean that the user would like a null URI in return—i.e. he's trying, with text as his only weapon, to clear an object reference. But the contract to Format tells me I can't return null except in error conditions! But then the contract also tells me that ParsePosition has an errorIndex property to indicate where an error occurred. And the Format source code tells me that really the indicator of an error is whether the index has changed value or not. If it has not changed value, then some of the Format class' innards assume that since parsing didn't happen, an error must have occurred. (Pause for Buddhist-like contemplation of this leap of logic and faith.) So, we ensure that the parsePosition's index is set to a value that it could not have been set to by this point, but we also set the error index to -1 to indicate that the return value is to be considered valid. Whew.
- Zero-length Strings are interpreted as instructions to return a non-error-case null URI. So we jump through the hoops here to do that.
- If we were told to parse starting after the supplied String, then do what we need to do to indicate pilot error.
- Finally, parse the supplied String, starting where we're supposed to, and, if we succeed, remember to move the parsePosition's index! If we fail, then fortunately URISyntaxException actually has an index property (glory be!), so we set that and return null.
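One detail worth a small sketch: URISyntaxException.getIndex() counts from the start of the string handed to the URI constructor, and its documentation allows it to be -1 when the position is not known. When parsing starts partway through the text, the index therefore needs translating before it is stored in the ParsePosition. The class and method names below are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class UriErrorIndexSketch {

    /**
     * Returns -1 if text.substring(startIndex) is a valid URI; otherwise returns
     * the position of the problem measured from the start of the full text.
     */
    static int errorIndexInFullText(final String text, final int startIndex) {
        try {
            new URI(text.substring(startIndex));
            return -1;
        } catch (final URISyntaxException kaboom) {
            // An unknown position (-1) is reported as the start of the parsed region.
            return startIndex + Math.max(kaboom.getIndex(), 0);
        }
    }

    public static void main(final String[] args) {
        System.out.println(errorIndexInFullText("see http://example.com", 4));
        System.out.println(errorIndexInFullText("see http://exa mple.com", 4));
    }
}
```

Clamping the unknown case to the start index is one possible choice; it guarantees that a failed parse never leaves the error index at -1, which would otherwise look like success.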
If nothing else, I hope this little exercise convinces you not to replicate any object conversion logic in a java.text.Format. Formats are gross, disgusting, and, despite the general-purpose aura that hangs about them, more or less unsuitable for use outside the java.text package.
But.
They can be quite useful for bridging gaps between PropertyEditors and JFormattedTextFields, as we'll see in the next entry. The gist is that we'll create a Format that delegates to a PropertyEditor implementation. From there it's a drop-in to get your PropertyEditor to be used by JFormattedTextFields.
Thanks for reading.
Powered by ScribeFire.
Objects and Strings and the Wrangling Thereof
Posted by ljnelson on August 27, 2007 at 10:49 AM | Permalink
| Comments (2)
See if this little scenario sounds familiar.
You're rolling along on some application somewhere, and you've decided to put some information in a Properties file somewhere. You realize that you're beginning to encode a lot of information in a property setting, so much so that you realize that really what you're doing is building up a rather complicated Object. You feel like you've done this before.
Or this one:
You're working on a Swing application, and you need to validate the input from the user. Great, you say, I'll use a JFormattedTextField, and then I'll...I'll...I'll...read the...documentation...which features lots of...hmm...factories...and AbstractFormatters...and still more formats and navigation thingees and...I think I'll go get some coffee.
Or this one:
You're halfway through developing a complicated and enterprisey validation framework and you stop abruptly, realizing that there has to be a better way!
Starting with this blog entry I'd like to cover the many, many different ways to edit, format and build up different kinds of Objects that are provided by the Java platform.
Why should you think about turning Objects into Strings? Or Strings into Objects? Or all the other ways that a user might send input to you?
For me, the answer is that whether you're developing on the desktop or on the Web, you are constantly accepting free-form user input in the form of text. In some cases, you have control over this text, and in other cases you do not, but in all cases you often need to turn that text into things like dates, colors, fonts, java.net.URIs, custom domain objects and the like. Wouldn't it be nice to come up with a standard set of tools that would manage this kind of conversion for you in a pluggable manner? Wouldn't it be even better if most of that heavy lifting were done for you by the base platform? Well, it is.
In this entry, I'll cover java.beans.PropertyEditor (or at least some of its features) and then move on to other tools in subsequent entries.
PropertyEditors
The java.beans.PropertyEditor interface is quite a powerful beast. Despite the fact that it's heavily used by Spring, Geronimo, JBoss and doubtless a whole host of other reasonably popular open source products, it seems to get lost in the shuffle, and developers often don't know it exists.
A PropertyEditor exists to edit Java bean properties. Or at least that's what the documentation will tell you. But explaining it this way often leads to questions about what a Java bean is, what a property is, oh, doesn't that mean getters and setters and whatnot—all of which is largely irrelevant for understanding what a PropertyEditor actually does.
So, then: a PropertyEditor provides a consistent way for accepting user-supplied data and turning it into an object of a particular class. For now, we'll look at just its text conversion utilities and leave its boatload of other features aside.
Before we dive into the API, we should think about what we're trying to do. So for this example, let's say that we're going to make a PropertyEditor that converts java.net.URI instances into Strings and vice versa. That is a trivial enough example that it should be easy to step through, and a useful enough one that you may find yourself using the resulting PropertyEditor in an actual project.
To start developing a new PropertyEditor, it's most helpful to simply subclass java.beans.PropertyEditorSupport, so that's what we'll do here. For reasons that will become clear later (hint: see the API documentation for java.beans.PropertyEditorManager), you should suffix your property editors with the word Editor. Finally, the API documentation also says that all PropertyEditors must have a public, no-arg constructor:

    public class URIEditor extends java.beans.PropertyEditorSupport {
        public URIEditor() {
            super();
        }
    }

Next, let's focus on the text conversion methods:

    public String getAsText();
    public void setAsText(final String text) throws IllegalArgumentException;
    public Object getValue();
    public void setValue(final Object value);

Actually, before we get there, an overriding thing that is worth remembering is that a PropertyEditor is a stateful object. Granted, if all is going as it should, the state doesn't stick around very long, and technically speaking doesn't even get released back into the wild, but it's there nonetheless. That means the typical flow of using a PropertyEditor for conversion purposes is:
- Stick a value into the editor.
- Ask the editor to get you its current value's String representation.
Or, equivalently:
- Try to set some text into the editor. If this call completes, the editor will have a value ready and waiting for you.
- Get the value out of the editor that resulted from the conversion process.
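The two flows above can be sketched with a throwaway editor. IntegerEditorSketch is a hypothetical editor for Integer values, invented here only so that the call order is visible; it is not part of the JDK and not the URI editor this entry builds.

```java
import java.beans.PropertyEditorSupport;

final class StatefulFlowDemo {

    /** Hypothetical editor for Integer values, used only to show the call order. */
    static final class IntegerEditorSketch extends PropertyEditorSupport {
        @Override
        public void setAsText(final String text) {
            try {
                setValue(Integer.valueOf(text.trim()));
            } catch (final RuntimeException kaboom) {
                // Covers null text as well as text that is not a number.
                throw new IllegalArgumentException(text, kaboom);
            }
        }
    }

    /** Flow one: stick a value in, then ask for its text. */
    static String toText(final Object value) {
        final IntegerEditorSketch editor = new IntegerEditorSketch();
        editor.setValue(value);
        return editor.getAsText();
    }

    /** Flow two: set some text, then take the converted value out. */
    static Object toValue(final String text) {
        final IntegerEditorSketch editor = new IntegerEditorSketch();
        editor.setAsText(text);
        return editor.getValue();
    }

    public static void main(final String[] args) {
        System.out.println(toText(Integer.valueOf(42)));
        System.out.println(toValue(" 42 "));
    }
}
```

Each helper creates a fresh editor because the editor holds the value between the two calls; sharing one instance across threads or conversions would mix up that state.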
So back to the methods. Let's start (as is always good) by implementing the setAsText() method first. Then to be good and defensive and bulletproof, we'll beef up the getValue() and setValue() methods in a moment.

    public void setAsText(final String text) {
        // 1
        if (text == null) {
            // null text means a null URI
            this.setValue(null);
        } else {
            // 2
            final String trimmedText = text.trim();
            assert trimmedText != null; // guaranteed by trim() method
            // 3
            if (trimmedText.length() <= 0) {
                // Empty text means a null URI
                this.setValue(null);
            } else {
                // We have a non-null, non-zero-length String. Let's attempt to turn it into a URI.
                try {
                    // 4
                    this.setValue(new URI(trimmedText));
                } catch (final URISyntaxException kaboom) {
                    // 5
                    throw (IllegalArgumentException)new IllegalArgumentException(text).initCause(kaboom);
                }
            }
        }
    }

What's going on here? Why is this so complicated? Let's walk the interesting bits of the code. The numbers in the list below should correspond to the numbers in the code above.
- I am a defensive programmer, so I tend to overdo this part (in some people's opinion). That is, I always check for nulls, even in situations where it can be "guaranteed" that null will never be supplied or returned. So here, I make the decision that if null is supplied to us, I will interpret that as someone wanting to clear out the value stored by this editor. To do that, we simply set the value to null.
- Here, you may do this step or not, depending on your requirements. I trim the supplied text to eliminate leading and trailing whitespace. The assertion step is entirely optional; I use it more for documentation than anything else.
- Once the text is trimmed, I can make a quick call to get its length, and if the string is empty, then I interpret that to mean that the user is, again, trying to clear out the value. So I set the value to null. I suppose you could be draconian and attempt to construct a URI with an empty string, but I don't really see the point.
- By the time we get here, we know we have a "full" String to work with. PropertyEditors always have to return a copy of whatever value is set on them, but in the case of URIs, the URI itself is immutable, so when we call setValue here with a new URI, we will not have to do anything special in our getValue method later on to copy the installed value. More on this later.
- The only specification-compliant option you have for rejecting input is to throw an IllegalArgumentException. So here we catch the URISyntaxException, and, using some rather weird syntax, wrap that URISyntaxException inside an IllegalArgumentException. The initCause() call is needed to make this code compile under 1.4 environments. As a convenience, I supply the bad text itself (untrimmed, unprocessed) in the IllegalArgumentException's message.
What's important about the thought process behind implementing this (very simple) method is, I think, the following:
- Handle all kinds of input. If you do this up front, you won't be surprised later, and your code will perform gracefully under pressure. Can you accept dirty input? Null? What happens if there is whitespace? Can you trim some of it away? Or does that disturb the semantics of your object-to-string conversion? In the case of URIs, there's really no harm in trimming the whitespace fat.
- Think about providing a way, using just text, to clear out a value. Consider, also, when this might be a bad idea. This PropertyEditor implementation lets you clear the value by calling setAsText with either "" or null.
- Work within the specification as much as you possibly can. Although of course you're free to throw all sorts of different kinds of RuntimeExceptions whenever you want in a vacuum, the PropertyEditor specification says that the only RuntimeException (really, the only kind of Exception, period) that callers will expect is an IllegalArgumentException. So throw one of those when things go bad. I always make the bad text be the message, because that will show up in a stack trace in various log files. Also, callers of PropertyEditors will almost always catch a simple plain-Jane IllegalArgumentException, and won't know how to access any other values about it other than its message and cause.
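The same input-handling decisions can be sketched as a plain helper, outside any editor. This version assumes Java 5 or later, where IllegalArgumentException has a (String, Throwable) constructor, so the initCause() cast is not needed; UriTextConversion and parse are hypothetical names.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class UriTextConversion {

    /**
     * Converts text to a URI. Null, empty, and whitespace-only text all mean
     * "no URI" and yield null; anything else must be a syntactically valid URI.
     */
    static URI parse(final String text) {
        if (text == null) {
            return null;
        }
        final String trimmedText = text.trim();
        if (trimmedText.isEmpty()) {
            return null;
        }
        try {
            return new URI(trimmedText);
        } catch (final URISyntaxException kaboom) {
            // The untrimmed text is the message; the original exception is the cause.
            throw new IllegalArgumentException(text, kaboom);
        }
    }

    public static void main(final String[] args) {
        System.out.println(parse("  http://example.com/  "));
        System.out.println(parse("   "));
        try {
            parse("http://exa mple.com");
        } catch (final IllegalArgumentException expected) {
            System.out.println("rejected: " + expected.getMessage());
        }
    }
}
```

A caller that only knows about IllegalArgumentException still gets the offending text from getMessage() and can dig out the URISyntaxException through getCause() if it cares.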
The getAsText() method is much simpler. Its only responsibility is to return the text representation of the currently installed value. But be careful (and now you'll understand why I went to great lengths to trim the text in the setAsText() method above), because the specification says that if you return null from this method it is to be interpreted to mean that your PropertyEditor is unable to perform the text conversion. So here's our getAsText() method:

    public String getAsText() {
        // 1
        final Object value = this.getValue();
        // 2
        if (value == null) {
            return "";
        }
        // 3
        return value.toString();
    }

And here are the details:
- Although we know we're always going to be working with java.net.URIs here, we technically speaking don't care what kind of value was installed, because ultimately (see 3) we're just going to return its String representation. So no cast needed here.
- This is the only gotcha in the contract for this method. The specification tells us indirectly that even when our installed value is null we must not return null from this method—unless we want to indicate that something broke in the conversion process, or that we simply don't support text conversion at all. Instead, we choose here to return the empty string, which is a good value to use in text fields and form input fields. And that's why we explicitly test for whitespace-only Strings in the setAsText() method, because of course the return value from the getAsText() method is often indirectly resupplied as the input to the setAsText method.
- Finally, we defer to the URI#toString() method for the "real" String representation because it just so happens to have really well-defined semantics. In more complicated cases, you may very well not want to defer to the toString() method. Specifically, you need to have a guarantee if you do defer to the toString() method that it will not return null, because although that may be OK for its contract, it's not OK for the PropertyEditor#getAsText() contract.
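Here is a compact sketch of why the empty string matters: whatever getAsText() returns is likely to come straight back through setAsText(). UriEditorSketch is a condensed, hypothetical stand-in for the editor built in this entry, and roundTrip is an invented helper.

```java
import java.beans.PropertyEditorSupport;
import java.net.URI;
import java.net.URISyntaxException;

final class RoundTripDemo {

    /** Condensed stand-in for the URI editor described above. */
    static final class UriEditorSketch extends PropertyEditorSupport {
        @Override
        public String getAsText() {
            final Object value = getValue();
            return value == null ? "" : value.toString(); // never null
        }

        @Override
        public void setAsText(final String text) {
            final String trimmedText = text == null ? "" : text.trim();
            try {
                setValue(trimmedText.isEmpty() ? null : new URI(trimmedText));
            } catch (final URISyntaxException kaboom) {
                throw new IllegalArgumentException(text, kaboom);
            }
        }
    }

    /** Pushes a value out through getAsText() and back in through setAsText(). */
    static Object roundTrip(final Object value) {
        final UriEditorSketch first = new UriEditorSketch();
        first.setValue(value);
        final String text = first.getAsText();

        final UriEditorSketch second = new UriEditorSketch();
        second.setAsText(text);
        return second.getValue();
    }

    public static void main(final String[] args) {
        System.out.println(roundTrip(URI.create("http://example.com/")));
        System.out.println(roundTrip(null));
    }
}
```

If getAsText() returned null for a null value instead, a caller following the contract would conclude that the editor cannot do text conversion at all.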
Lastly, we'll override the "value" methods just to make sure that they're nice and defensive and well-documented:

    public Object getValue() {
        final Object uri = super.getValue();
        // 1
        assert uri == null || uri instanceof URI;
        return uri;
    }

    public void setValue(final Object value) {
        // 2
        if (value == null || value instanceof URI) {
            super.setValue(value);
        }
    }

- The first thing we do is to make sure that no matter what happens later on in this class' lifecycle—no matter who "maintains" it after us, no matter what overridden methods in no matter what subclasses get butchered, we are intending this class to handle only java.net.URI instances and nothing else (well, except for Strings).
- In the setValue() method, we have a dilemma. The specification is silent about how a PropertyEditor is supposed to handle setValue() calls that it's not designed to handle. Once again, you can choose to throw a RuntimeException of your own choosing—the compiler certainly won't stop you and you're fully within your rights—but the typical caller of a PropertyEditor (usually some bit of infrastructure code) is not expecting you to call down fire and brimstone from this method. I tend to write my setValue() implementations to ignore invalid input. I obviously log the problem in such cases, but omitted the logging code here as installing logging here would be over the top for this example. Finally, recall that the instanceof operator will return false if null is its left-hand operand, so you have to explicitly check for that (in order to permit clearing out the installed value).
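For completeness, here is one way the omitted logging might look, as a sketch only. It assumes java.util.logging is acceptable; the names LoggingSetValueSketch and UriOnlyEditor, the log level, and the message text are all made up for this example.

```java
import java.beans.PropertyEditorSupport;
import java.net.URI;
import java.util.logging.Logger;

final class LoggingSetValueSketch {

    /** Accepts null and URI values; anything else is logged and ignored. */
    static final class UriOnlyEditor extends PropertyEditorSupport {

        private static final Logger LOGGER = Logger.getLogger(UriOnlyEditor.class.getName());

        @Override
        public void setValue(final Object value) {
            // instanceof is false for null, so null needs its own check.
            if (value == null || value instanceof URI) {
                super.setValue(value);
            } else {
                LOGGER.warning("Ignoring value of unsupported type " + value.getClass().getName());
            }
        }
    }

    public static void main(final String[] args) {
        final UriOnlyEditor editor = new UriOnlyEditor();
        editor.setValue(URI.create("http://example.com/"));
        editor.setValue("not a URI"); // logged, and the installed URI stays put
        System.out.println(editor.getValue());
    }
}
```

Ignoring the bad value leaves whatever was installed before untouched, which is the quiet behavior infrastructure callers tend to expect.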
Callers
For reasons that I'll cover in the next installment, callers will expect to call this PropertyEditor in a way that looks something like this:

    final PropertyEditor pe = ...; // Get an appropriate property editor
    final String text = ...; // Gather user text
    if (pe != null) {
        try {
            pe.setAsText(text);
        } catch (final IllegalArgumentException badText) {
            // Aha; the user input must have been bad.
            handleErrorVisuallyOrOtherwise(badText);
            return;
        }
        final Object canonicalValue = pe.getValue();
        final String canonicalText = pe.getAsText();
        if (canonicalText == null) {
            // Ah; this must mean that this particular PropertyEditor can't represent the value.
            // I'd better do something more fancy.
            maybeCreateSomeKindOfImage(canonicalValue);
        }
    }

Observations and Summary
So, then, for Object-to-String conversion using PropertyEditors, keep the following things in mind:
- Subclass PropertyEditorSupport.
- Accept lousy input in your setAsText() method.
- Never return null from your getAsText() method.
- Define a non-null String you're going to use to represent the null value. Then make sure that your getAsText() method returns that to mean null, and that your setAsText() method is prepared to accept it to mean null.
- Decide what to do when your setValue() method gets called with unexpected input.
- Remember that you report invalid user text by throwing an IllegalArgumentException from your setAsText() method, and that's the only thing your callers are going to be prepared for.
If you build PropertyEditors bearing all these points in mind, then you will have a stable foundation for standardized Object-to-String conversion with well-defined semantics. We'll see in the next installment how to make all of Java's other Object-converting tools expressible in terms of PropertyEditors. That, in turn, will let you define all your conversion logic in one place.
In the next article, I'll cover java.text.Format, and will put together a Format implementation that delegates to an underlying PropertyEditor. See you then.