String's equals method isn't always enough

Posted by joconner on June 28, 2006 at 1:24 AM PDT
I read Ethan Nicholas' blog about intern'd strings with great interest. I agree with his assessment that using '==' to compare String objects is almost never correct. He suggests that String's equals method is the right tool instead, and for most comparisons it is. It isn't always enough, though.

The Problem with String's equals

The problem shows up when you want to compare text linguistically, as you do when you look a word up in a standard dictionary. The equals method compares two strings char by char, so it reports equality only when both contain exactly the same sequence of UTF-16 code units.

The problem is that there are often multiple ways to represent the same text in Unicode. For example, the name "Michèle" can also be represented as "Miche`le" in Unicode. The second version of the name uses a "combining sequence" ('e' + '`') to represent 'è'. String's simplistic equals method treats the two representations as different strings.

The following code snippet prints this: The strings are not equal.
    String name1 = "Michèle";
    String name2 = "Miche\u0300le"; // U+0300 is the COMBINING GRAVE ACCENT
    if (name1.equals(name2)) {
        System.out.println("The strings are equal.");
    } else {
        System.out.println("The strings are not equal.");
    }
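To make the difference between the two representations visible, here is a minimal sketch that prints the UTF-16 code units of each string. The class name `CombiningSequenceDemo` and its helper methods are hypothetical names used only for illustration. The `java.text.Normalizer` check is an addition that the original post does not discuss; it is included only to show that the two strings are canonically equivalent even though equals says they differ.

```java
import java.text.Normalizer;

class CombiningSequenceDemo {
    // Renders each UTF-16 code unit as U+XXXX so the difference is visible.
    static String codeUnits(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    // Assumption: comparing NFC-normalized forms is one way to test
    // canonical equivalence; it is not a linguistic comparison.
    static boolean canonicallyEqual(String a, String b) {
        return Normalizer.normalize(a, Normalizer.Form.NFC)
                .equals(Normalizer.normalize(b, Normalizer.Form.NFC));
    }

    public static void main(String[] args) {
        String name1 = "Mich\u00E8le";   // precomposed U+00E8
        String name2 = "Miche\u0300le";  // 'e' followed by U+0300

        System.out.println(codeUnits(name1)); // ... U+0068 U+00E8 U+006C U+0065
        System.out.println(codeUnits(name2)); // ... U+0068 U+0065 U+0300 U+006C U+0065
        System.out.println(name1.length() + " vs " + name2.length()); // 7 vs 8
        System.out.println(name1.equals(name2));                      // false
        System.out.println(canonicallyEqual(name1, name2));           // true
    }
}
```

The strings differ in length by one char, which is all equals needs to declare them unequal.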
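The Problem with String's compareTo

The compareTo method has a similar limitation. It orders strings by the numeric values of their chars, so every uppercase ASCII letter sorts before every lowercase one.

The following snippet prints this: Hat < cat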
    String w1 = "cat";
    String w2 = "Hat";
    // compareTo returns a negative value when w1 sorts before w2
    int comparison = w1.compareTo(w2);
    if (comparison < 0) {
        System.out.printf("%s < %s\n", w1, w2);
    } else {
        System.out.printf("%s < %s\n", w2, w1);
    }
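The result makes sense once you look at the char values involved. The following sketch (the class name `CodeUnitOrderDemo` and the method `naturalSort` are hypothetical, chosen for illustration) prints those values and sorts a small word list using String's natural ordering, which is the ordering compareTo defines.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class CodeUnitOrderDemo {
    // Sorts a copy of the input using String's natural ordering (compareTo).
    static List<String> naturalSort(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        copy.sort(Comparator.naturalOrder());
        return copy;
    }

    public static void main(String[] args) {
        System.out.println((int) 'H');              // 72
        System.out.println((int) 'c');              // 99
        System.out.println("cat".compareTo("Hat")); // 27, so "cat" sorts after "Hat"

        // Uppercase 'H' sorts ahead of lowercase 'a' and 'c'.
        System.out.println(naturalSort(Arrays.asList("cat", "Hat", "apple")));
        // [Hat, apple, cat]
    }
}
```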
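When are These Results Wrong?

If you're trying to sort a list of names, the results of String's compareTo method won't match the order your users expect from a dictionary or a phone book. Likewise, if you're searching for a name, equals can miss text that a reader would consider identical.

What is a Collator?

The Collator class (java.text.Collator) performs locale-sensitive string comparison. If you used a Collator for the comparisons above, you would get results that follow linguistic rules instead of raw char values.

You should know that a Collator is locale sensitive. That is, it performs differently depending upon the locale for which it is created. Different geographic regions compare words differently, using different rules for which letters and accents come before (and after) others. Let's look at some comparisons using a Collator.

The following comparison prints this: The strings are equal.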
...
    // Requires java.text.Collator and java.util.Locale
    Collator collator = Collator.getInstance(Locale.US);
    String name1 = "Michèle";
    String name2 = "Miche\u0300le";
    int comparison = collator.compare(name1, name2);
    if (comparison == 0) {
        System.out.println("The strings are equal.");
    } else {
        System.out.println("The strings are not equal.");
    }
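Because Collator implements Comparator, it can also be handed directly to a sort. The sketch below is one way to sort a word list linguistically; the class name `CollatorSortDemo` and the method `linguisticSort` are hypothetical, and the word list is made up for illustration.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class CollatorSortDemo {
    // Sorts a copy of the input using the collation rules of the given locale.
    static List<String> linguisticSort(List<String> words, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        List<String> copy = new ArrayList<>(words);
        copy.sort(collator); // Collator is a Comparator<Object>
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("cat", "Hat", "apple");
        // Letters are ordered alphabetically first; case only breaks ties.
        System.out.println(linguisticSort(words, Locale.US)); // [apple, cat, Hat]
    }
}
```

Compare this with String's natural ordering, which places "Hat" first because of its uppercase letter.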
If you browse around the Collator javadoc for long, you'll notice that it has various properties that you can set to modify its comparison behavior. That stuff is interesting, and you'll need to learn more about it sometime, but it's not important for the discussion here. The main point I want you to understand is simply this: String's equals and compareTo methods compare char values, not text as a reader perceives it.

The Conclusion

Take a look at the Collator class to determine when a linguistic comparison might be more appropriate than a simple equals or compareTo call.

One Last Thing

Can you guess which class (String or Collator) can help you match the word "Michèle" even though a user enters "Michele" (without the accent) into your application?
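As a starting point for exploring that last question, here is a hedged sketch of one of the Collator properties mentioned above: strength. It assumes a `Locale.US` collator and explicitly enables canonical decomposition, which is a choice made for this example and not something the original post specifies. The class name `CollatorStrengthDemo` and the method `matches` are hypothetical.

```java
import java.text.Collator;
import java.util.Locale;

class CollatorStrengthDemo {
    // Returns true when the collator considers the two strings equal
    // at the given strength (for example Collator.PRIMARY).
    static boolean matches(String a, String b, int strength) {
        Collator collator = Collator.getInstance(Locale.US);
        collator.setStrength(strength);
        // Assumption: canonical decomposition is turned on so that composed
        // and combining forms of an accented letter are treated alike.
        collator.setDecomposition(Collator.CANONICAL_DECOMPOSITION);
        return collator.compare(a, b) == 0;
    }

    public static void main(String[] args) {
        String accented = "Mich\u00E8le"; // with U+00E8
        String plain = "Michele";         // what the user typed

        // PRIMARY strength ignores accent and case differences.
        System.out.println(matches(accented, plain, Collator.PRIMARY));  // true
        // TERTIARY strength still distinguishes the accent.
        System.out.println(matches(accented, plain, Collator.TERTIARY)); // false
    }
}
```

Whether ignoring accents is appropriate depends on the locale and on what your users expect from a search, so treat the strength setting as something to choose deliberately.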