Refactoring Translations

Posted by evanx on May 26, 2006 at 5:20 AM PDT

Introduction.

"There is no problem that cannot be solved by the use of high explosives."

Recently I was tasked with making an app translatable. It was a relatively small Swing app of about 200 classes.

That means moving strings, like exception messages, into a resource bundle. I had some fun with a phased approach, which I present here.

Moving the strings

"The best armor is staying out of gun-shot."

The first phase was moving the string literals in the code into a "message class" as below.

public class IMessage {
    public static String systemErrorLogin = "~ logging in";
    public static String systemBusyCommunicatingWithServer = "Communicating with the server...";
    public static String systemErrorOccurred = "An error has occurred";
    public static String systemErrorOccuredFormatTilde = "An error has occurred while %s";
    public static String systemSendOpLogoffReq = "~ sending logoff message";
    public static String systemUpdateError = "~ updating application";
    public static char systemLoginMnemonic = 'L';
    public static String[] periodOptions = {"Today", "Yesterday", ...};
    ...
}

Note that we allow different types in the message class, i.e. char for mnemonics, and string arrays for combo boxes.

(Incidentally, in the above example, we use a notation where a tilde at the beginning of an exception message is replaced with "An error has occurred while..." It's lazy, and that's what I really dig about it, man!)
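The post does not show how the tilde gets expanded, so here is a minimal sketch of one way it might work. The class name TildeMessages and the method expand are hypothetical; only the format string is taken from the message class above.

```java
// Hypothetical helper, not from the original application.
final class TildeMessages {
    // Same text as IMessage.systemErrorOccuredFormatTilde in the example above.
    static final String ERROR_WHILE_FORMAT = "An error has occurred while %s";

    // Expands a leading tilde into the common error prefix.
    // Messages without a leading tilde pass through unchanged.
    static String expand(String message) {
        if (message.startsWith("~")) {
            String action = message.substring(1).trim();
            return String.format(ERROR_WHILE_FORMAT, action);
        }
        return message;
    }
}
```

With this sketch, expand("~ logging in") gives "An error has occurred while logging in", so each exception message only has to spell out the part that differs.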

The application code becomes "stringless" as follows.

   public void run() {
      try {
         login();
      } catch (Exception e) {
         e.printStackTrace();
         gui.showExceptionDialog(e, IMessage.systemErrorLogin);
      }
   }

An advantage of this approach is that the "keys" we are choosing are refactorable field names, e.g. systemErrorLogin, rather than string references. Once all strings have been refactored out into this message class, we can review the keys for naming consistency, spelling, etcetera. Renaming them is safe and easy, e.g. using NetBeans' refactorings.

I sometimes argue that "string references" (e.g. resource bundle keys in this case) hinder refactoring, and so we should aim for applications with "no string references attached."

Generating the resource bundle

"The best tank terrain is that without anti-tank weapons."

The second phase is to generate the resource bundle from this message class. We use the field name as the key, and use reflection to generate the resource bundle as follows.

   public void generateResourceBundleContent() throws Exception {
      IMessage messages = new IMessage();
      for (Field field : IMessage.class.getFields()) {
         // iterate through all the fields in the message class
         String key = field.getName();
         Object value = field.get(messages);
         if (field.getType() == String.class) {
            // output a regular string message 
            logger.println(key + " = " + value);
         } else if (field.getType() == String[].class) {
            // output a string array, e.g. combo box items
            String[] array = (String[]) value;
            int index = 0;
            for (String string : array) {
               logger.println(key + index + " = " + string);
               index++;
            }
         } else if (field.getType() == char.class) {
            // output a char, e.g. a mnemonic
            logger.println(key + " = " + value);
         }
      }
   }  

In the above method, we generate the content to be copied and pasted into our resource bundle file, e.g. myapp_en.properties. Note that we handle string arrays by appending each element's index to the key, e.g. periodOptions0, periodOptions1.
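For anyone who wants to try the generation step in isolation, the following is a self-contained sketch of the same idea. BundleGenerator, DemoMessages and toEntries are hypothetical names, and the sketch assumes every field is public and static. It collects the pairs into a sorted map rather than printing them, since getFields() makes no promise about ordering.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Map;
import java.util.TreeMap;

final class BundleGenerator {
    // Stand-in for the message class; fields and values are illustrative.
    static final class DemoMessages {
        public static String systemErrorOccurred = "An error has occurred";
        public static char systemLoginMnemonic = 'L';
        public static String[] periodOptions = {"Today", "Yesterday"};
    }

    // Collects key/value pairs from the public static fields of the class.
    // A TreeMap keeps the keys sorted for stable output.
    static Map<String, String> toEntries(Class<?> messageClass) {
        Map<String, String> entries = new TreeMap<>();
        for (Field field : messageClass.getFields()) {
            if (!Modifier.isStatic(field.getModifiers())) continue;
            Object value;
            try {
                value = field.get(null); // static field, so no instance is needed
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
            if (value instanceof String[]) {
                // string arrays become key0, key1, ...
                String[] array = (String[]) value;
                for (int index = 0; index < array.length; index++) {
                    entries.put(field.getName() + index, array[index]);
                }
            } else if (value instanceof String || value instanceof Character) {
                entries.put(field.getName(), value.toString());
            }
        }
        return entries;
    }
}
```

If the entries are then written with java.util.Properties.store rather than printed by hand, characters that need escaping in a properties file are taken care of.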

Loading the resource bundle

"Samuel Morse must have lost his mind if he believes in this Dots and Dashes idea himself!" A Government Official (1842).

Now we can translate the resource bundle file into other languages, e.g. myapp_de.properties for German. When our application starts up, we need to load the messages for the current locale's resource bundle. Firstly, to simplify the processing, we find it convenient to read the resource bundle into a Map, as follows.

    public static final Map<String, String> resourceMap = new HashMap<String, String>();

    public void loadMessages() throws Exception {
       // copy every key/value pair of the bundle into the map
       for (Enumeration<String> it = resourceBundle.getKeys(); it.hasMoreElements();) {
          String key = it.nextElement();
          String value = resourceBundle.getString(key);
          resourceMap.put(key, value);
       }
       logger.exiting(resourceMap.size());
    }
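The snippet above takes the resourceBundle variable as given. In an application it would typically come from ResourceBundle.getBundle("myapp", Locale.getDefault()), assuming a base name of "myapp" to match the file names used here. The sketch below sidesteps the classpath by building a PropertyResourceBundle from in-memory text; BundleLoader, toMap and fromText are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

final class BundleLoader {
    // Copies every key/value pair of a bundle into a plain map.
    static Map<String, String> toMap(ResourceBundle bundle) {
        Map<String, String> map = new HashMap<>();
        for (String key : bundle.keySet()) {
            map.put(key, bundle.getString(key));
        }
        return map;
    }

    // Builds a bundle from properties text held in memory,
    // so the sketch needs no .properties file on the classpath.
    static ResourceBundle fromText(String propertiesText) {
        try {
            return new PropertyResourceBundle(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

For example, toMap(fromText("systemLoginMnemonic = L\n")) yields a map with the single key systemLoginMnemonic.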

Next, we load the resource bundle messages into our message class (which is otherwise still initialised to the original English strings). We use reflection, as follows.

   public void configureMessages() throws Exception {
      IMessage messages = new IMessage();
      for (Field field : IMessage.class.getFields()) {
         // iterate through all the fields in the message class         
         String key = field.getName();
         Object defaultValue = field.get(messages);
         Object resourceValue = null;
         if (field.getType() == String.class) {
            // we are looking for a regular string message
            resourceValue = resourceMap.get(key);
            if (resourceValue == null) {
               throw new IRuntimeException(field);
            }
            field.set(messages, resourceValue);
         } else if (field.getType() == String[].class) {
            // we are looking for an array of strings in this case
            String[] defaultArray = (String[]) defaultValue;
            List<String> stringList = new ArrayList<String>();
            for (int index = 0;; index++) {
               String string = resourceMap.get(key + index);
               if (string == null) break;
               stringList.add(string);
            }
            if (stringList.size() != defaultArray.length) {
              throw new IRuntimeException(field);
            }
            resourceValue = stringList.toArray(new String[stringList.size()]);
            field.set(messages, resourceValue);
         } else if (field.getType() == char.class) {
            // we are looking for a char resource
            String resourceString = resourceMap.get(key);
            if (resourceString == null || resourceString.length() != 1) {
               throw new IRuntimeException(field);
            } 
            resourceValue = resourceString.charAt(0);
            field.set(messages, resourceValue);
         }
      }
   }

Note that in the above code, an exception is thrown if the resource bundle is inconsistent with the message class, e.g. a missing key, a string array of a different length, or a mnemonic that is not exactly one character. This check should also be run as a unit test. In any case, we will know as soon as we run the application if our resource bundle is not as it should be (via an exception).
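One thing to keep in mind is that the loop starts from the fields, so a bundle key that matches no field goes unnoticed. A reverse check might look like the following sketch. UnknownKeyCheck, DemoMessages and unknownKeys are hypothetical names, and the sketch assumes all message fields are static.

```java
import java.lang.reflect.Field;
import java.util.Set;
import java.util.TreeSet;

final class UnknownKeyCheck {
    // Stand-in for the message class; the fields are illustrative.
    static final class DemoMessages {
        public static String systemErrorOccurred = "An error has occurred";
        public static String[] periodOptions = {"Today", "Yesterday"};
    }

    // Returns the bundle keys that match no field (or array element) of the message class.
    static Set<String> unknownKeys(Class<?> messageClass, Set<String> bundleKeys) {
        Set<String> expected = new TreeSet<>();
        for (Field field : messageClass.getFields()) {
            if (field.getType() == String[].class) {
                // array fields are stored under key0, key1, ...
                int length = defaultArray(field).length;
                for (int index = 0; index < length; index++) {
                    expected.add(field.getName() + index);
                }
            } else {
                expected.add(field.getName());
            }
        }
        Set<String> unknown = new TreeSet<>(bundleKeys);
        unknown.removeAll(expected);
        return unknown;
    }

    // Reads the default value of a static String[] field.
    private static String[] defaultArray(Field field) {
        try {
            return (String[]) field.get(null);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A non-empty result usually points at a key that was renamed in the message class but not in a translated bundle, or at an array that grew in one place only.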

Testing

"Airplanes suffer from so many technical faults that it is only a matter of time before any reasonable man realizes that they are useless." - Scientific American (1910)

Resource bundles with string literals as keys in the code, e.g. getString("loginError"), are fragile. For example, a misspelt key for some obscure exception message might only be picked up (as a "dangling" string reference) when that exception occurs. That might only happen down the line, in production.
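To make the failure mode concrete, here is a small sketch using an in-memory ListResourceBundle. DanglingKeyDemo, DemoBundle and isDangling are hypothetical names, and the bundle content is made up for illustration. ResourceBundle.getString throws MissingResourceException for a key it cannot find, and that only happens when the lookup actually runs.

```java
import java.util.ListResourceBundle;
import java.util.MissingResourceException;
import java.util.ResourceBundle;

final class DanglingKeyDemo {
    // A tiny in-memory bundle; the key and text are illustrative.
    static final class DemoBundle extends ListResourceBundle {
        @Override
        protected Object[][] getContents() {
            return new Object[][] {
                {"loginError", "An error has occurred while logging in"}
            };
        }
    }

    // Returns true if looking up the key fails at run time.
    static boolean isDangling(ResourceBundle bundle, String key) {
        try {
            bundle.getString(key);
            return false;
        } catch (MissingResourceException e) {
            return true;
        }
    }
}
```

The compiler is equally happy with "loginError" and the misspelt "loginEror"; only the second one blows up, and only on the code path that uses it.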

An advantage of the message class approach is that it enables unit testing of our resource bundles. For example, we can easily test that every one of our messages (as declared in the message class) is translated in our resource bundles, as follows.

   public void test() throws Exception {
      IMessage messages = new IMessage();
      for (Field field : IMessage.class.getFields()) {
         String key = field.getName();
         if (resourceMap.get(key) == null) {
            throw new IRuntimeException(key);
         }
      }
   }

The above code sample is over-simplified, but hopefully illustrates the point.

Merging messages

"No flying machine will ever fly from New York to Paris." - Orville Wright.

It may also be useful to generate the content of a message class from an existing resource bundle (e.g. one produced using the NetBeans GUI designer, for labels and such), as below. Then we can copy and paste that content into our message class. In this case, we can identify name clashes, and also happily generate the resource bundle file in its entirety later (from the message class, as shown above).

   public void emitMessagesCode() throws Exception {
      for (String key : resourceMap.keySet()) {
         String value = resourceMap.get(key);
         logger.println("public static String " + key + " = \"" + value + "\";");
      }
   }
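One wrinkle with emitting source code this way is that a value containing a double quote or a backslash would produce a line that does not compile. The following sketch escapes the value first. MessageCodeEmitter, escape and fieldDeclaration are hypothetical names, and the sketch assumes the keys are already valid Java identifiers.

```java
final class MessageCodeEmitter {
    // Escapes the characters that cannot appear raw inside a Java string literal.
    static String escape(String value) {
        StringBuilder out = new StringBuilder();
        for (char c : value.toCharArray()) {
            switch (c) {
                case '\\': out.append("\\\\"); break;
                case '"':  out.append("\\\""); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    // Builds one field declaration for the message class.
    static String fieldDeclaration(String key, String value) {
        return "public static String " + key + " = \"" + escape(value) + "\";";
    }
}
```

Keys that are not valid identifiers, for instance ones containing dots, would need to be mapped to field names by some convention before this step.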

To Bundle, or not to bundle?

"We are not retreating - we are advancing in another direction." - Gen. Douglas MacArthur

Another option is to translate messages in code as below. The advantage of this approach is that the keys remain refactorable. And then translators can use NetBeans, and commit directly to the source code in CVS, yay!

   public void installGerman() {
      IMessage.systemErrorOccurred = "Eine Störung trat auf";
      ...
   }

Conclusion

"I have a catapult. You will agree to my terms, or I will fling an enormous rock at your city." - Latin literature.

We introduce a phased approach for translating an application. First, we move strings into a message class. This is achieved by cutting and pasting strings out of application classes into the message class. (Additionally, an existing resource bundle, as produced by the NetBeans GUI designer for example, might be merged into the message class, with the assistance of some trivial code generation.)

This first phase enables us to ensure neat and consistent naming of the keys we use to reference messages. For example, we can readily rename the message keys using IDE refactorings, to correct spelling mistakes and inconsistent naming conventions.

The next phase is to generate the master resource bundle from the message class. We use reflection on the message class to generate the key/value pairs, which we cut and paste into the master resource bundle file. After this stage, the resource bundle can be translated into multiple languages.

At startup, the application loads the resource bundle for the current locale, and uses reflection to configure the message class from the resource bundle. This offers a mechanism to ensure that the resource bundle is consistent, i.e. there are no strings that remain untranslated.

In general, I argue that source code should contain no string literals whatsoever! The reason for this is that string literals are typically fragile references, which are not refactorable. This applies to strings that refer to field or method names, as discussed in my earlier blog "Explicit Reflection", and to string references to properties, as discussed in "Bean Curd (Chapter 1)". (Strings used in OR queries will be discussed in an upcoming blog, "Bean Curd 2: Native Query Beans.")

Clearly strings that are text messages are also undesirable, because they should be externalised for translation (in resource bundles).

And finally, string references to externalised messages in resource bundles, e.g. getString("loginError"), are fragile and unable to be unit tested, and consequently dangerous.

So I think that covers all the evil strings that we might find lurking in our code? Let's root them out and banish them forever!