


Rémi Forax's Blog

October 2006 Archives


JSR 277 and ahead of time compilation

Posted by forax on October 19, 2006 at 07:27 AM | Permalink | Comments (4)

Last week, an early draft of the Java Module System (JSR 277) was published. To sum up, this draft defines:

  1. the concept of an exportable module, which is close to an OSGi bundle or an Eclipse plug-in.
  2. the way modules are packaged, as a super jar named Java Archive Module (or jam).
  3. and how modules are stored in a local repository.

One short paragraph at the end of section 12 of chapter 7, the one about module repositories, awakened my geek instinct. I quote:

 This deployment step offers the repository implementation an
 opportunity to transform a module definition into a more efficient
 format for run time access. For instance,
 [...]
 Native code can be generated for the frequently used classes in a
 module definition to optimize runtime performance.

Let me try to explain the idea behind it: when a module is stored in the local repository, its bytecode is automatically translated into native code using static analysis. This technique is called ahead-of-time compilation and is used today by, at least, the Excelsior JET runtime environment. Then, when the VM executes a class of this module, it uses the native code for the first runs, until HotSpot detection, which relies on runtime analysis, provides better code.

Oh, oh, will we see, in the near future, more VMs providing a module repository that implements ahead-of-time compilation? And thus VMs that never interpret bytecode!

Rémi Forax



Obfuscated Java

Posted by forax on October 17, 2006 at 12:36 AM | Permalink | Comments (1)

A colleague and I were discussing whether function types obfuscate Java code, and he argued that Java code is always readable if you have an IDE that can auto-format it.
The following example doesn't answer the question about function types, but it shows that it is always possible to write obfuscated code, even in Java. This example is taken from the BlueJ mailing list.

/*   Just Java
     Peter van der Linden
     April 1, 1996.

\u0050\u0076\u0064\u004c\u0020\u0031\u0020\u0041\u0070\u0072\u0039\u0036
 \u002a\u002f\u0020\u0063\u006c\u0061\u0073\u0073\u0020\u0068\u0020\u007b
  \u0020\u0020\u0070\u0075\u0062\u006c\u0069\u0063\u0020\u0020\u0020\u0020
   \u0073\u0074\u0061\u0074\u0069\u0063\u0020\u0020\u0076\u006f\u0069\u0064
    \u006d\u0061\u0069\u006e\u0028\u0020\u0053\u0074\u0072\u0069\u006e\u0067
     \u005b\u005d\u0061\u0029\u0020\u007b\u0053\u0079\u0073\u0074\u0065\u006d
      \u002e\u006f\u0075\u0074\u002e\u0070\u0072\u0069\u006e\u0074\u006c\u006e
       \u0028\u0022\u0048\u0069\u0021\u0022\u0029\u003b\u007d\u007d\u002f\u002a

 */

For those who have trouble reading Unicode characters encoded in hexadecimal :)
It's equivalent to

class h {
  public static void main(String[] args) {
    System.out.println("Hi!");
  }
}
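
The trick works because the compiler translates Unicode escapes before it tokenizes the source, so the escapes can close the comment and open a new one. The translation can be imitated with the minimal sketch below. The class name UnicodeEscapeDecoder and its decode method are made up for this illustration, and the sketch ignores the rule that a backslash preceded by another backslash does not start an escape.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnicodeEscapeDecoder {
    // A backslash, one or more 'u' characters, then exactly four hex digits.
    private static final Pattern ESCAPE =
        Pattern.compile("\\\\u+([0-9a-fA-F]{4})");

    // Replaces every escape by the character it denotes (simplified rule).
    static String decode(String source) {
        Matcher matcher = ESCAPE.matcher(source);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (matcher.find()) {
            out.append(source, last, matcher.start());
            out.append((char) Integer.parseInt(matcher.group(1), 16));
            last = matcher.end();
        }
        out.append(source, last, source.length());
        return out.toString();
    }

    public static void main(String[] args) {
        // Last escaped line of the example; the backslashes are doubled
        // so that the Java compiler keeps them as plain text.
        String line = "\\u0028\\u0022\\u0048\\u0069\\u0021\\u0022"
                    + "\\u0029\\u003b\\u007d\\u007d\\u002f\\u002a";
        System.out.println(decode(line)); // prints ("Hi!");}}/*
    }
}
```

The decoded line ends with the two characters that reopen a comment, which is why the final comment terminator in the original file is still needed.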

Rémi Forax



Is bug 6472193 a showstopper ?

Posted by forax on October 12, 2006 at 05:57 AM | Permalink | Comments (7)

Recently a user named jirkahana posted a topic on the java.net JDK forum about the fact that javax.xml.stream.XMLEventReader inherits from a non-generified Iterator.
I quickly submitted a bug report (6472193) and, as I'm a JDK contributor, I provided a patch that corrects the bug.

Now, we are late in the Mustang schedule, and only showstopper bugs can be fixed. So here is my question: is this bug a showstopper?

When Tiger introduced generics, some APIs, like java.util and java.lang, were retrofitted to use them. We are now in a post-Tiger world, so it seems normal to expect all Mustang APIs to use generics, but that is not the case: at least the package javax.xml.stream was apparently not generified when it was included in the JDK source tree.

You could answer that it's not a big deal, that we can generify it later as we already did with existing APIs.
Yes, we can retrofit the whole package, but one class will cause a backward compatibility issue. And guess what, it's javax.xml.stream.XMLEventReader.

XMLEventReader is an interface that extends Iterator, so each implementation needs to provide a method next that returns an Object. Now suppose that someone decides to fix the bug after the official release of Mustang. XMLEventReader will be changed to extend Iterator<XMLEvent>, so the method next will need to return an XMLEvent and not an Object. It will break all implementations of XMLEventReader. In short, this bug is not fixable after the release of Mustang.

So I ask the question: is bug 6472193 a showstopper?
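
The compatibility problem can be reproduced with a small sketch. All the names below (Event, RawReader, TypedReader, LegacyReader, ModernReader) are hypothetical stand-ins and not the real javax.xml.stream types. They only show what an implementation looks like before and after the interface is generified.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class RawIteratorSketch {
    // Hypothetical event type, standing in for a real event class.
    static final class Event {
        final String name;
        Event(String name) { this.name = name; }
    }

    // Hypothetical reader as first released: it extends the raw Iterator.
    interface RawReader extends Iterator { }

    // The same reader after a later generification.
    interface TypedReader extends Iterator<Event> { }

    // Written against RawReader: next() is declared to return Object.
    static final class LegacyReader implements RawReader {
        private final Iterator<Event> events;
        LegacyReader(List<Event> events) { this.events = events.iterator(); }
        public boolean hasNext() { return events.hasNext(); }
        public Object next() { return events.next(); }
        public void remove() { throw new UnsupportedOperationException(); }
    }

    // Written against TypedReader: next() has to return Event.
    // Declaring "public Object next()" here is rejected by the compiler,
    // which is what would happen to LegacyReader if RawReader became typed.
    static final class ModernReader implements TypedReader {
        private final Iterator<Event> events;
        ModernReader(List<Event> events) { this.events = events.iterator(); }
        public boolean hasNext() { return events.hasNext(); }
        public Event next() { return events.next(); }
        public void remove() { throw new UnsupportedOperationException(); }
    }

    // Clients of the raw interface need a cast.
    static String firstNameRaw(RawReader reader) {
        return ((Event) reader.next()).name;
    }

    // Clients of the typed interface do not.
    static String firstNameTyped(TypedReader reader) {
        return reader.next().name;
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(new Event("start"), new Event("end"));
        System.out.println(firstNameRaw(new LegacyReader(events)));
        System.out.println(firstNameTyped(new ModernReader(events)));
    }
}
```

The sketch compiles with raw-type warnings, which is exactly the state the bug report complains about.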



Languages Evolution: introduction of new keywords

Posted by forax on October 09, 2006 at 02:22 AM | Permalink | Comments (9)

When you want to add features to a language without breaking backward compatibility, a widespread idea is that you can't add new keywords.

That is why we currently see weird proposals in the Java space that try to reuse old keywords to express new kinds of abstraction, for example synchronized (closures v0.2, section 3) or for (Neal Gafter's blog).

Why does introducing a new keyword break already written code?

When you specify a new keyword, you need to change the lexer so that it recognizes a sequence of characters as a new token. The lexer therefore no longer recognizes this sequence as an identifier.

One magic solution is to use one or more special characters to differentiate keywords from identifiers. Lots of scripting languages use '$', '#', etc. to tag variables; Perl 6 is the best example.
Scripting languages use special characters not only to simplify lexing but also to help their runtime system choose between overloaded operations. So adding a new keyword is not a major problem for those languages.

Java is a strongly typed language, so it doesn't need such special characters, and we are stuck as long as we keep thinking of lexers the way lex does. The problem comes from the lexer, so the solution is to change how the lexer works.

Contextual keywords

Let me take an example: "enum" is a new keyword introduced in 1.5 to declare enumerated types. So the lexer of a 1.5 compiler now recognizes "enum" as a keyword throughout the whole program.
But in fact, the language designer only needs "enum" to be recognized as a keyword in a type declaration, not inside a block of code.

The solution is to use a lexer that implements contextual keywords, i.e. a lexer that lets the parser activate or deactivate the rules used to recognize tokens, depending on the parser state.

enum Foo {                                 // keyword
  public static void main(String[] args) {
    Enumeration enum=... // identifier
  }
}
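
The idea can be sketched in a few lines of Java. This is not how Tatoo works internally: the class ContextualLexerSketch is invented for this illustration, and a simple brace depth counter stands in for the parser state. At depth 0 the "enum" rule is active, and inside a body it is switched off.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ContextualLexerSketch {
    enum Kind { KEYWORD, IDENTIFIER, SYMBOL }

    static final class Token {
        final Kind kind;
        final String text;
        Token(Kind kind, String text) { this.kind = kind; this.text = text; }
        @Override public String toString() { return kind + ":" + text; }
    }

    // Either a word, or any single non-blank character.
    private static final Pattern TOKEN =
        Pattern.compile("[A-Za-z_][A-Za-z_0-9]*|\\S");

    static List<Token> lex(String source) {
        List<Token> tokens = new ArrayList<>();
        int depth = 0; // crude stand-in for the parser state
        Matcher matcher = TOKEN.matcher(source);
        while (matcher.find()) {
            String text = matcher.group();
            if (Character.isJavaIdentifierStart(text.charAt(0))) {
                // The keyword rule is active only outside of any body.
                boolean keyword = text.equals("enum") && depth == 0;
                tokens.add(new Token(keyword ? Kind.KEYWORD : Kind.IDENTIFIER, text));
            } else {
                if (text.equals("{")) depth++;
                if (text.equals("}")) depth--;
                tokens.add(new Token(Kind.SYMBOL, text));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The first "enum" is lexed as a keyword, the second as an identifier.
        for (Token token : lex("enum Foo { Enumeration enum ; }")) {
            System.out.println(token);
        }
    }
}
```

A real implementation would let the parser tell the lexer which token rules are expected in its current state, instead of guessing from the brace depth.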

With two colleagues, I've written a new parser generator named Tatoo that generates this kind of lexer.
The tutorial is in French for now because we are short on time and our students need it, but a translation will be available soon. Slides in PDF and an article from PPPJ'06 are available in English.

Tatoo contains other innovative features such as grammar versioning, full NIO support (push lexer/parser), lexing without Unicode decoding, and an AST generator. I will blog about those features later.



About iterative control abstraction

Posted by forax on October 04, 2006 at 12:11 PM | Permalink | Comments (17)

Neal proposes to use for both to tag methods that take a synchronous closure as a parameter and to call this new kind of method.

It's better than using synchronized, as proposed before, but (there is always a but :) I see two drawbacks.
First, for tags the whole method and not one of its parameters, so there is no way to define a method that takes both a synchronous closure and an asynchronous one. Second, in Java, for is a statement and not an expression, but a synchronous method may return a value.

  int sum=for each(int sum,Integer value:Arrays.asList(2,3),0) {
    sum+value
  }
  ...

Also, there is a bug in the signature of the method eachEntry proposed by Neal. He uses the syntax of closures v0.2 and, as I explained before, if you don't use function types, you have to take care of the subtyping relationship between parameterized types.
The code below is not legal with the signature given by Neal, void for eachEntry(Map<K,V> map, Block2<K,V,E> block) throws E.

 Map<String,Droid> map=...
 for eachEntry(CharSequence droidName, Droid droid : map) {
   ...
 }

The type of droidName must be the same as the first type argument of the map, here String and not CharSequence.
The signature of the method eachEntry must be changed to:

public interface Block2<K,V,throws E> {
    void invoke(K k, V v) throws E;
}
public static <K,V,throws E>
void for eachEntry(Map<K,V> map, Block2<? super K,? super V,E> block) throws E {
    for (Map.Entry<K,V> entry : map.entrySet()) {
        block.invoke(entry.getKey(), entry.getValue());
    }
}
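
The effect of the wildcards can be checked in today's Java, without any closure syntax. The sketch below rests on a few assumptions: an anonymous class plays the role of the closure, the exception parameter is written E extends Exception because "throws E" type parameters are not legal Java, and the class EachEntrySketch and its render helper are invented for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class EachEntrySketch {
    interface Block2<K, V, E extends Exception> {
        void invoke(K k, V v) throws E;
    }

    // "? super K" and "? super V" accept blocks declared on supertypes.
    static <K, V, E extends Exception> void eachEntry(
            Map<K, V> map, Block2<? super K, ? super V, E> block) throws E {
        for (Map.Entry<K, V> entry : map.entrySet()) {
            block.invoke(entry.getKey(), entry.getValue());
        }
    }

    // Walks a Map<String,Integer> with a block typed on CharSequence and Number.
    static String render(Map<String, Integer> map) {
        final StringBuilder out = new StringBuilder();
        Block2<CharSequence, Number, RuntimeException> block =
            new Block2<CharSequence, Number, RuntimeException>() {
                public void invoke(CharSequence name, Number value) {
                    out.append(name).append('=').append(value.intValue()).append(';');
                }
            };
        eachEntry(map, block);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, Integer> droids = new LinkedHashMap<>();
        droids.put("R2-D2", 2);
        droids.put("C-3PO", 3);
        System.out.println(render(droids)); // prints R2-D2=2;C-3PO=3;
    }
}
```

With the parameter declared as Block2<K,V,E> instead, the call to eachEntry in render does not compile, because Block2<CharSequence,Number,RuntimeException> is not a subtype of Block2<String,Integer,RuntimeException>.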

I still prefer closures v0.1: I think that, despite the fact that it introduces a new kind of type (function types), it's simpler to read.
The method each of my first example is coded like this with v0.2:

  public interface Expr1<V,R> {
    R invoke(V v,R r);
  }
  public static <R,V>
  R for each(Collection<? extends V> values,R initialValue,
   Expr1<? super V,R> expr) {
    for(V value:values)
      initialValue=expr.invoke(value,initialValue);
    return initialValue;
  }
and with v0.1:
  public static <R,V>
  R each(Collection<V> values,R initialValue,R(V v,R r) expr) {
    for(V value:values)
      initialValue=expr(value,initialValue);
    return initialValue;
  }
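
For comparison, here is a sketch of the same method in plain Java as it exists today, without the for modifier or function types. An anonymous class stands in for the closure, and the class EachFoldSketch and its sum helper are invented for the example.

```java
import java.util.Arrays;
import java.util.Collection;

public class EachFoldSketch {
    interface Expr1<V, R> {
        R invoke(V v, R r);
    }

    // Folds the values into one result, starting from initialValue.
    static <R, V> R each(Collection<? extends V> values, R initialValue,
                         Expr1<? super V, R> expr) {
        R result = initialValue;
        for (V value : values) {
            result = expr.invoke(value, result);
        }
        return result;
    }

    // Sums a collection of integers using each.
    static int sum(Collection<Integer> values) {
        return each(values, 0, new Expr1<Integer, Integer>() {
            public Integer invoke(Integer value, Integer sum) {
                return sum + value;
            }
        });
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(2, 3))); // prints 5
    }
}
```

The anonymous class is what makes the call site verbose, and that is the part the closure proposals try to shorten.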

Rémi




