The Source for Java Technology Collaboration
User: Password:



Rémi Forax

Rémi Forax's Blog

Yet another closure proposal

Posted by forax on January 16, 2008 at 03:39 AM | Comments (16)

Everybody comes with its own closure proposal, why not me :)

Unlike BGGA, CICE, or FCM (not entirely sure about FCM), i don't like the fact a closure is an instance of a class that implement a method. I prefer the John Rose's vision, closure are at runtime method handler on autonomous block of code.

There is another thing I don't like in the current BGGA proposal: its syntax of function type, {int,int=>int} is not enough Javaish for me. My proposal comes with a new way to declare a function type

I've separated my proposal in three parts:

  1. function type declaration
  2. how to use function type
  3. closure syntax
and because i'am lenient, i will only provide example and not a real spec.

Function types declaration

A function type is declared using the 'almost' keyword function.

// function types declaration
class Ops {

  // function are basically like typedef
  // top-level functions are illegal
  function <T> int comparator(T value1,T value2);

  // here function is not a real keyword, the compiler
  // recognize it as a keyword only in this construction
  function <E> void block(E element);

  // in the function below, E can be instantiated to any
  // object types or any primitive types,
  // it works because function are only declaration with no code. 

  // another function
  function <R> R task();
}

How to use function type

Function types are types so they can be used anywhere a type can be used except as bound of a type variable or type argument of a type variable.

// usage
import static Ops.*;

class Utils {
  // note: comparator<?> is illegal
  // (wildcard are prohibited for generics functions because
  // they are not needed).
  static <T> void sort(List<T> list,comparator<T> comparator) {
    ...
  }

  static <T> void forEach(Collection<T> c,block<T> block) {
    for(T t:c) {
      // invoke is a magic method that execute the block of code,
      // a special rule in the compiler is able to infer its
      // parameter type.
      block.invoke(t);
    }
  }
}

Closure syntax

Unlike function type syntax of BGGA, i like its closure syntax so basically my proposed closure syntax is identical. I have borrow method references from FCM even if we need to add a special lookup rule (because methods and fields are not in the same lookup space) for that in the compiler.
Like BGGA, my proposal include non local transfer and access to local variable (read and write) but only for closures and not for any anonymous classes, a closure is not an anonymous class. Like any proposal, my proposal allow interface conversion of function to interropt with legacy codes but only if closure use final local variable.
At last, return is not allowed in a closure to avoid stupid puzzlers but a replacement syntax exists.

// at calling site
class Main {
  static int f(Object o1,Object o2) {
    ...
  }

  void g(int value) {
    ...
  }

  public static void main(String[] args) {
    Ops.comparator<Object> c=Main.f;

    // using a primitive type here is legal,
    // it creates a block of code that call g()
    Ops.block<int> b=new Main().g; 

    List<String> l=Arrays.asList(args);
    Utils.sort(l,c); // ok contravariance

    // closure syntax, this syntax allow to specify the block of code
    Utils.sort(l,{Charsequence c1, CharSequence c2 =>
      c1.toString().compareTo(c1.toString());
    });

    // function to interface conversion
    java.util.Comparator<Object> javaUtilComparator=c;

    // it's roughly equivalent to
    final Ops.comparator<Object> tempc=c;
    java.util.Comparator<Object> javaUtilComparator=new java.util.Comparator<Object>() {
       public int compare(Object o1,Object o2) {
         return tempc.invoke(o1,o2);
       }
    };
    
    // closure to interface conversion
    java.util.Comparator<String> javaUtilComparator2={String s1, String s2 =>
      -s1.compareTo(s2);
    }); 
  }

Examples of non-local transfer and local variable mutation.

  // this is not allowed in contrast to BGGA.
  class NotAllowed implements Ops.block {
    ...
  }

  static int counter(Collection<Integer> c) {
    int count=0;
    // bloc of code (closure) can mutate local variable,
    // a function that returns an int is a subtype of a
    // function that returns void.
    Utils.forEach(c, {Integer i=>
      count+=i;
    });
    return count;
  }

  static boolean startsWith(Collection<String> c, String prefix) {
    // closure with non local tranfer,
    // moreover prefix doesn't need to be final
    // because this is not an interface conversion
    Utils.forEach(c, {String s=>
      if (s.startsWith(prefix) {
        return true; // doesn't compile ambiguous
        contains.return true; // ok
      } 
    });
  }
}

I wait your comments,
Rémi


Bookmark blog post: del.icio.us del.icio.us Digg Digg DZone DZone Furl Furl Reddit Reddit
Comments
Comments are listed in date ascending order (oldest first) | Post Comment

  • Your function types encourage multiple names for what are conceptually single types. I'll use BGGA syntax to explain, because I'm more familiar with it:

    If BGGA supported type aliasing, where:

    type IntComparator={int, int => int};

    created a type IntComparator, that had a single method, invoke, that took two ints and returned an int, then it would be incompatible with:

    type MyIntComparator={int, int => int};

    Clearly sometimes you want an IntComparator and a MyIntComparator to be compatible, and sometimes you don't (it largely depends on the name you choose). I favour generality over forcing codebases to be incompatible with each other. {int, int => int} says what it means, right there, without needing a name, without needing a lookup.

    --as it happens you can do this kind of type alias with BGGA, kind of:

    interface IntComparator extends {int, int => int} {}

    Posted by: ricky_clarkson on January 16, 2008 at 04:46 AM

  • From what i understand, if 'function' is not a keyword in other contexts, you could use an annotation in its place ?
    Like this ?

    @FunctionType int comparator(T value1,T value2);


    It kinda looks like the delegate keyword in c# to declare 'function types' (or whatever their word for that is).

    Posted by: liquid on January 16, 2008 at 08:32 AM


  • @liquid, without the annotation the code is not valid, thus an annotation
    can't be used here.

    @ricky, yes it allows multiple aliases but in my opinion it's not a problem
    it's the solution. Function types are named and name provide the associated
    semantics.

    By example, the two following methods doesn't accept
    the same kind of closure but with the BGGA syntax, it's not obvious.

    List<String> map(List<String> l,{String=>String} projection) {...}
    String reduce(List<String> l,{String=>String} accumulator) {...}

    Rémi

    Posted by: forax on January 16, 2008 at 11:15 AM

  • If you simply make the closure syntax a shorthand for an instance of an anonymous inner class then you can do all your examples more simply:

    Comparator<Object> c={Object o1,Object o2=>f(o1,o2);}

    Block<Integer> b={Integer i=>new Main().g(i); }

    List<String> l=Arrays.asList(args);
    Utils.sort(l,c);

    Utils.sort(l,{Charsequence c1, CharSequence c2 =>
    c1.toString().compareTo(c1.toString());
    });

    Comparator<Object> javaUtilComparator=c;

    java.util.Comparator<String> javaUtilComparator2={String s1, String s2 =>
    s1.compareTo(s2);
    });

    class Allowed implements Block<Integer> {
    ...
    }

    static int counter(Collection<Integer> c) {
    int count=0;
    Utils.forEach(c, {Integer i=>
    count+=i;
    });
    return count;
    }

    static boolean startsWith(Collection<String> c, String prefix) {
    Utils.forEach(c, {String s=>
    if (s.startsWith(prefix) {
    startsWith.return true; // name the return
    }
    });
    }

    If you want other syntax shortcuts like an implied return then allow for all methods, otherwise it is confusing. Similarly named returns should be allowed anywhere a return is allowed. Using a named return allows nesting.
    I have suggested this solution in a couple of blogs with varying syntax:
    http://www.artima.com/weblogs/viewpost.jsp?thread=182412
    http://www.artima.com/weblogs/viewpost.jsp?thread=220920

    Posted by: hlovatt on January 16, 2008 at 02:46 PM


  • Howard, i don't want to simplify inner classes, i want real closures.
    I want to be able to declare a block of code and to use it as an argument,
    i don't want to create an object to wrap that code as inner classes actually. do.
    Take a look to the following code and count the allocations
    if closures are implemented using inner classes.


    int counter=0;
    List<List<String>> list=...
    Utils.forEach(list, {List<String> l=>
    Utils.forEach(l, {String s=>
    counter++;
    });
    });


    If closures are implemented as blocs of codes i.e a new construction
    in the VM, you don't have this problem (and others).

    This construction is needed if you want to implement
    a decent invokedynamic (see John Rose post),
    so why not using it to implement real closures.

    I don't really see why closures need to be implemented using
    inner classes.

    Posted by: forax on January 16, 2008 at 04:04 PM

  • @Forax

    That we really should employ new JVM capability in the implementation of closures to reduce unnecessary instance/class creation is far more a compelling discussion than the use of a function declaration that diverges from the closure syntax since they share so much in their meaning.

    I do hope to see a lot of discussion on how to extend the JVM to avoid the new and potentially massive instance creation that will come with any of the other closure proposals. I'd like to see closures in the next language release (and erasure erasure) but to use closures as extensively as I might like could have crippling effects on heap use and performance.

    Posted by: talden on January 16, 2008 at 07:24 PM

  • @forax Have a look my comment at http://www.jroller.com/scolebourne/entry/closures_comparing_the_core_of

    A lot of people know JavaScript, a 'function' keyword would be great.

    Or maybe it's a french thing :)

    Olivier Allouch

    Posted by: nopjn on January 17, 2008 at 01:22 PM

  • Hi Remi,
    I wonder whether you have ever worked for "real" projects in field.
    Please do not take it as abuse but I do not believe that something like your example can ever make it way to the "average" project code. What you propose may be interesting... from theoretical point of view.
    How would you explain "functions" to your students in context of OO Java (which was initially designed so, remember "main" method compromise)?

    Posted by: imichtch on January 17, 2008 at 01:44 PM

  • Why is it that for some reason the order of the definition is reversed for closures?

    Methods are:

    RETURN_TYPE methodName ( PARMA_TYPE paramName );

    Closures seem to be

    {PARMA_TYPE paramName, PARMA_TYPE paramName => RETURN_TYPE }

    But when you implement one the stuff after the => is the implementation not the return value.

    I prefer the syntax of FCM. They look like methods. I don't have to do any thinking about what order the prams are supposed to be. Plus all the methods that already exist can be used as closures.

    I'm still disappointed that none of the proposals show what the return value type should be.

    Posted by: aberrant on January 17, 2008 at 02:09 PM

  • @Remi,

    There is a lot of discussion about the overhead of a closure allocation on:

    http://groups.google.com/group/jvm-languages

    From experience with existing systems, e.g. JRuby, the main problem appears to be that the closures are not easily garbage collected. The allocation overhead itself is negligible. There are currently a variety of proposals to address the issue of GCing a closure, e.g. a class loader with weak references.

    Posted by: hlovatt on January 17, 2008 at 03:24 PM


  • Olivier, here 'function' mean function type, not function like in Javascript.

    french rulez.

    @imichtch, my apologies if my main project these days,
    Tatoo,
    is a parser generator so you are right, it's not a "real" project.

    I agree with you that function is not the best keyword here,
    perhaps something like typefun is better.

    @abberant, you mix up closures and BGGA function types.
    {int,int=>int} is a type of a function and {int a,int b => a+b } is a closure,
    i.e a function.

    You prefer the syntax of FCM about closure, why not ?
    It's ok for me. I prefer here stay focus on the fact that contrary to FCM, BGGA
    and CICE, closure should be autonous blocks of code and
    not anonymous class

    Posted by: forax on January 17, 2008 at 03:27 PM

  • @nopjn & aberrant,

    In my proposal:

    http://www.artima.com/weblogs/viewpost.jsp?thread=182412

    I suggested the method keyword with optional type inference:


    method RETURN_TYPE ( PARMA_TYPE paramName, PARMA_TYPE paramName ) { ... } // without type inference
    method ( paramName, paramName ) { ... } // with type inference

    Posted by: hlovatt on January 17, 2008 at 03:30 PM


  • @hlovatt, recent discussions are about how to create closure with
    actual JVMs.

    I think it's more interresting to see closure implementation
    in the light of JSR 292.

    Posted by: forax on January 17, 2008 at 03:36 PM

  • As I have no insights wrt. JVM coding, do you know how the JVM will store and handle your block/closure elements? The main reason, our proposal (FCM) uses the notion of anonymous classes for closures is that the JVM already can handle it. Hence, the implementation is a pure javac thing to "transform" closures into some structure known by the JVM.
    Do you think, changing from anonymous classes to some block element will give a major technical/runtime benefit? Despite the syntax, the programmer will not see the difference in Java code.
    Regarding function types, I cannot see the benefit of being able to giving a type a name, as it surely only is about the functions signature. The semantic of a closure taken as parameter usually is local to the taking method and thus should be defined in the methods Javadoc.
    Could you extend your explanation on the alternative to return from within closures? How does one do local returns (i.e., implement a non-void closure).

    Posted by: stefan_schulz on January 18, 2008 at 05:26 AM

  • Remi, you got your types quite wrong for reduce.

    String reduce(List[String] l,{String=>String} accumulator) {...}

    should be:
    String reduce(List[String] l,{String, String=>String} accumulator) {...}

    Please try to find another example.. I'm not being pedantic, I'd like to see a valid one.

    "Take a look to the following code and count the allocations if closures are implemented using inner classes"

    God, no. Short-lived objects are exactly what the JVM is optimised for. Counting allocations is as useful as counting if statements.

    "I prefer here stay focus on the fact that contrary to FCM, BGGA and CICE, closure should be autonous blocks of code and not anonymous class"

    BGGA already does that, it just uses anonymous classes because they're already in the type system and encouraged by many APIs. All the talk about non-local return only applies to BGGA for this reason, not to CICE and ARM.

    Posted by: ricky_clarkson on January 18, 2008 at 05:41 AM

  • If you think you can erase names from the function definitions because they are already defined in the Javadoc you are kidding yourselves and going to make a real awful cognitive dissonance problem.

    Posted by: i30817 on January 19, 2008 at 06:53 AM





Powered by
Movable Type 3.01D
 Feed java.net RSS Feeds