Skip to main content

Yet another closure proposal

Posted by forax on January 16, 2008 at 3:39 AM PST

Everybody comes with its own closure proposal, why not me :)

Unlike BGGA, CICE, or FCM (not entirely sure about FCM),
i don't like the fact a closure is an instance of a class
that implement a method.
I prefer the
John Rose's vision,
closure are at runtime method handler on autonomous
block of code.

There is another thing I don't like in the current BGGA proposal:
its syntax of function type,
{int,int=>int} is not enough Javaish for me.
My proposal comes with a new way to declare a function type

I've separated my proposal in three parts:

  1. function type declaration
  2. how to use function type
  3. closure syntax

and because i'am lenient, i will only provide example and not
a real spec.

Function types declaration

A function type is declared using the 'almost'
keyword function.

// function types declaration
class Ops {

  // function are basically like typedef
  // top-level functions are illegal
  function <T> int comparator(T value1,T value2);

  // here function is not a real keyword, the compiler
  // recognize it as a keyword only in this construction
  function <E> void block(E element);

  // in the function below, E can be instantiated to any
  // object types or any primitive types,
  // it works because function are only declaration with no code.

  // another function
  function <R> R task();
}

How to use function type

Function types are types so they can be used anywhere
a type can be used except as bound of a type variable
or type argument of a type variable.

// usage
import static Ops.*;

class Utils {
  // note: comparator<?> is illegal
  // (wildcard are prohibited for generics functions because
  // they are not needed).
  static <T> void sort(List<T> list,comparator<T> comparator) {
    ...
  }

  static <T> void forEach(Collection<T> c,block<T> block) {
    for(T t:c) {
      // invoke is a magic method that execute the block of code,
      // a special rule in the compiler is able to infer its
      // parameter type.
      block.invoke(t);
    }
  }
}

Closure syntax

Unlike function type syntax of BGGA, i like its closure
syntax so basically my proposed closure syntax is identical.
I have borrow method references from FCM even if
we need to add a special lookup rule (because methods and fields
are not in the same lookup space)
for that in the compiler.


Like BGGA, my proposal include non local transfer
and access to local variable (read and write)
but only for closures and not for any anonymous classes,
a closure is not an anonymous class.
Like any proposal, my proposal allow interface conversion
of function to interropt with legacy codes
but only if closure use final local variable.


At last, return is not allowed in a closure to avoid
stupid puzzlers but a replacement syntax exists.

// at calling site
class Main {
  static int f(Object o1,Object o2) {
    ...
  }

  void g(int value) {
    ...
  }

  public static void main(String[] args) {
    Ops.comparator<Object> c=Main.f;

    // using a primitive type here is legal,
    // it creates a block of code that call g()
    Ops.block<int> b=new Main().g;

    List<String> l=Arrays.asList(args);
    Utils.sort(l,c); // ok contravariance

    // closure syntax, this syntax allow to specify the block of code
    Utils.sort(l,{Charsequence c1, CharSequence c2 =>
      c1.toString().compareTo(c1.toString());
    });

    // function to interface conversion
    java.util.Comparator<Object> javaUtilComparator=c;

    // it's roughly equivalent to
    final Ops.comparator<Object> tempc=c;
    java.util.Comparator<Object> javaUtilComparator=new java.util.Comparator<Object>() {
       public int compare(Object o1,Object o2) {
         return tempc.invoke(o1,o2);
       }
    };
   
    // closure to interface conversion
    java.util.Comparator<String> javaUtilComparator2={String s1, String s2 =>
      -s1.compareTo(s2);
    });
  }

Examples of non-local transfer and local variable mutation.

  // this is not allowed in contrast to BGGA.
  class NotAllowed implements Ops.block {
    ...
  }

  static int counter(Collection<Integer> c) {
    int count=0;
    // bloc of code (closure) can mutate local variable,
    // a function that returns an int is a subtype of a
    // function that returns void.
    Utils.forEach(c, {Integer i=>
      count+=i;
    });
    return count;
  }

  static boolean startsWith(Collection<String> c, String prefix) {
    // closure with non local tranfer,
    // moreover prefix doesn't need to be final
    // because this is not an interface conversion
    Utils.forEach(c, {String s=>
      if (s.startsWith(prefix) {
        return true; // doesn't compile ambiguous
        contains.return true; // ok
      }
    });
  }
}

I wait your comments,


Rémi

Related Topics >>

Comments

Howard, i don't want to simplify inner classes, i want real closures. I want to be able to declare a block of code and to use it as an argument, i don't want to create an object to wrap that code as inner classes actually. do. Take a look to the following code and count the allocations if closures are implemented using inner classes.

int counter=0; List<List<String>> list=... Utils.forEach(list, {List<String> l=> Utils.forEach(l, {String s=> counter++; }); });

If closures are implemented as blocs of codes i.e a new construction in the VM, you don't have this problem (and others).
This construction is needed if you want to implement a decent invokedynamic (see John Rose post), so why not using it to implement real closures.
I don't really see why closures need to be implemented using inner classes.

If you simply make the closure syntax a shorthand for an instance of an anonymous inner class then you can do all your examples more simply:

Comparator<Object> c={Object o1,Object o2=>f(o1,o2);} Block<Integer> b={Integer i=>new Main().g(i); } List<String> l=Arrays.asList(args); Utils.sort(l,c); Utils.sort(l,{Charsequence c1, CharSequence c2 => c1.toString().compareTo(c1.toString()); }); Comparator<Object> javaUtilComparator=c; java.util.Comparator<String> javaUtilComparator2={String s1, String s2 => s1.compareTo(s2); }); class Allowed implements Block<Integer> { ... } static int counter(Collection<Integer> c) { int count=0; Utils.forEach(c, {Integer i=> count+=i; }); return count; } static boolean startsWith(Collection<String> c, String prefix) { Utils.forEach(c, {String s=> if (s.startsWith(prefix) { startsWith.return true; // name the return } }); }

If you want other syntax shortcuts like an implied return then allow for all methods, otherwise it is confusing. Similarly named returns should be allowed anywhere a return is allowed. Using a named return allows nesting.

I have suggested this solution in a couple of blogs with varying syntax:

http://www.artima.com/weblogs/viewpost.jsp?thread=182412 http://www.artima.com/weblogs/viewpost.jsp?thread=220920

@liquid, without the annotation the code is not valid, thus an annotation can't be used here.
@ricky, yes it allows multiple aliases but in my opinion it's not a problem it's the solution. Function types are named and name provide the associated semantics.
By example, the two following methods doesn't accept the same kind of closure but with the BGGA syntax, it's not obvious. List<String> map(List<String> l,{String=>String} projection) {...} String reduce(List<String> l,{String=>String} accumulator) {...} Rémi

From what i understand, if 'function' is not a keyword in other contexts, you could use an annotation in its place ?
Like this ?
@FunctionType int comparator(T value1,T value2);
It kinda looks like the delegate keyword in c# to declare 'function types' (or whatever their word for that is).

Your function types encourage multiple names for what are conceptually single types. I'll use BGGA syntax to explain, because I'm more familiar with it:

If BGGA supported type aliasing, where:

type IntComparator={int, int => int};

created a type IntComparator, that had a single method, invoke, that took two ints and returned an int, then it would be incompatible with:

type MyIntComparator={int, int => int};

Clearly sometimes you want an IntComparator and a MyIntComparator to be compatible, and sometimes you don't (it largely depends on the name you choose). I favour generality over forcing codebases to be incompatible with each other. {int, int => int} says what it means, right there, without needing a name, without needing a lookup.

--as it happens you can do this kind of type alias with BGGA, kind of:

interface IntComparator extends {int, int => int} {}

@hlovatt, recent discussions are about how to create closure with actual JVMs.
I think it's more interresting to see closure implementation in the light of JSR 292.

@nopjn & aberrant,

In my proposal: http://www.artima.com/weblogs/viewpost.jsp?thread=182412

I suggested the method keyword with optional type inference: method RETURN_TYPE ( PARMA_TYPE paramName, PARMA_TYPE paramName ) { ... } // without type inference method ( paramName, paramName ) { ... } // with type inference

Olivier, here 'function' mean function type, not function like in Javascript.
french rulez.

@imichtch, my apologies if my main project these days, Tatoo, is a parser generator so you are right, it's not a "real" project.
I agree with you that function is not the best keyword here, perhaps something like typefun is better.

@abberant, you mix up closures and BGGA function types. {int,int=>int} is a type of a function and {int a,int b => a+b } is a closure, i.e a function.
You prefer the syntax of FCM about closure, why not ? It's ok for me. I prefer here stay focus on the fact that contrary to FCM, BGGA and CICE, closure should be autonous blocks of code and not anonymous class

@Remi,

There is a lot of discussion about the overhead of a closure allocation on: http://groups.google.com/group/jvm-languages

From experience with existing systems, e.g. JRuby, the main problem appears to be that the closures are not easily garbage collected. The allocation overhead itself is negligible. There are currently a variety of proposals to address the issue of GCing a closure, e.g. a class loader with weak references.

Why is it that for some reason the order of the definition is reversed for closures?

Methods are:

RETURN_TYPE methodName ( PARMA_TYPE paramName );

Closures seem to be

{PARMA_TYPE paramName, PARMA_TYPE paramName => RETURN_TYPE }

But when you implement one the stuff after the => is the implementation not the return value.

I prefer the syntax of FCM. They look like methods. I don't have to do any thinking about what order the prams are supposed to be. Plus all the methods that already exist can be used as closures.

I'm still disappointed that none of the proposals show what the return value type should be.

Hi Remi, I wonder whether you have ever worked for "real" projects in field. Please do not take it as abuse but I do not believe that something like your example can ever make it way to the "average" project code. What you propose may be interesting... from theoretical point of view. How would you explain "functions" to your students in context of OO Java (which was initially designed so, remember "main" method compromise)?

@forax Have a look my comment at http://www.jroller.com/scolebourne/entry/closures_comparing_the_core_of
A lot of people know JavaScript, a 'function' keyword would be great.
Or maybe it's a french thing :)
Olivier Allouch

If you think you can erase names from the function definitions because they are already defined in the Javadoc you are kidding yourselves and going to make a real awful cognitive dissonance problem.

Remi, you got your types quite wrong for reduce.

String reduce(List[String] l,{String=>String} accumulator) {...}

should be:
String reduce(List[String] l,{String, String=>String} accumulator) {...}

Please try to find another example.. I'm not being pedantic, I'd like to see a valid one.

"Take a look to the following code and count the allocations if closures are implemented using inner classes"

God, no. Short-lived objects are exactly what the JVM is optimised for. Counting allocations is as useful as counting if statements.

"I prefer here stay focus on the fact that contrary to FCM, BGGA and CICE, closure should be autonous blocks of code and not anonymous class"

BGGA already does that, it just uses anonymous classes because they're already in the type system and encouraged by many APIs. All the talk about non-local return only applies to BGGA for this reason, not to CICE and ARM.

@Forax That we really should employ new JVM capability in the implementation of closures to reduce unnecessary instance/class creation is far more a compelling discussion than the use of a function declaration that diverges from the closure syntax since they share so much in their meaning. I do hope to see a lot of discussion on how to extend the JVM to avoid the new and potentially massive instance creation that will come with any of the other closure proposals. I'd like to see closures in the next language release (and erasure erasure) but to use closures as extensively as I might like could have crippling effects on heap use and performance.

As I have no insights wrt. JVM coding, do you know how the JVM will store and handle your block/closure elements? The main reason, our proposal (FCM) uses the notion of anonymous classes for closures is that the JVM already can handle it. Hence, the implementation is a pure javac thing to "transform" closures into some structure known by the JVM.
Do you think, changing from anonymous classes to some block element will give a major technical/runtime benefit? Despite the syntax, the programmer will not see the difference in Java code. Regarding function types, I cannot see the benefit of being able to giving a type a name, as it surely only is about the functions signature. The semantic of a closure taken as parameter usually is local to the taking method and thus should be defined in the methods Javadoc.
Could you extend your explanation on the alternative to return from within closures? How does one do local returns (i.e., implement a non-void closure).