
Alternative syntax for BGGA closure

Posted by forax on July 7, 2008 at 6:12 AM PDT

In my last blog entry, I said it was time to discuss
the BGGA closure syntax. So here is my proposed syntax.

The closure syntax

A closure syntax has two parts: the first defines the
closure type, the second defines the expression
(not a statement) of the closure.
For example, in the BGGA closure { int x => x+2 },
the expression part is x+2.
BGGA uses a trick for the closure type:
it infers the return type from the
result of the expression, so the closure type is
a function that takes an integer and returns an integer.

Why BGGA syntax is not great

In my opinion, for several reasons:

  1. A closure is delimited by curly braces, but in Java
    curly braces are used to define blocks, not expressions.

  2. In Java, unlike Scala, a function's return type is declared before
    the parameter types, but a BGGA function type
    declares the return type after the parameter types:
    {char,char => boolean}.

  3. As already said, => is too close to <= (less than or equal).
  4. Mixing parentheses and curly braces is hard to read;
    anonymous classes already do that and it's a pain point
    for beginners (and IDEs :)


    new Thread({=> System.out.println(); }).start()
  5. Curly braces are already used as expressions
    to initialize arrays or annotation values.

  6. Function type syntax {int,int=>int}
    is too close to the closure one {int x,int y=>x+2},
    but these are different concepts.

  7. { int => int } and { int ==> int } mean
    two different things, but the syntaxes are too similar and
    I'm not able (yes, really) to recall which one defines
    a restricted closure.

  8. BGGA closures are defined as statements
    followed by an expression, so you can write
             { String s, int n => String r = ""; for ( ; n > 0; n--) { r += s; } r };
          

    which is equivalent to this code
      static String concat(String s, int n) {
        String r = "";
        for ( ; n > 0; n--) {
          r += s;
        }
        return r;
      }
      ...
      #concat(String,int)
          

    Guess which one is more readable.
    In my opinion, closures should be restricted to a single expression.



OK, enough criticism. Let's be constructive.

My proposed syntax

  expression_closure = closure_parameters expr
  closure_parameters = '|' parameter_list '|'

Examples:

   RemisClosure = |int a, int b| a + b;
  
   Integer[] primes = { 19, 23, 2, 11, 17, 31, 5, 13 };
   Arrays.sort(primes, |Integer x, Integer y| y.compareTo(x));

The syntax is close to Ruby's.
As I said previously, closures are expressions: in Java,
x+y is an expression, so there is no need to add parentheses,
brackets or curly braces.
Parameters are enclosed in pipes ('|') to avoid confusion
with other delimiters.
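
For comparison, here is a runnable approximation of the sort example in plain Java. No existing compiler accepts |Integer x, Integer y| y.compareTo(x), so this sketch swaps in an anonymous Comparator class. The names SortDemo and sortDescending are hypothetical, chosen only for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;

class SortDemo {
    // Returns a copy sorted in descending order. The anonymous class plays
    // the role of the proposed closure |Integer x, Integer y| y.compareTo(x).
    static Integer[] sortDescending(Integer[] values) {
        Integer[] copy = values.clone();
        Arrays.sort(copy, new Comparator<Integer>() {
            @Override
            public int compare(Integer x, Integer y) {
                return y.compareTo(x);
            }
        });
        return copy;
    }

    public static void main(String[] args) {
        Integer[] primes = { 19, 23, 2, 11, 17, 31, 5, 13 };
        System.out.println(Arrays.toString(sortDescending(primes)));
    }
}
```

The six lines of anonymous-class ceremony around y.compareTo(x) are what the one-line closure expression is meant to replace.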

Closures in other languages:

  BGGAClosure       = {int a, int b => a + b };
  GroovyLikeClosure = {int a, int b -> a + b };
  RubyLikeClosure   = { |int a, int b| a + b };
  FanClosure       := |int a, int b -> int| { return a + b }

Gotcha: closure with no parameter type

In Java, void is not a parameter type,
so |void| 2+3 is not a valid syntax
for a closure that takes no argument and
returns an int.


I see two possible syntaxes:

  1. || 2+3: even though there is no clash with
    the boolean or, it could confuse beginners.

  2. 2+3: this means there is an automatic
    conversion from an expression to a closure.
    It adds another pass to the overload resolution algorithm,
    which is already complex because it handles
    boxing, varargs and generics.

I am not able to decide between the two
syntaxes, so this remains an open issue.
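
To make the overload concern of the second option concrete, here is a small sketch in plain Java. The interface IntThunk (standing in for a function type that takes nothing and returns an int) and the two describe overloads are hypothetical names. Today describe(2 + 3) can only match the int overload; with an automatic expression-to-closure conversion, the compiler would also have to consider the IntThunk overload for the same call.

```java
class OverloadDemo {
    // Hypothetical stand-in for a function type taking no argument
    // and returning an int.
    interface IntThunk {
        int invoke();
    }

    static String describe(int value) {
        return "int:" + value;
    }

    static String describe(IntThunk thunk) {
        return "closure:" + thunk.invoke();
    }

    public static void main(String[] args) {
        // Plain expression: only the int overload is applicable today.
        System.out.println(describe(2 + 3));

        // Explicit closure, written here as an anonymous class.
        System.out.println(describe(new IntThunk() {
            @Override
            public int invoke() {
                return 2 + 3;
            }
        }));
    }
}
```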

Function type

In Java, the return type is declared before the parameter types;
I don't see why function types should break that rule.
So a function type is a return type followed by
parameter types separated by commas and enclosed in parentheses.
This syntax is very similar to FCM's.

Update requested by Neal, grammar of function types:

  return_type '(' parameter_type_list? ')'

Example:

  f(2, |int x| x+2);
  ...
  static int f(int value, int(int) closure) {
    return closure.invoke(value);
  }
 
  g(|int x| |int y| x+y);
  ...
  static int g(int (int ()) closure) {
    return closure.invoke(2).invoke(3);
  }

  int (char[]) throws IOException read =
    reader#read(char[]);

  int lengthSum=sum(|String s| s.length(), "toto", "tutu");
  ...
  static <V> int sum(int (V) f, V... values) {
    int sum=0;
    for(V value:values) {
      sum+=f.invoke(value);
    }
    return sum;
  }
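
The f and sum examples above can be approximated in compilable Java by replacing each function type with a small interface. This is only a sketch: IntToInt and ToInt are hypothetical interfaces standing in for int(int) and int(V), and anonymous classes stand in for the closures.

```java
class FunctionTypeDemo {
    // Hypothetical stand-in for the function type int(int).
    interface IntToInt {
        int invoke(int x);
    }

    // Hypothetical stand-in for the function type int(V).
    interface ToInt<V> {
        int invoke(V value);
    }

    // Mirrors: static int f(int value, int(int) closure)
    static int f(int value, IntToInt closure) {
        return closure.invoke(value);
    }

    // Mirrors: static <V> int sum(int (V) f, V... values)
    @SafeVarargs
    static <V> int sum(ToInt<V> f, V... values) {
        int sum = 0;
        for (V value : values) {
            sum += f.invoke(value);
        }
        return sum;
    }

    public static void main(String[] args) {
        // Mirrors: f(2, |int x| x+2)
        int result = f(2, new IntToInt() {
            @Override
            public int invoke(int x) {
                return x + 2;
            }
        });
        System.out.println(result);

        // Mirrors: sum(|String s| s.length(), "toto", "tutu")
        int lengthSum = sum(new ToInt<String>() {
            @Override
            public int invoke(String s) {
                return s.length();
            }
        }, "toto", "tutu");
        System.out.println(lengthSum);
    }
}
```

With structural function types, neither interface would need to be declared; the type int(int) would be written directly in the signature of f.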

Closure block

A closure block is a block of instructions
that can be passed to a method
in order to perform control abstraction.


A closure block is declared like this:

  expr = method_call closure_parameters? block_body

Example:

   Reader reader=...
   with(reader) {
     ...
   }
   ...
   public static void with(Closeable closeable) f() {
     try {
       f.invoke();
     } finally {
       closeable.close();
     }
   }
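
As a point of comparison, the same control abstraction can be sketched in plain Java by passing the body as an extra argument. The interface Block and the class WithDemo are hypothetical names; unlike a real closure block, the body here is an ordinary anonymous class, so it cannot return, break or continue out of the enclosing method.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.StringReader;

class WithDemo {
    // Hypothetical stand-in for the closure block passed to with().
    interface Block {
        void invoke() throws IOException;
    }

    // Runs the block, then closes the resource whether or not the block threw.
    static void with(Closeable closeable, Block block) throws IOException {
        try {
            block.invoke();
        } finally {
            closeable.close();
        }
    }

    public static void main(String[] args) throws IOException {
        final StringReader reader = new StringReader("toto");
        final int[] first = new int[1];
        with(reader, new Block() {
            @Override
            public void invoke() throws IOException {
                first[0] = reader.read();
            }
        });
        System.out.println((char) first[0]);
    }
}
```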

This syntax declares the closure block type
after the method parameters (like in Scala).
The syntax has to be different from that of a method that takes
a function type as parameter because:

  • a closure block lets you use return, break and continue.
  • a method that takes a closure block
    provides exception transparency automatically.

  • a closure block type is not a real type;
    it cannot appear anywhere else.

A little more formally, methods that take a closure block
are declared like this:

  method = method_decl closure_block_decl* method_body

  closure_block_decl = name '(' parameters ')'

The closure block return type is always 'void',
so the syntax omits it.
The variable holding the closure block cannot be used
as an lvalue.


Examples

  static void forEach(int... nums) block(int) {
    for (int n: nums) {
      block.invoke(n);
    }
  }
  ...
  int sum = 0;
  forEach(2,3,4) |int n| {
    sum += n;
  }
  ...
  static boolean hasNegativeValue(int... nums) {
    forEach(nums) |int n| {
      if (n<0)
        return true;
    }
    return false;
  }
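
To show what the closure block buys you, here is a sketch of the same two uses of forEach in plain Java, with a hypothetical IntBlock interface standing in for block(int). Two workarounds appear that the proposed syntax would make unnecessary: the captured local has to be final, and the body cannot return from the enclosing method.

```java
class ForEachDemo {
    // Hypothetical stand-in for the closure block type block(int).
    interface IntBlock {
        void invoke(int n);
    }

    static void forEach(IntBlock block, int... nums) {
        for (int n : nums) {
            block.invoke(n);
        }
    }

    static int sum(int... nums) {
        // A local captured by an anonymous class must be final, so a
        // one-element array stands in for the mutable variable "sum".
        final int[] sum = { 0 };
        forEach(new IntBlock() {
            @Override
            public void invoke(int n) {
                sum[0] += n;
            }
        }, nums);
        return sum[0];
    }

    static boolean hasNegativeValue(int... nums) {
        // invoke() cannot return from hasNegativeValue(), so the result
        // is recorded in a flag and returned after the loop has finished.
        final boolean[] found = { false };
        forEach(new IntBlock() {
            @Override
            public void invoke(int n) {
                if (n < 0) {
                    found[0] = true;
                }
            }
        }, nums);
        return found[0];
    }

    public static void main(String[] args) {
        System.out.println(sum(2, 3, 4));
        System.out.println(hasNegativeValue(1, -2, 3));
    }
}
```

Note that this emulation always visits every element, whereas the closure block version stops at the first negative value thanks to the non-local return.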

I don't think loop abstraction is worth a dedicated syntax
that everyone would then have to learn.

  void forEachEntry(Map<K,V> map) block(K,V) {
    for (Map.Entry<K,V> entry : map.entrySet()) {
      block.invoke(entry.getKey(), entry.getValue());
    }
  }
  ...
  Map<String, String> map = ...
  forEachEntry(map) |String first, String second| {
    ...
  }

Conversion between function type and block type

It's possible to pass a function type as the argument
of a method that declares a closure block type.
The opposite is not true, because a closure block
can perform non-local transfers, but you can use a cast.

  void invokeInEDT() closure() {
    void() f=closure;              // illegal conversion
    final void() f=(void())closure; // ok

    EventQueue.invokeAndWait(new Runnable() {
      public void run() {
        f.invoke();
      }
    });
  }

This entry is already too long, so I'll stop here.

I look forward to your comments.


Rémi


Comments

Still confused by this whole mess. I think adding new confusing syntax to an established existing language is setting oneself up for trouble. In each example so far, I'm still not sure which part is an argument, a return value, an expression/block of code, etc. If things are done one way in basic Java and another in the new closure-based Java, that seems counterproductive (and a learning/teaching nightmare).

> This closure mess is starting to make Generics look like a well thought out and cleanly implemented idea.
:)
Rémi

I think that for Java, the FCM proposal has the best syntax: "#(Args args) {/* statements, ... */}", and you have to say return to get your value out (just from the local anonymous method). Though I also recommend "#(Args args) expression" for simple cases. That '#' thing works nicely as an anonymous function where '#method(Type)' works as a method reference instead (in JavaDoc style). BGGA is way far off from the quality of FCM, in my opinion.

You use, but don't define, the syntax for function types. I don't know what name lookup rules you have in mind that would enable using a method name as a closure.

Not that I like exactly all the details of FCM either. We all have different opinions in Java Land.

I disagree that a single expression is enough.

I was wondering about your "expression closure". Can it have more than one line? Also, while "void" is not a valid type, "Void" is. So the answer to your Gotcha might be |Void|, but you might want to think about some sort of abbreviation for that, something like |.| or |:|.

It took me a few seconds to figure out the Closure block examples. To me it seems that if you are going to use | as the closure parameter list delimiter, then you should use it everywhere you list the parameters for a closure:

  static int f(int value, int|int| closure) { ..... }
  static void forEach(int... nums) block|int| { ..... }

and

  void forEachEntry(Map map) block|K,V| { ..... }

This way it looks less like the method parameters. How about a colon between the block definition and the method params?

  void forEachEntry(Map map) : block|K,V| { ..... }
  forEachEntry(map.entrySet()) : |String first, String second| { ..... }

I still prefer FCM because it looks so much like regular Java. I think your syntax, though, is better than ==> and =>.

This post feels like you are trying to change the syntax from Mars (which is a good thing) to a syntax from Venus (which is not...). In my opinion, the FCM syntax is clearly much better than all the others that have been proposed. It is already familiar to all developers (through the javadoc syntax). It seems that some of your examples do use FCM syntax in some places, and looking at the few examples you posted just makes me think: eeewww... inconsistent....

Neal, I've updated this entry to define function types.
@aberrant, a closure is an expression. Also, I don't want to use '|' everywhere. I only want to use it when defining a closure, not when defining a function type or a closure block type. My point is that if you reuse the same symbol to define a closure and a function type, beginners will become confused.

What about the following syntax?

  BGGA:     {int, int => int} method = {int x, int y => x + y};
  Proposed: #[int : int, int] method = #[ : int x, int y] {x + y};

No unexpected use of {} as it clearly bounds statements or expressions. The order of declaration matches conventional method declaration and still supports declaring a throws clause. The return type can be allowed to be inferred (as indicated) by omitting the return type. I prefer to keep the non-local return support and multiple statement/expression support (and just deal with the effort of learning the new convention) so we can easily still get control abstraction. This is a quick idea - I'd want to see it in some larger examples to see how well it reads amongst other Java code.

@npiquet, what I wanted was to propose a new syntax for BGGA, not different semantics. The method reference '#', first described in FCM, was included in the BGGA prototype. That's why I use it here.

My C3S proposal (http://www.artima.com/weblogs/viewpost.jsp?thread=182412) uses:

  Method2<Integer, Integer, Integer> m = method(x, y) { x + y };

The reason for using the keyword method is that it is more Java-like. Java uses unabbreviated keywords, not symbols, e.g. extends rather than :. Also note the type inference, and that method creates an instance of an anonymous inner class that, in the example above, implements the interface Method2. This would be a small change to Java: some generic Method interfaces and a short syntax for anonymous inner class instance creation.

I still have yet to see anyone try to argue against FCM syntax for anonymous methods. I've either heard people say they like it, or they ignore it. (Not that I've read everything everywhere in detail.)

I don't think the anonymous inner method syntax is bad (I dislike the method type declaration though (nesting parentheses are distracting)). The anonymous inner method syntax in FCM avoids the nested parentheses ugliness of the method type declaration since the return type is inferred. I wonder if there are edge cases where this inference might occasionally need some explicit casting in the block to match the expected return type.

I would still like to see restricted closures for non-local return support, and this could be achieved easily enough with ## instead of # and a non-local return keyword or an extension to the current return keyword syntax. I prefer making the closure syntax more visible though - they do add a layer of complexity to understanding a piece of code that you wouldn't want to overlook (my only concern with control abstraction).

Examples... hopefully I have these all correct.

BGGA:

  public {T => U} converter({=> T} a, {=> U} b, {T => U} c) {
    return {T t => a.invoke().equals(t) ? b.invoke() : c.invoke(t)};
  }

FCM:

  public #(U(T)) converter(#(T()) a, #(U()) b, #(U(T)) c) {
    return #(T t) { return a.invoke().equals(t) ? b.invoke() : c.invoke(t); };
  }

Proposed (with a minor refinement):

  public [U # T] converter([T #] a, [U #] b, [U # T] c) {
    return [# T t] {a.invoke().equals(t) ? b.invoke() : c.invoke(t)};
  }

@tompalmer, re. FCM syntax: yes, it is clearer than BGGA syntax, but it still uses a symbol where an unabbreviated keyword is more Java-like.

@talden, your examples are all presumably static methods and should have generic arguments, e.g. BGGA:

  public static <U, T> {T => U} converter({=> T} a, {=> U} b, {T => U} c) {
    return {T t => a.invoke().equals(t) ? b.invoke() : c.invoke(t)};
  }

In C3S your example would be:

  public static <U, T> Method1<U, T> converter(Method0<T> a, Method0<U> b, Method1<U, T> c) {
    method(t) {a.call().equals(t) ? b.call() : c.call(t)}
  }

Examples like these show a disadvantage of using general arguments, {T => U} in BGGA and Method1<U, T> in C3S: you cannot document intent with names. One of the great advantages of named, as opposed to structural, typing is that you have a place to document what things do. E.g. I think this is clearer:

  public static <OUT, IN> Converter<OUT, IN> conditionalConverter(
      Comparatee<IN> comparatee, Converter<OUT, IN> equal, Converter<OUT, IN> notEqual ) {
    method( convertee ) {
      comparatee.call().equals( convertee ) ? equal.call( convertee ) : notEqual.call( convertee )
    }
  }

You get a place to document how Converter and Comparatee. C3S and CICE support naming because they use inner classes rather than closures and do not need a second typing mechanism added to the language. NB I slightly changed the example above because I could not think of a use case for your example - perhaps meaningful names would have helped me :)

The danger with designing the language on blogs is that it is great for unrealistically short examples and a maintenance nightmare in practice. I think we will really be losing something if we go heavily down the structurally typed path. It is the same reason why you don't call all your variables i1, i2, etc. if they are int: there is more to a program than its types!

This closure mess is starting to make Generics look like a well thought out and cleanly implemented idea. New symbols like -> and => are an eyesore and the intellectual weight they add just doesn't justify the gains. Java isn't other languages, and a terse mix of symbol soup is not the direction Java needs to go. I'm far more concerned with maintaining code than saving a few keystrokes writing it. I really hope this doesn't become another Einstein solution to an Elvis problem, of the kind Java is becoming infamous for.