What's so Taxing about Return?

Posted by cayhorstmann on April 16, 2007 at 1:36 PM PDT

The Dual Role of return

Many programming languages get along just fine without a
return statement. In Pascal, for example, the return value is
assigned to a dummy variable whose name equals the name of the
function.

function foo(arg: integer): integer;
begin
   (* compute return value *)
   foo := retval;
   (* maybe some more cleanup *)
end;

In Scheme, the body of a function is a sequence of expressions, and the
value of the last expression is returned as the result of the function
call:

(define foo (lambda (arg) expr1 expr2 .... retvalExpr))

And, of course, in assembly, the return value is simply deposited in a
register :-)

movl retval, %eax

In Pascal and Scheme, you return to the caller when you reach the end
of the function. There is no equivalent to the “quick and
dirty”

if (somethingAbnormal) return null;
// more work in the normal case...

This shows that there are really two aspects of returning:

  • what to return
  • when to return
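
To make the two aspects concrete, here is a small sketch in plain Java. The class and method names are made up for illustration. The first method uses return both to pick the value and to leave early; the second, in the Pascal style, only decides what to return and always runs to the end.

```java
// Hypothetical example; the names are illustrative only.
class ReturnRoles {
    // "When" and "what" fused: return yields a value and exits early.
    static String firstWordEarlyExit(String line) {
        if (line == null || line.isEmpty()) return null; // quick and dirty
        int space = line.indexOf(' ');
        return space < 0 ? line : line.substring(0, space);
    }

    // Pascal style: assign to a result variable and fall through to one exit.
    static String firstWordSingleExit(String line) {
        String result = null;
        if (line != null && !line.isEmpty()) {
            int space = line.indexOf(' ');
            result = space < 0 ? line : line.substring(0, space);
        }
        return result; // the only exit; Java still needs the keyword here
    }
}
```

Both methods compute the same result; they differ only in whether the "when" is spelled out.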

Closures

A closure is a block of code, packaged up for execution at a later
point. When it executes, all the references to the surrounding code should
just work as if the code had executed in the defining scope.

Here is a typical example, using the BGGA 0.5 syntax (which, like all
proposal syntax, is highly subject to change):

public static void main(String[] args) {
    JFrame frame = new JFrame();
    JButton button = new JButton("Click me!");
    frame.add(button);
    int counter = 0;
    button.addActionListener({ ActionEvent e =>
        counter++;
        frame.setTitle("Clicked " + counter + " times."); });
    frame.pack();
    frame.setVisible(true);
}

When the button is clicked, the closure gets called, the counter
variable in the enclosing scope is updated, and the frame title is set to
reflect the click count.

Wait, there is a problem here. By the time the button is clicked,
main has terminated and the local variable counter is
dead and gone.

Actually, though, the closure will capture a reference to a new
int[1]
containing the counter, and of course, a reference to the
JFrame object.
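
Written out by hand with an anonymous inner class, the capture might look roughly like the sketch below. This is only an approximation of the idea, not a claim about what a BGGA compiler would emit. To keep it runnable without a GUI, a Runnable stands in for the ActionListener and a StringBuilder stands in for the frame title; both substitutions, and the names CounterCapture and makeClickHandler, are assumptions for the example.

```java
// Sketch only: a hand-written version of capturing a mutable local.
class CounterCapture {
    static Runnable makeClickHandler(final StringBuilder title) {
        // The local int moves into a one-element array on the heap,
        // so it outlives the method that declared it.
        final int[] counter = new int[1];
        return new Runnable() {
            public void run() {
                counter[0]++; // was: counter++
                title.setLength(0);
                title.append("Clicked ").append(counter[0]).append(" times.");
            }
        };
    }
}
```

The array reference itself is final, which is all an inner class needs; the cell it points to stays mutable.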

All this can be done with any of the various closure proposals, by
gussying up inner classes with the ability to capture non-final locals.

Unlike some other closure proposals, BGGA goes further and says that
the closures also need to capture the meaning of execution transfer
statements, i.e. break, continue, and return.
(What about throw? That's never statically typed, so we don't
expect to capture it.)

At first glance, this seems like an odd thing to do. When the action
listener executes a break statement, surely we don't want to go
back in time and revive the main method (at least not until the
proposal to add continuations to Java :-))

But it comes in handy for another use of closures: programmer-provided
control statements. Let's say I want to provide an easy way of iterating
over a matrix:

for each(int i, int j: matrix) { // look, ma, no matrix[i].length!
    if (matrix[i][j] == 0)
        continue;
    . . .
}

This actually means: Pass matrix and the closure

{ int i, int j =>
    if (matrix[i][j] == 0)
        continue;
    . . .
}

to the each method:

public static void each(int[][] a, { int, int => void } block) {
    int i = 0;
    int j = 0;
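    // Advance j within the row; wrap to the next row after the last column (assumes non-empty rows).
    for (; i < a.length; j = (j + 1 == a[i].length ? 0 : j + 1), i = (i + ((j == 0) ? 1 : 0))) {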
        block.invoke(i, j);
    }    
}

Of course, now continue should mean the right thing: continue
after the block, with the next iteration of the for statement.
(Sorry about the tortured logic in the for update; one must use
an assignment, increment, method call, or new expression there. I
suppose I could use a closure invocation { => if (j + 1 <
a[i].length) j++; else { i++; j = 0; }}.invoke()
...)
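
For comparison, here is a sketch of how the same abstraction can be approximated without closures, using an anonymous inner class. The interface IntPairBlock is a made-up stand-in for the function type { int, int => void }, and countNonZero is an invented caller. The inner class cannot say continue, so a plain return, which only leaves invoke, has to play that role.

```java
// Sketch with an anonymous inner class; not BGGA syntax.
class MatrixEach {
    // Hypothetical stand-in for the function type { int, int => void }.
    interface IntPairBlock {
        void invoke(int i, int j);
    }

    // Nested loops sidestep the tortured update logic; empty rows are skipped.
    static void each(int[][] a, IntPairBlock block) {
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < a[i].length; j++) {
                block.invoke(i, j);
            }
        }
    }

    static int countNonZero(final int[][] matrix) {
        final int[] count = new int[1]; // mutable cell, since locals must be final
        each(matrix, new IntPairBlock() {
            public void invoke(int i, int j) {
                if (matrix[i][j] == 0)
                    return;             // plays the role of "continue"
                count[0]++;
            }
        });
        return count[0];
    }
}
```

This works only because continue happens to coincide with leaving the block. A break or a return from the enclosing method has no such easy substitute.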

The Point of No Return?

Back to the topic of returns. A closure returns a value (if it has a
result type). For example,

{ int x, int y => Math.max(x, y) }

returns an integer, the max of its parameters. But if a closure contains
a return statement, that means to return from the enclosing
block. For example,

{ int x, int y => return Math.max(x, y); }

is a closure with return type void that, when invoked, causes
its caller to return the max of the parameters (or, presumably, if the
caller can't return an int, throw an exception).
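
To get a feel for what a return that escapes the block would mean, one can emulate it by hand with an exception. The sketch below is only an emulation in plain Java, not a description of how BGGA implements the transfer. The names ReturnSignal, IntPairBlock, each, and firstNegative are all invented for the example.

```java
// Emulation sketch: a "return" inside the block ends the enclosing method.
class NonLocalReturn {
    interface IntPairBlock {
        void invoke(int i, int j);
    }

    // Carries the value out of the block and up to the enclosing method.
    static class ReturnSignal extends RuntimeException {
        final int value;
        ReturnSignal(int value) { this.value = value; }
    }

    static void each(int[][] a, IntPairBlock block) {
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < a[i].length; j++)
                block.invoke(i, j);
    }

    // Returns the first negative element in row-major order, or 0 if none.
    static int firstNegative(final int[][] matrix) {
        try {
            each(matrix, new IntPairBlock() {
                public void invoke(int i, int j) {
                    if (matrix[i][j] < 0)
                        throw new ReturnSignal(matrix[i][j]); // "return matrix[i][j];"
                }
            });
        } catch (ReturnSignal signal) {
            return signal.value; // the enclosing method is the one that returns
        }
        return 0;
    }
}
```

If the block were invoked after firstNegative had already finished, nothing would be left to catch the signal, which is the situation the exception in the parenthetical above has to cover.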

Several commenters on my earlier blog post point to this issue as the
Achilles heel of the BGGA proposal.

More unhappily, the closure

{ int x, int y => Math.max(x, y); }

computes the max of its parameters, discards it, and returns no value.

When I heard about that, my gut reaction was fear...the fear of
students queuing up for my office hours.

Ultimately, the culprit is the dual nature of return: yielding
a value, and jumping to the end of the method code. In Pascal or Scheme,
none of this is an issue. These languages have no return (or
break or continue) to worry about.

Many Happy Returns

Let's try to throw some syntax at this. A BGGA closure body is a
sequence of statements followed by an optional expression. Maybe that's
too subtle. Let's make the return expression more prominent. Something
like

{ int x, int y => int : stats => Math.max(x, y) }

I already see the line outside my office getting shorter. (I suppose it
would also allow early returns from a closure, but I don't want to go
there...That's what got us in trouble in the first place.)

But I have to agree with Stephen Colebourne that there are two entirely
separate use cases here.

When one uses a closure for a control abstraction, the return type must
be void since the closure denotes a statement. And it is pretty
clear that return means to return from the enclosing scope.

When one uses a closure as a callback, to be invoked at a much later
time, does one ever want to capture the enclosing semantics of
return? I don't think so, but I will find out soon enough if I am
wrong...

It would make sense to differentiate these use cases.

In a control abstraction, the programmer doesn't provide an explicit
closure, but the compiler puts together a parameter list and a block.
Conversely, when passing a closure to a callback, the programmer does the
{ ... => ... } thing. So, we can tell them apart.

In the first case, the block can contain return,
break, continue, labeled break, etc. Pile it
on!

In the second case, none of them should be allowed. It's just a syntax
error. That should take care of the line outside my office. Students can
wrestle with the compiler—what gets them is code that compiles and
does the wrong thing.

This is almost the same as the RestrictedFunction interface,
except that you can capture non-final locals. It is also somewhat related
to the distinction in BGGA control abstractions. The for control
abstractions allow break and continue, whereas other
control abstractions don't.

If I understand the FCM/JCA proposal correctly, they have essentially
the same solution. But the meaning of return changes from one use
case to the other. I am not sure that's such a good idea. But again, it's
just syntax.

I am a total amateur at this, of course, as I and the world are sure to
find out from the blog comments in a few hours. But it seems to me that
after the tweaks that are sure to come, BGGA will differ very little from
FCM/JCA, except for the issue of method literals. (These may be nice to
have for other purposes. I'll warm up to them if someone can show me how
they solve my pet peeve, property boilerplate:
http://weblogs.java.net/blog/cayhorstmann/archive/2006/06/say_no_to_prope.html)