Non-local returns and compound statements

Posted on January 31, 2011

Smalltalk uses closures all over the place. They’re how control structures are built up, for starters. They replace what compound statements do in other languages.

So let’s look at a simple function in C:

  int foo(int i) {
    if (i < 2) {
      return 1;
    } else {
      return 2;
    }
  }

When you invoke foo(1) you evaluate “1 < 2” (hopefully to true) and then execute the first compound statement. So let’s look at a slightly busier version of that compound statement:

  { doSomething(i);
    return 1; }

Looks a bit like a closure closing over i, doesn’t it? Hm, return looks a bit odd there. It doesn’t mean “return from this compound statement” but rather “return from the function that contains this compound statement”. Let’s reword that last bit slightly: “return from the function that executes this compound statement.” The literature refers to this kind of thing as a “non-local return”.

Let’s turn those {}s into []s, and return into ^:

  int foo(int i) [
    if (i < 2) [
      ^ 1;
    ] else [
      ^ 2;
    ]
  ]

Let’s pretend next that if works as it does in the untyped lambda calculus: evaluating a conditional yields a function that takes two parameterless functions (thunks) as parameters. If the conditional evaluates to “true”, that function evaluates the first thunk; otherwise it evaluates the second thunk. Let’s call the conditional function ifTrueIfFalse.

Since we’re talking about the untyped lambda calculus, let’s throw out the manifest typing:

  foo(i) [
    ifTrueIfFalse((i < 2),
                  [ doSomething(i); ^ 1; ],
                  [ doSomethingElse(i); ^ 2; ]);
  ]
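
For comparison, here is a rough sketch of the same thunk-taking conditional in real C. The names thunk, if_true_if_false, one and two are made up for illustration, the doSomething calls are left out, and because C has no closures each thunk receives an explicit environment pointer instead of closing over i. Notice that return inside a thunk only leaves the thunk, so foo has to hand the value back itself. That gap is what ^ fills.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A thunk: no "real" parameters, only the environment it would have
   closed over if C had closures. */
typedef int (*thunk)(void *env);

/* Hypothetical helper: evaluates exactly one of the two thunks. */
static int if_true_if_false(bool condition, thunk when_true,
                            thunk when_false, void *env) {
    assert(when_true != NULL && when_false != NULL);
    return condition ? when_true(env) : when_false(env);
}

/* A plain C return: it leaves the thunk, not the function that called it. */
static int one(void *env) { (void)env; return 1; }
static int two(void *env) { (void)env; return 2; }

int foo(int i) {
    /* foo must pass the thunk's value along explicitly. */
    return if_true_if_false(i < 2, one, two, &i);
}
```

With these definitions foo(1) evaluates to 1 and foo(2) evaluates to 2.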

Let’s attach this function to a namespace. We’ll call this namespace “self”, and fully qualify everything. doSomething() and doSomethingElse() belong to this namespace. ifTrueIfFalse() belongs to the conditional namespace (say, Boolean):

  foo(i) [
    Boolean ifTrueIfFalse((i < 2),
                          [ self doSomething(i); ^ 1; ],
                          [ self doSomethingElse(i); ^ 2; ]);
  ]

Let’s use keyword parameters, rather than positional ones. We don’t need ()s because we know how many parameters each function takes. One extra trick: since conditionals evaluate to a Boolean, we might as well put the first parameter (the conditional itself) before the function call, similar to a C# extension method:

  foo: i
    (i < 2)
      ifTrue: [
        self doSomething: i.
        ^ 1]
      ifFalse: [
        self doSomethingElse: i.
        ^ 2]

We’ve just converted a C function into a Smalltalk method. So what does “^ someValue” mean?

Let’s construct a class Foo. It has a method #bar: that looks like this:

  bar: aBlock
    Transcript showln: '#bar started'.
    aBlock value.
    Transcript showln: '#bar finished'.

(Transcript is much like stdout. It’s a window in Squeak to which you send text. #showln: corresponds to println().)

Let’s see what happens.

  | o |
  o := 'some string'.
  Foo new bar: [o].

Transcript shows:

  #bar started
  #bar finished

pretty much as we expected. Next we try a block that explicitly returns something [1]:

  | o |
  o := 'some string'.
  Foo new bar: [^ o].

Transcript shows:

  #bar started

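Oops! #bar: never finished. That “^ o” doesn’t mean “return o from the block” but “return o from the method in which the block was written”. Here the block was written in the snippet we evaluated, so the return unwinds straight through #bar:, which invoked the block and never gets to print its second line.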
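
To make the jump concrete, here is a rough C analogy using setjmp/longjmp. It is only a sketch under loud assumptions: struct block, block_value, show_line, run_example and the transcript buffer are all invented for illustration, and longjmp is a crude stand-in for what a Smalltalk VM does when it unwinds to a block’s home context.

```c
#include <assert.h>
#include <setjmp.h>
#include <string.h>

/* Hypothetical "block": the value it captured plus the place its ^ returns to. */
struct block {
    jmp_buf home;          /* context in which the block was written */
    const char *captured;  /* stands in for the variable o */
    const char *result;    /* value carried by the non-local return */
};

char transcript[64];       /* stands in for the Transcript window */

static void show_line(const char *s) {
    strncat(transcript, s, sizeof transcript - strlen(transcript) - 1);
    strncat(transcript, "\n", sizeof transcript - strlen(transcript) - 1);
}

/* The body of [^ o]: record the value, then jump back to the home context. */
static void block_value(struct block *b) {
    b->result = b->captured;
    longjmp(b->home, 1);
}

/* Mirrors the #bar: method. */
static void bar(struct block *b) {
    show_line("#bar started");
    block_value(b);
    show_line("#bar finished");  /* never reached */
}

/* Plays the role of the snippet that writes the block and calls #bar:. */
const char *run_example(void) {
    static struct block b;  /* static, so its fields are well defined after longjmp */
    assert(sizeof transcript > 0);
    b.captured = "some string";
    b.result = NULL;
    transcript[0] = '\0';
    if (setjmp(b.home) == 0) {
        bar(&b);
        return NULL;        /* only reached if the block returned normally */
    }
    return b.result;        /* reached via the block's longjmp */
}
```

Calling run_example() hands back the captured string, and transcript holds only the “#bar started” line, matching what the Transcript showed.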

And so to the crux of the matter: blocks and ^ function much like {}s and return do in C. That probably explains why I found them so natural to use (with my Delphi background), and why someone coming from a Lisp background might boggle at ^ being a non-local return.

[1] All blocks return a value, namely the value of the last statement executed, just like LAMBDA in Common Lisp. A method with no explicit return implicitly returns self.

