Cliff Hacks Things.

Saturday, May 27, 2006

The Mongoose syntax evolves, with a comparison to Java

I'm tinkering with a new syntax for Mongoose, one that incorporates the type system I've written about in the past.

As an example, let's look at a simple Java class, and how we would implement the same class in Mongoose.

This class is called Filter, and comes from some dataflow-style processing code of mine. Basically, this code constructs a pipeline of producers and consumers, each of which does some processing or conversion of the data. Producers will finish when they have completed processing and will produce no further output. This finished status propagates forward through the other stages, and is used at the terminal stage(s) to trigger report generation or aggregation.

The Filter can sit between some number of Producers and some number of Consumers. It tests its input, and if it matches some condition, passes it on.

First, the groundwork. These interfaces implement the Producer/Consumer pattern, with some additional signalling for when processing at each stage has completed.

interface Producer<E> {
    void addConsumer(Consumer<E> consumer);
    void removeConsumer(Consumer<E> consumer);
    boolean isFinished();
}

interface Consumer<E> {
    void consume(Producer<E> source, E input);
    void producerFinished(Producer<E> producer);
}

Pretty straightforward. Now, I have a lot of different implementations of Producer, some of which need to inherit from existing classes (generally for legacy reasons). Thus, I package up the basic Producer functionality in a ProducerSupport class, to which Producers can delegate most operations. It has two utility methods, emit and finish, that the owning Producer can call to control its operation.


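import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;

class ProducerSupport<E> implements Producer<E> {
    final Producer<E> owner;
    // Copy-on-write, so consumers can be added or removed while emit() iterates.
    final Set<Consumer<E>> consumers = new CopyOnWriteArraySet<Consumer<E>>();
    volatile boolean finished = false;

    ProducerSupport(Producer<E> owner) { this.owner = owner; }

    // Methods that implement an interface must be declared public.
    public void addConsumer(Consumer<E> consumer) { consumers.add(consumer); }
    public void removeConsumer(Consumer<E> consumer) { consumers.remove(consumer); }
    public boolean isFinished() { return finished; }

    // Hands one value to every registered consumer, naming the owner as its source.
    void emit(E output) {
        for (Consumer<E> consumer : consumers) consumer.consume(owner, output);
    }

    void finish() {
        finished = true;
    }
}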


Now, the Filter class. I've cut some corners in this example — for example, it supports only one producer — but bear with me.


class Filter<E> implements Consumer<E>, Producer<E> {
    final ProducerSupport<E> ps = new ProducerSupport<E>(this);
    final Predicate<E> predicate;

    Filter(Predicate<E> predicate) {
        this.predicate = predicate;
    }

    // Pass the input downstream only if it matches the predicate.
    public void consume(Producer<E> source, E input) {
        if (predicate.matches(input)) ps.emit(input);
    }

    public void producerFinished(Producer<E> producer) {
        ps.finish();
    }

    // Boilerplate: the Producer methods simply delegate to ps.
    public void addConsumer(Consumer<E> consumer) { ps.addConsumer(consumer); }
    public void removeConsumer(Consumer<E> consumer) { ps.removeConsumer(consumer); }
    public boolean isFinished() { return ps.isFinished(); }

    static interface Predicate<E> {
        boolean matches(E input);
    }
}


And we're done. The good news is that most modern IDEs will generate delegate methods like those above automatically.

To use Filter, one would write code like the following:

Filter<Foo> filter = new Filter<Foo>(
    new Filter.Predicate<Foo>() {
        public boolean matches(Foo input) {
            return input.name.equals("bob");
        }
    });
previousStage.addConsumer(filter);
filter.addConsumer(nextStage);
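
To see the pieces working together, here is a minimal, self-contained sketch of a three-stage pipeline. ListSource, CollectingSink, PipelineDemo, and the Foo type with its name field are hypothetical stand-ins (the post doesn't show them), and the interfaces are repeated so the block compiles on its own. Unlike the ProducerSupport listing above, finish() here also calls producerFinished on each consumer; that is an assumption about how the finished status propagates, not something the original code shows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;

interface Producer<E> {
    void addConsumer(Consumer<E> consumer);
    void removeConsumer(Consumer<E> consumer);
    boolean isFinished();
}

interface Consumer<E> {
    void consume(Producer<E> source, E input);
    void producerFinished(Producer<E> producer);
}

class ProducerSupport<E> implements Producer<E> {
    private final Producer<E> owner;
    private final Set<Consumer<E>> consumers = new CopyOnWriteArraySet<Consumer<E>>();
    private volatile boolean finished = false;

    ProducerSupport(Producer<E> owner) { this.owner = owner; }

    public void addConsumer(Consumer<E> c) { consumers.add(c); }
    public void removeConsumer(Consumer<E> c) { consumers.remove(c); }
    public boolean isFinished() { return finished; }

    void emit(E output) {
        for (Consumer<E> c : consumers) c.consume(owner, output);
    }

    // Assumption (not in the original listing): notify downstream stages.
    void finish() {
        finished = true;
        for (Consumer<E> c : consumers) c.producerFinished(owner);
    }
}

class Filter<E> implements Consumer<E>, Producer<E> {
    interface Predicate<E> { boolean matches(E input); }

    private final ProducerSupport<E> ps = new ProducerSupport<E>(this);
    private final Predicate<E> predicate;

    Filter(Predicate<E> predicate) { this.predicate = predicate; }

    public void consume(Producer<E> source, E input) {
        if (predicate.matches(input)) ps.emit(input);
    }
    public void producerFinished(Producer<E> producer) { ps.finish(); }
    public void addConsumer(Consumer<E> c) { ps.addConsumer(c); }
    public void removeConsumer(Consumer<E> c) { ps.removeConsumer(c); }
    public boolean isFinished() { return ps.isFinished(); }
}

// Hypothetical element type.
class Foo {
    final String name;
    Foo(String name) { this.name = name; }
}

// Hypothetical first stage: emits a fixed list, then finishes.
class ListSource<E> implements Producer<E> {
    private final ProducerSupport<E> ps = new ProducerSupport<E>(this);
    public void addConsumer(Consumer<E> c) { ps.addConsumer(c); }
    public void removeConsumer(Consumer<E> c) { ps.removeConsumer(c); }
    public boolean isFinished() { return ps.isFinished(); }

    void run(List<E> items) {
        for (E item : items) ps.emit(item);
        ps.finish();
    }
}

// Hypothetical terminal stage: records what arrives.
class CollectingSink<E> implements Consumer<E> {
    final List<E> received = new ArrayList<E>();
    boolean upstreamFinished = false;
    public void consume(Producer<E> source, E input) { received.add(input); }
    public void producerFinished(Producer<E> producer) { upstreamFinished = true; }
}

class PipelineDemo {
    static CollectingSink<Foo> run() {
        ListSource<Foo> source = new ListSource<Foo>();
        Filter<Foo> filter = new Filter<Foo>(new Filter.Predicate<Foo>() {
            public boolean matches(Foo input) { return input.name.equals("bob"); }
        });
        CollectingSink<Foo> sink = new CollectingSink<Foo>();
        source.addConsumer(filter);
        filter.addConsumer(sink);
        source.run(Arrays.asList(new Foo("alice"), new Foo("bob"), new Foo("carol")));
        return sink;
    }
}
```

Calling PipelineDemo.run() pushes three values through the filter; only the one named "bob" reaches the sink, and the sink is told when the source has finished.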


Now, the Mongoose version. I'll go through two iterations, the first closer to the Java version.

A couple of quick notes on this syntax:

  • For generic types, the formal parameters and invocation type parameters are given in [square brackets].

  • Type annotations are given in <angle brackets>. They follow the argument or slot that they describe. Methods may have an additional annotation describing the return type; if present, it follows the argument list and is separated by the return symbol ^.



If this reminds you of Strongtalk, you're a geek.

Yes, there are two types of brackets in these definitions. Square brackets surround blocks of code; curly brackets surround structural definitions. This may seem annoying, but there's actually a good reason for this (which unfortunately doesn't come up in this example).


protocol Producer[E] {
    method addConsumer: consumer <Consumer[E]>.
    method removeConsumer: consumer <Consumer[E]>.
    method finished ^ <Boolean>.
}

protocol Consumer[E] {
    method consume: input <E> from: source <Producer[E]>.
    method producerFinished: producer <Producer[E]>.
}


No big difference here. Moving along:


class ProducerSupport[E] {
    aspect class:
        constructor for: owner [
            _owner := owner.
        ]

    aspect instance:
        implements Producer[E].

        slot _owner <Producer[E]>;
            constant.
        slot _consumers := Set[Consumer[E]] new;
            constant.
        slot _finished <Boolean> := false.

        method addConsumer: consumer <Consumer[E]> [
            _consumers include: consumer.
        ]

        method removeConsumer: consumer <Consumer[E]> [
            _consumers remove: consumer.
        ]

        method finished [ ^_finished ].

        method emit: output <E>;
            private
        [
            _consumers each: [ :c | c consume: output from: _owner ].
        ]

        method finish;
            private
        [
            _finished := true.
        ]
}


The interesting bits, starting from the top:
Behavior and state in Mongoose classes are separated into two aspects, class and instance. This is equivalent to the static/member divide in languages like Java or C++. The main difference, in Mongoose, is that the two can actually be defined separately, though I haven't done that here.

The method #for: is implemented as a constructor, which is analogous to Java and C++ again, but a bit more rigorously defined and flexible. A constructor, in Mongoose, is a method that is invoked on a class, but executes on an object. More importantly, a constructor is invoked like any other method on the class; callers don't know they're invoking a constructor, and you can change the implementation — for example, to return a singleton instance, or delegate to another class — without changing the callers.
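
Java's closest analogue is a static factory method. A minimal sketch, using a hypothetical Registry class that has nothing to do with the pipeline code: callers write Registry.create() either way, and the implementation can switch between building a fresh object and returning a shared one without touching them. The difference is that a Java caller must have been written against the factory from the start, since new can't be redirected.

```java
// Hypothetical class, used only to illustrate the idea.
class Registry {
    private static final Registry SHARED = new Registry();

    private Registry() { }

    // Callers cannot tell whether this allocates or reuses an instance.
    // Replacing "return SHARED;" with "return new Registry();" changes no call site.
    static Registry create() {
        return SHARED;
    }
}

class RegistryDemo {
    // True while create() hands back the shared instance.
    static boolean sameInstance() {
        Registry a = Registry.create();
        Registry b = Registry.create();
        return a == b;
    }
}
```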

Notice that the owner argument to #for: lacks a type annotation. This is an implementation shortcut; because the argument is being directly stuffed into the slot _owner, which has a type annotation, the type can be inferred. (Callers would see the type annotation when inspecting the class.)

The rest is pretty simple. The words constant and private following slot and method definitions (after the semicolon) are attributes. constant says that a slot, once initialized, will never change; this also allows you to omit the type annotation. As with final fields in Java, the slot must have an explicit initializer, or be set in a constructor. private says that a method is only available to other code within the class or subclasses thereof.

Now, to Filter:


class Filter[E] {
    aspect class:
        constructor using: predicate <Block[E, Boolean]> [
            _predicate := predicate.
        ]

    aspect instance:
        implements Producer[E] via _ps.
        implements Consumer[E].

        slot _ps := ProducerSupport[E] for: self;
            constant.
        slot _predicate <Block[E, Boolean]>;
            constant.

        method consume: input <E> from: source <Producer[E]> [
            (_predicate evaluate: input)
                ifTrue: [ _ps emit: input ].
        ]

        method producerFinished: producer <Producer[E]> [
            _ps finish.
        ]
}


Most of this you've seen above; we have a basic constructor and some instance behavior and state. However, a few points here merit further explanation.

The constructor takes the predicate, which performs the actual filtration. It's typed as a Block, which is a standard Mongoose class describing a block of code (as in Smalltalk).

Notice, in the instance aspect, that we're missing most of the code from the Java implementation. The key is the second line in the aspect:
implements Producer[E] via _ps.

This is called a via, and allows us to explicitly delegate all messages in the Producer protocol to another object — in this case, the object in the slot _ps. Variations on vias allow you to delegate only a portion of a protocol, or a few named messages, but those aren't applicable here.

In our case, the slot _ps is constant. Had it not been, we could switch out the object it holds, and any messages in the Producer protocol would automatically be forwarded to the new object.
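
Java has no via, but java.lang.reflect.Proxy can approximate forwarding a whole interface at run time. This is a rough sketch of the idea, not how Mongoose implements it: Via.forward, Greeter, and the AtomicReference standing in for a non-constant slot are all hypothetical names introduced for illustration.

```java
import java.lang.reflect.Proxy;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical protocol to forward.
interface Greeter {
    String greet(String name);
}

class Via {
    // Returns an object implementing 'protocol' whose every call is
    // forwarded to whatever object the slot holds at that moment.
    @SuppressWarnings("unchecked")
    static <T> T forward(Class<T> protocol, AtomicReference<T> slot) {
        return (T) Proxy.newProxyInstance(
            protocol.getClassLoader(),
            new Class<?>[] { protocol },
            (proxy, method, args) -> method.invoke(slot.get(), args));
    }
}

class ViaDemo {
    static String run() {
        AtomicReference<Greeter> slot =
            new AtomicReference<Greeter>(name -> "hello " + name);
        Greeter viaGreeter = Via.forward(Greeter.class, slot);

        String first = viaGreeter.greet("bob");
        slot.set(name -> "goodbye " + name);   // switch out the delegate
        String second = viaGreeter.greet("bob");
        return first + ", " + second;
    }
}
```

The caller keeps the same viaGreeter reference throughout; only the contents of the slot change. Unlike a via, this is resolved reflectively at run time rather than by the compiler.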

So, we've reimplemented the Java version without some tedious, boilerplate code. How would we use such a class?


filter := Filter[Foo] using: [ :foo | foo name = "bob" ].
previousStage addConsumer: filter.
filter addConsumer: nextStage.


Simply passing in a Block that performs our test is a lot simpler, not to mention more concise, than Java's anonymous class mechanism.
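
For comparison, Java versions from 8 onward accept a lambda wherever a single-method interface is expected, which narrows the gap considerably. A sketch under that assumption: SimpleFilter and Person are hypothetical, trimmed-down stand-ins (not the Filter class above), and java.util.function.Predicate replaces the hand-written Predicate interface.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical element type.
class Person {
    final String name;
    Person(String name) { this.name = name; }
}

// Hypothetical, simplified filter: applies the predicate to a list.
class SimpleFilter<E> {
    private final Predicate<E> predicate;

    SimpleFilter(Predicate<E> predicate) { this.predicate = predicate; }

    List<E> apply(List<E> inputs) {
        List<E> out = new ArrayList<E>();
        for (E input : inputs) {
            if (predicate.test(input)) out.add(input);
        }
        return out;
    }
}

class LambdaDemo {
    static List<Person> run() {
        // The lambda plays the role of the Mongoose block [ :foo | foo name = "bob" ].
        SimpleFilter<Person> filter =
            new SimpleFilter<Person>(p -> p.name.equals("bob"));
        return filter.apply(Arrays.asList(new Person("alice"), new Person("bob")));
    }
}
```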

So, we've eliminated a chunk of boilerplate code, and made the class easier to use. However, we can do better. Delegation is all well and good, but in this case, we're delegating to a single object that was specifically designed for delegation.

This pattern smells like Java. There's a better way to do this in Mongoose.


trait ProducerSupport[E] {
    implements Producer[E].

    slot _consumers := Set[Consumer[E]] new;
        constant.
    slot _finished <Boolean> := false.

    method addConsumer: consumer <Consumer[E]> [
        _consumers include: consumer.
    ]

    method removeConsumer: consumer <Consumer[E]> [
        _consumers remove: consumer.
    ]

    method finished [ ^_finished ].

    method emit: output <E>;
        private
    [
        _consumers each: [ :c | c consume: output from: self ].
    ]

    method finish;
        private
    [
        _finished := true.
    ]
}


ProducerSupport, in Java, was designed as a reusable unit to provide Producer behavior through delegation. Here, we've reimplemented ProducerSupport as a trait — a reusable chunk of state and behavior that classes can include. The code in ProducerSupport will run within the class that includes it, so we've eliminated the concept of "owner" and the accompanying code — now, the ProducerSupport is the owner, and vice versa.

Let's reimplement Filter using this new trait.

class Filter[E] {
    aspect class:
        constructor using: predicate <Block[E, Boolean]> [
            _predicate := predicate.
        ]

    aspect instance:
        includes ProducerSupport[E].
        implements Consumer[E].

        slot _predicate <Block[E, Boolean]>;
            constant.

        method consume: input <E> from: source <Producer[E]> [
            (_predicate evaluate: input)
                ifTrue: [ self emit: input ].
        ]

        method producerFinished: producer <Producer[E]> [
            self finish.
        ]
}


The changes from the previous version are the includes line, the missing _ps slot, and the emit and finish messages, which are now sent to self. By including the ProducerSupport trait, we've automatically gained all its behavior, so we no longer need to keep track of a ProducerSupport instance or delegate using the via. Filter also automatically implements the protocols exposed by its included traits; in this case, Producer.
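
Java has no traits, but interface default methods (Java 8 and later) give a rough approximation of the behavior half. A sketch under stated assumptions: interfaces cannot hold instance state, so the including class must supply it through a hypothetical consumers() accessor, and default methods are always public, so emit cannot be hidden the way the private attribute hides it in the trait. TraitProducer, TraitConsumer, EvenFilter, and TraitDemo are illustrative names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;

interface TraitConsumer<E> {
    void consume(TraitProducer<E> source, E input);
}

// Behavior lives in default methods; state must come from the including class.
interface TraitProducer<E> {
    Set<TraitConsumer<E>> consumers();   // hypothetical state hook

    default void addConsumer(TraitConsumer<E> c) { consumers().add(c); }
    default void removeConsumer(TraitConsumer<E> c) { consumers().remove(c); }

    // "this" is the including object, so no separate owner is needed.
    default void emit(E output) {
        for (TraitConsumer<E> c : consumers()) c.consume(this, output);
    }
}

// Hypothetical filter that passes even numbers only.
class EvenFilter implements TraitProducer<Integer>, TraitConsumer<Integer> {
    private final Set<TraitConsumer<Integer>> consumers =
        new CopyOnWriteArraySet<TraitConsumer<Integer>>();

    public Set<TraitConsumer<Integer>> consumers() { return consumers; }

    public void consume(TraitProducer<Integer> source, Integer input) {
        if (input % 2 == 0) emit(input);
    }
}

class TraitDemo {
    static List<Integer> run() {
        List<Integer> seen = new ArrayList<Integer>();
        EvenFilter filter = new EvenFilter();
        filter.addConsumer((source, input) -> seen.add(input));
        // No upstream stage in this sketch, so the source argument is null.
        for (int i = 1; i <= 4; i++) filter.consume(null, i);
        return seen;
    }
}
```

The state hook is the visible cost of the approximation: a Mongoose trait brings its slots along, while the Java class has to declare the set itself.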

Filter, at this point, has degenerated into a basic Adapter pattern, converting the language's Block construct for use in the producer-consumer architecture. The class is nearly trivial.


I've tried to demonstrate the utility of the following Mongoose features:

  • Linguistic support for delegation through via.

  • Blocks.

  • Traits.



This new syntax is a work in progress; all the concepts described here work in my current Mongoose implementation, but the compiler is evolving as I work. Any comments, suggestions, or questions are welcome.