Cliff Hacks Things.

Sunday, February 12, 2006

Variable-arity parameterized types

Language geek that I am, I've been reading up on all sorts of type systems in the past few years.

Mostly, I've been interested in ways to statically typecheck dynamic OO languages, like Self — but I've also been interested in how different OO languages implement parameterized types.

I've noticed a common problem: most languages cannot even describe the types of their own methods. (Strongtalk makes an exception for typing Smalltalk-style blocks, but it's not a general solution.)

Before tackling methods, let's take a reasonably common case. We want a class called Tuple, instances of which contain a fixed number of objects of different, specific, types. So, the base Tuple class could be specialized to represent key-value pairs in a map, or (*gag*) multiple return values from a function. (The astute reader sees another obvious use, but we'll get to that shortly.)

How could we define this Tuple in a generic way? In C++, Java, Eiffel, and most other languages (possibly including Strongtalk, though I'd love to be proven wrong), you can't. The closest you can come is to define a different Tuple class for each length — say, Tuple1<A>, Tuple2<A, B>, Tuple3<A, B, C>, etc. I don't think I'm the only one who looks at this solution and says, "Ewww."

Having learned and rejected Python shortly before I started work on Mongoose, tuples struck me as a useful abstraction, and I wanted Mongoose's type system to describe them. Here's a pseudocode Tuple in pseudo-Mongoose. (Brief syntax note: a generic class is applied to parameters as Class(Parameters), just as a function is applied as function(parameters).)

class Tuple(T+) {
method get(index : Integer) : T[index] { ... }
method set(index : Integer, element : T[index]) { ... }
method positionType(index : Integer) : Class(T[index]) { T[index] }
}


Tuple's formal parameter T+ indicates that the class can be specialized on 1..n classes. The value of T is available as an ordered list, both for typechecking at development time, and for access by specialized subclasses at runtime (as shown in the body of positionType). (Implementation of a variable number of slots to store the elements is hairier, and I've omitted it.)

I went to all this nasty work so that Mongoose could statically type its closures and methods. We wind up with something like the following for the Block object (equivalent to a lambda function with environment closure):

class Block(A*, R) {
method argumentType(position : Integer) : Class(A[position]) { A[position] }
method returnType() : Class(R) { R }
method invoke(parameters : A) : R { ... }
}


The formal parameter A* specifies zero or more classes. These variable-arity parameters are bound to Tuples at runtime (and handled through trickery at compile time). For example, for the type application Block(Integer, Integer, String), A within an instance would have the type Tuple(Integer, Integer).

So, say you have an anonymous block like the following, which takes an Integer and tells you if it's bigger than some constant.


{ x : Integer | x > 42 }


Blocks like this are frequently used in Mongoose and Smalltalk to filter collections and the like. I've typed the block's parameter explicitly here; normally, this can be inferred from context. Integer's published contract — sorry, Mr. Meyer — specification for > says it returns a Boolean, so the return type of the block can be inferred.

So, the type of this expression is Block(Integer, Boolean). If anyone were to inspect its class, they'd find that the return type is Boolean and the parameter list is «Integer» (using the Tuple literal syntax that Americans can't type).

The real key here is that this isn't a special feature for standard library classes like Block — this is a generic mechanism (pun intended) applicable to user classes.

Now, after I designed this section of Mongoose, I learned Dylan. Dylan's ludicrously expressive type system — particularly the notion of restricted types, like "Integers between 2 and 12" — really impressed me. I'm looking at ways to integrate this flexibility into Mongoose's parameterized types, preferably in an elegant fashion. (Fortunately, like C++, the parameter bindings are not necessarily restricted to classes, and are available at runtime.)

Edit in response to comments: yes, restricted types would help me constrain the type-array references in that Mongoose code up there at compile time. This is why I want them (among many other reasons).

0 Comments:

Post a Comment

<< Home