Cliff Hacks Things.

Tuesday, March 07, 2006

On Unicode.

I'm a big fan of Unicode.

Well, not Unicode specifically, but character sets that aren't annoyingly limited. Watching software at my last job fumble around with accented characters, let alone kanji, really clubbed this into my head.

Mongoose, since M1 in 2004, has supported Unicode throughout. I'm continuing this in the M2 implementation.

Mostly, it's for clarity. I consider ≠ to be a lot clearer than != or /= or .ne. or whatever other hacks other languages have used. Same with ≤ and the like. As of tonight, the current compiler and runtime support all this: Object understands the ≠ message and does the right thing, and so on. Unicode handles such characters quite well: there's a dedicated block, Mathematical Operators (U+2200 through U+22FF), for them. Any of these characters is now a legal Mongoose operator, from the compiler on up.
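
For the curious, the "is this character an operator?" test boils down to a range check on the code point. Here's a rough sketch in Python rather than Mongoose; the function and constant names are made up for illustration, and this is not the actual compiler code.

```python
# Hypothetical sketch: not Mongoose's real lexer, just the range check.
# The Unicode "Mathematical Operators" block spans U+2200 through U+22FF.
MATH_OPERATORS_FIRST = 0x2200
MATH_OPERATORS_LAST = 0x22FF


def is_math_operator(ch: str) -> bool:
    """Return True if ch is a single character inside U+2200..U+22FF."""
    if len(ch) != 1:
        return False
    return MATH_OPERATORS_FIRST <= ord(ch) <= MATH_OPERATORS_LAST
```

Both ≠ (U+2260) and ≤ (U+2264) land inside that range, while plain ASCII punctuation like ! does not.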

This gets a lot of grumbling from the old guard, but I'm not really open to grumbling on such matters. Any Mac user can type all these symbols directly from their keyboards, without having to enter hex codes or some nonsense. I've included message synonyms for the keyboard-impaired (!=, <=, etc.), but they're just that: synonyms for the real message.
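
The synonym idea is simple enough to sketch: the ASCII spelling is looked up and swapped for the real message before anything else happens. This is Python, not Mongoose, and the table and function names are invented for the example. Only != and <= are named above; the >= entry is my assumption about what "etc." covers.

```python
# Hypothetical sketch: ASCII spellings are aliases for the Unicode message.
# "!=" and "<=" come from the post; ">=" is an assumed extra entry.
SYNONYMS = {
    "!=": "≠",
    "<=": "≤",
    ">=": "≥",
}


def canonical_selector(selector: str) -> str:
    """Map an ASCII synonym to its Unicode message; pass others through."""
    return SYNONYMS.get(selector, selector)
```

With this shape, only the Unicode spelling ever needs a real implementation; the ASCII forms never exist as separate messages.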

I've been thinking, over the past few years, about ALGOL syntax. ALGOL originally defined a 'pretty' syntax (the publication language, iirc) that used subscripts for array access and the like. This was well outside the capability of machines at the time, of course, so the syntax that lived on in its descendants used A[2] instead of A₂.

I think we're getting to the point where such print syntaxes could be made official. A List might use any subscripted expression as an element accessor; a Number might use a superscripted expression as an exponent. As long as it's easy to enter (which, right now, it's not), I think this could be a win.

But that's because I'm insane.

Mongoose is unlikely to support any such thing, since it's the dreaded "orthogonal syntax feature" I'm avoiding. I like that the entire language syntax can be described in a paragraph, including every possible way a method might be invoked.

