Cliff Hacks Things.

Friday, January 29, 2010

LPCxpresso: nice hardware made useless by bad software

Remember back in the mid-90s, when many of us used Microchip PICs for embedded control? Remember having to use a shoddy closed-source compiler, which only ran on one operating system/architecture? Remember having to use a particular set of tools, instead of whatever tools you were most comfortable with?

Fast-forward to 2010, and you can re-experience all these limitations with the NXP LPCxpresso board! This evaluation board for the NXP LPC ARM chips is...well, I'll let them speak for themselves:
The LPCXpresso evaluation board is meant to be used only with the Code Red LPCXpresso, or Red Suite IDEs.

This runs only on Windows, of course. Yes, folks, it's like 1995 all over again. Consensus among reverse engineers I've contacted is that the programming/debugging interface is cryptographically secured, to keep you from using it the way you want. (Edit: I've decided there's not enough evidence to support this claim.)

Not that Digikey tells you this before you buy, of course! At the time of this writing, even the LPCxpresso web site makes no mention of this, unless you drill down into the PDF Getting Started Guide. Fortunately, competing parts from both Atmel and ST Microelectronics have support for GCC and cross-platform programmers and debuggers. I'll be using these in my next design and sending the LPCxpresso back to the factory.

Update: Someone claiming to be from Code Red commented below, noting that they do have GCC support, but didn't address the other points (cross-platform compatibility being the main one). My initial complaints of "no command line compiler" and "no GCC support" were based on other users' summaries, since (not having Windows) I couldn't test them myself. My bad; those claims have been removed.

If Code Red does indeed provide cross-platform support (or OSS tools that could become cross-platform) I'll update this post!

Update 2: They do not, though they suggest I could run Windows in a VM. Missing the point, guys.

To be clear, I'm not a complete noob to embedded ARM systems, or Cortex-M3. Once I decided not to return this proprietary piece of circuit (because I'm terribly stubborn), I tried putting together a toolchain based on CS2009q3, writing a linker script, soldering on a USB connector, implementing the proprietary NXP checksum algorithm, chopping off the debug circuitry that I paid for and would sure like to use, etc. It never quite worked right; I got consistent firmware corruption. My point is that I shouldn't have to do any of this stuff, or trawl through forums trying to decide which contradictory answer is correct. I'm currently rather enamored with the mbed eval board, which, unlike the LPCxpresso, explicitly supports Mac and Linux out of the box.
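
For what it's worth, the checksum itself is simple once decoded. As I read NXP's documentation, the boot ROM requires the first eight words of the vector table to sum to zero, with the correcting value stored in the reserved vector at offset 0x1C. Here's a minimal Java sketch under that assumption; it patches a raw little-endian binary image in place. (Treat the offsets as my reading of the manual, not gospel.)

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.file.Files;
import java.nio.file.Paths;

final class LpcChecksum {
    public static void main(String[] args) throws IOException {
        byte[] image = Files.readAllBytes(Paths.get(args[0]));
        ByteBuffer buf = ByteBuffer.wrap(image).order(ByteOrder.LITTLE_ENDIAN);
        int sum = 0;
        for (int i = 0; i < 7; i++) {
            sum += buf.getInt(i * 4);    // vector table entries 0 through 6
        }
        buf.putInt(7 * 4, -sum);         // two's complement correction at 0x1C
        Files.write(Paths.get(args[0]), image);
    }
}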


Tuesday, December 09, 2008

Linear-scan register allocator adapted

I've adapted the SSA Linear-Scan allocator, originally designed by Mössenböck and Pfeiffer for HotSpot, for my Propeller Java compiler.

It has some advantages:

  1. It's simple.

  2. It's fast.

  3. It does a pretty good job.

  4. It can deal with destructive arithmetic ops (as in the Propeller) and pre-colored registers (as in calling conventions) easily.

Handling pre-colored registers well is particularly important to me, since I'm trying to do all lowering and allocation without leaving SSA. This includes the calling conventions. With this algorithm, it's a matter of the following (there's a code sketch after the list):

  • Before each call, insert extra movs to clone each argument value. Pre-color these clone values with the appropriate registers in the calling convention.

  • After each call, rewrite the return value to go into a temporary, which is pre-colored with the returned-result register, and which gets immediately mov-d to the right value.

  • Rely on the phi-equivalence coalescing algorithm (still adapted from Pereira) to merge the non-interfering values and eliminate the copies.
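
To make that concrete, here's a sketch of the lowering step over a toy IR. The Value and Insn classes and the preColor hook are hypothetical stand-ins, not my compiler's actual API:

import java.util.ArrayList;
import java.util.List;

final class CallLoweringSketch {
    static final int RESULT_REG = 0;      // the returned-result register

    static final class Value {
        Integer preColoredReg;            // null = allocator's choice
        void preColor(int reg) { preColoredReg = reg; }
    }

    static final class Insn {
        final String op;
        final Value def;
        final List<Value> uses = new ArrayList<Value>();
        Insn(String op, Value def) { this.op = op; this.def = def; }
    }

    /** Pins a call's arguments and result to calling-convention registers. */
    static List<Insn> lower(List<Value> args) {
        List<Insn> out = new ArrayList<Insn>();
        int argReg = 0;
        for (Value arg : args) {
            Insn clone = new Insn("mov", new Value());  // clone the argument
            clone.uses.add(arg);
            clone.def.preColor(argReg++);  // pin the clone to its argument register
            out.add(clone);
        }
        Insn call = new Insn("call", new Value());
        call.def.preColor(RESULT_REG);     // result lands in the return register
        out.add(call);
        Insn copyOut = new Insn("mov", new Value());    // immediately mov-d away
        copyOut.uses.add(call.def);
        out.add(copyOut);                  // coalescing later deletes dead copies
        return out;
    }
}

If the coalescer can prove a clone doesn't interfere with its source, the extra mov evaporates; if it does interfere, the copy was necessary anyway.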

I've altered Pereira's algorithm to deal with the "join families" that Mössenböck/Pfeiffer generate. (Arguably in doing so I've recreated their coalescing algorithm, but hey.)

Here's a new test function with some interfering values:

public int function(int x) {
    int y = 1;
    int z = 0;

    while (y < x) {
        z = y;
        y++;
    }

    return z;
}

And here's the new output with the new register allocator and dramatically better interference analysis. Note that I don't actually build an interference graph at any point; not sure if this is going to come back to bite me.

Remember that the code runs backwards.

IF_ALWAYS JMPRET prim_return_ret, #prim_return

long @method$:L27312399
long @method$:L11631043
IF_ALWAYS JMPRET prim_jmplt_ret, #prim_jmplt
IF_ALWAYS CMPS lmmr10, lmmr1

long @method$:L10840700
IF_ALWAYS ADD lmmr10, #1
IF_ALWAYS MOV lmmr10, lmmr10
IF_ALWAYS MOV lmmr0, lmmr10

long @method$:L10840700
IF_ALWAYS MOV lmmr0, #0
IF_ALWAYS MOV lmmr10, #1

lmmr0 and lmmr1 are the two function arguments (x and this). We smash the this pointer almost immediately, reusing lmmr0 to hold z, since that's the register used to return values from functions.

One of these days it would be fun to take a class in this stuff, I think. These are neat problems.


Sunday, December 07, 2008

Adapting propasm for the Pilgrim LMM

Last night, I replaced my Java compiler's hacked code generator with one based around propasm.

Unfortunately, propasm had some baked-in assumptions that gave me some trouble, so I had to rewrite it. The rewrite will go into SVN as soon as it reaches feature-parity. It's a lot simpler now and provides an API for extensible image generation -- which is the feature I needed today.

You see, I'm targeting Phil Pilgrim's adaptation of the LMM, which is the only one with 16-cycle constant execution for register-register instructions. (I developed it independently at Hobee's one night, but it looks like Phil did it first, so as far as I'm concerned it's the Pilgrim LMM.)

The Pilgrim LMM has an unusual property that lets it achieve its speed: it lays code out backwards, from high addresses to low, and uses djnz to update the native and LMM PCs simultaneously. This completely broke poor propasm's brain -- it was designed for single-pass assembly, which of course means that instructions get laid out in the order they're input.
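
A toy model of why that works (my reading of the scheme, not Phil's actual kernel): djnz decrements a register and branches if it's nonzero, so if instructions live at descending addresses, the one djnz that re-enters the fetch loop also advances the LMM PC -- work that a forward-running loop needs a separate add for.

final class BackwardsFetchToy {
    interface Op { void run(); }

    public static void main(String[] args) {
        // "Hub memory" with instructions stored at descending addresses.
        Op[] hub = new Op[4];
        hub[3] = () -> System.out.println("first instruction");
        hub[2] = () -> System.out.println("second instruction");
        hub[1] = () -> System.out.println("third instruction");
        // hub[0] never runs: the djnz analogue exits when pc reaches 0.

        int pc = 3;
        do {
            hub[pc].run();       // the kernel's rdlong + execute
        } while (--pc != 0);     // djnz: decrement and branch in one step
    }
}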

To support this, propasm now generates images in two passes. First, it builds a tree of objects describing the image's contents. Second, it walks the tree and generates output. This has the side-effect of allowing optimizations on machine code without requiring a separate object model -- I've implemented a couple simple scalar optimizations over propasm's image objects in my compiler.
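
In sketch form, the two-pass shape looks something like the following. ImageNode, Instruction, and Segment are hypothetical stand-ins here, not propasm's actual classes:

import java.util.ArrayList;
import java.util.List;

abstract class ImageNode {
    abstract void emit(StringBuilder out);    // pass two: walk and generate
}

final class Instruction extends ImageNode {
    final String text;
    Instruction(String text) { this.text = text; }
    void emit(StringBuilder out) { out.append(text).append('\n'); }
}

class Segment extends ImageNode {
    final List<ImageNode> children = new ArrayList<ImageNode>();
    void add(ImageNode node) { children.add(node); }
    void emit(StringBuilder out) {
        // A subclass (like rlmm.CodeSegment, below) can walk children in
        // reverse to lay code out backwards while building it in execution order.
        for (ImageNode node : children) node.emit(out);
    }
}

Because pass one leaves a mutable tree behind, an optimizer can rewrite it before anything is emitted; that's where the scalar optimizations hook in.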

My Java compiler plugs a class called rlmm.CodeSegment into propasm's image generation code, which lets it generate the code in execution order (by a CFG walk) but lay it out backwards in memory. Thus, for the sample input from my last post, we now get the following output:

' Start of segment rlmm.CodeSegment@1004901
IF_ALWAYS JMPRET prim_return_ret, #prim_return

long @method$:L10840700
long @method$:L15980197
IF_ALWAYS JMPRET prim_jmplt_ret, #prim_jmplt

long @method$:L29181730

long @method$:L29181730

Remember to read bottom-to-top! (Pardon the excessive IF_ALWAYS and the expanded calls; this output was disassembled with a naïve algorithm.)


Saturday, December 06, 2008

Compiling Java for the Propeller

My spare time is pretty limited these days -- work takes up a lot of it. On the up-side, I got promoted.

Over the past few months, some of my spare time has been spent designing a Java compiler that targets the Parallax Propeller. A few hand-coded mockups have convinced me that it's feasible -- the only thing between that and making it a reality is the past 50 years of compiler research, which I never learned. :-)

So, I've spent time surfing Citeseer and gathering data. Between what I've learned and my Propeller experience, I've set some parameters for the first version: it will compile to Large Memory Model code rather than a custom interpreter, and will consume Java class files.

As of this evening I've got a prototype working, and have compiled/assembled/uploaded my first binary. The bullet points are as follows:

  • Converts Java 6 JVML to n-address linear IR using abstract evaluation (there's a sketch of this after the list). Assumes input is verifiable, which lets me skip some checking. (Why Java 6? Because it generates nice frame attributes and doesn't use jsr.)

  • Operates almost exclusively in SSA (actually CSSA).

  • Performs register allocation in SSA form, using a crippled-but-developing version of Pereira's algorithm. I've modified it to deal more gracefully with the Propeller's CISC-like two-address architecture.

  • Provides limited scalar optimizations -- the ones I've gotten around to writing. Copy folding, constant propagation, jump-to-flow-control elimination, etc.

  • Fantastically crappy code generator for now -- literally printf-based with no intelligent layout or peephole optimizations, as you'll see in the samples below.

  • None of the fun stuff, like garbage collection, works yet.
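
For the curious, "abstract evaluation" here just means symbolically executing the JVM's operand stack: pushes and pops shuffle named values instead of real ones, and each arithmetic opcode emits one IR instruction. A toy sketch, with the value naming and emit machinery standing in for my real IR:

import java.util.ArrayDeque;
import java.util.Deque;

final class AbstractEvalToy {
    private final Deque<String> stack = new ArrayDeque<String>();
    private int nextTemp = 0;

    private String freshTemp() { return "t" + nextTemp++; }

    private void emit(String insn) { System.out.println(insn); }

    /** Symbolically executes a few representative JVML opcodes. */
    void step(String opcode) {
        if (opcode.equals("iconst_0")) {
            stack.push("#0");              // constants stay symbolic
        } else if (opcode.equals("iload_1")) {
            stack.push("local1");          // locals become named values
        } else if (opcode.equals("iadd")) {
            String b = stack.pop();
            String a = stack.pop();
            String t = freshTemp();
            emit("add " + t + ", " + a + ", " + b);  // one n-address IR op
            stack.push(t);                 // the result replaces its operands
        } else if (opcode.equals("ireturn")) {
            emit("ret " + stack.pop());
        } else {
            throw new IllegalArgumentException("unhandled opcode: " + opcode);
        }
    }
}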

Here's sample input and output from one of my test suites:

public class Input0 {
    public int function(int x) {
        int y = 0;
        while (y < x) {
            y++;
        }
        return y;
    }
}

And the output, in propasm syntax (a superset of Parallax):

mov r2, #0
rdlong PC, PC ' inlined jmp
long @@L23063136
add r2, #1
rdlong PC, PC ' inlined jmp
long @@L23063136
' [int r2:1] <- phi ssa.BasicBlock@c4bcdc:[int c0:0] ssa.BasicBlock@18fef3d:[int r2:2]
cmps r2, r1 wc wz
call #prim_lt
long @@L26210109
long @@L25763215
mov r0, r2
call #prim_return

The "call #prim_foo" instructions are calls to LMM kernel traps; plenty of other people have explained Propeller LMM in detail elsewhere, I won't repeat it. My code generator inlines LMM primitives in some cases, as it's done with jump here.

Some details of the calling conventions are apparent in the output: incoming arguments start at r0, and the returned value comes back in r0. These registers are real-live Propeller registers inside the LMM kernel, and I suspect I'll have room for a lot of them -- 32 or more -- so I'm using a RISC-like register calling convention, with a link register.

Assuming work doesn't interfere, I'll post more as it develops.


Thursday, November 27, 2008

Ubuntu 8.10 on the Acer Aspire One

I've replaced my eee 701 with an Acer Aspire One. Ubuntu works quite well on it, without a lot of the trickery I needed on the eee, and the battery life is stunning (I get upwards of 7 hours after tweaks).

I've created a site with tutorials on getting Ubuntu working optimally on the Aspire One. It includes a complete step-by-step guide for going from installation to awesome, based on the detailed notebook I kept in the first two days I had the machine.

The machine, incidentally, is smaller than the notebook. :-)


Monday, November 24, 2008

Amazon wrong; Acer Aspire One has no free mini-PCIe slot.

Amazon's product page for the Acer Aspire One claims that the machine has an available mini-PCIe slot for WWAN.

I'm holding it in my hands, and it does not -- just like earlier units, the connector isn't soldered in place.

This is the first time Amazon's product descriptions have had a significant error in my experience, and I'm pretty pissed.

Monday, May 26, 2008

PropellerForth issue tracker

I've gotten a few blog comments from folks who are using PropellerForth. Yay!

A few comments have reported bugs.

I don't closely follow comments on my blog (though I respond when I have time) -- but I do get an email when someone reports a bug at the PropellerForth issue tracker over on Google Code!

If you're using PropellerForth and find a problem, please report it there. (Please don't post "bugs" like "You haven't released a binary in a few months" or "You haven't imported your Subversion repo." I assure you I'm working on both as time permits, but this isn't my day job.)

Of course, if you're using PropellerForth and haven't found a problem, I'd love to hear about that too! Blog comments are a fine place for that, or send me an email if you can track down my address. ;-)