Cliff Hacks Things.

Sunday, February 10, 2008

Java 6 try/finally compilation without jsr/ret

My day job requires me to be a bit of a JVM geek, so I was poring over the Java Virtual Machine spec recently when I remembered something: In Java 6 and later, the old jsr and ret instructions are effectively deprecated. These instructions were used to build mini-subroutines inside methods. While Java doesn't support nested functions or anything fun like that, it does have the try/finally construct, and these instructions were quite handy for implementing it.

I saw no "official" instructions for compiling finally without jsr, so I investigated it, and thought I'd post the results -- mostly in case I forget them later.

For non-Java folks, some background: a chunk of code that might fail at runtime can be wrapped in a try block. You can then attach handlers for specific types of exceptions/errors to the try block using catch clauses, if you want to respond to specific failure cases. You can also attach a finally block, which will be run at the end, failure or no. You can think of finally as a sort of cleanup block -- which, in practice, is how it's used.

As a result, you wind up with multiple control flow paths that can execute the finally code:

  • try block, successful completion, finally block executes before moving on

  • try block, failure, one or more catch blocks, finally block executes before moving on

  • try block, unhandled failure, finally block executes before unwinding the stack and throwing the error out to a higher level
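To make the three paths concrete, here is a minimal sketch that records which blocks run in each case. The class name FinallyPaths, the mode parameter, and the choice of exception types are all made up for illustration; the outer try simply stands in for "a higher level".

```java
import java.util.ArrayList;
import java.util.List;

class FinallyPaths {
    // Runs one try/catch/finally and records which blocks executed.
    // mode 0: try succeeds; mode 1: handled failure; mode 2: unhandled failure.
    static List<String> run(int mode) {
        List<String> trace = new ArrayList<String>();
        try {
            try {
                trace.add("try");
                if (mode == 1) throw new IllegalArgumentException("handled");
                if (mode == 2) throw new IllegalStateException("unhandled");
            } catch (IllegalArgumentException e) {
                trace.add("catch");
            } finally {
                trace.add("finally");
            }
            // Only reached when nothing is still propagating.
            trace.add("after");
        } catch (IllegalStateException e) {
            // Stands in for the higher level that receives the exception.
            trace.add("propagated");
        }
        return trace;
    }

    public static void main(String[] args) {
        for (int mode = 0; mode < 3; mode++) {
            System.out.println(mode + ": " + run(mode));
        }
    }
}
```

In every mode "finally" shows up in the trace; what differs is what comes before and after it.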


Java originally compiled this the way most folks would write it by hand: code that gets used in multiple places goes in a subroutine. In this case, it's a nested subroutine accessed using jsr.

This unfortunately makes dataflow analysis and type inference of the Java code considerably more complex, for reasons I won't go into here.

So while Java 6 and later JVMs can still understand jsr, tools no longer generate it. Instead, they duplicate the code of the finally block along each path (a transform I've always called tail duplication, but there may be other names). Let's look at a quick example.

This totally contrived Java class plays with an array:

class TryFinally {
    public static void main(String[] args) {
        int[] a = new int[2];
        try {
            a[16] = 2;
        } catch (ArrayIndexOutOfBoundsException e) {
            a[0] = 2;
        } finally {
            a[1] = 2;
        }
    }
}


It will always follow the longest code path, because it's written to fail: the try block will execute, followed by the handler, followed by the finally clause.
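If you want to check that for yourself, a small variant can return the array instead of discarding it. This is only a sketch: TryFinallyCheck and run() are names I've invented, and the body is otherwise the same three statements.

```java
import java.util.Arrays;

class TryFinallyCheck {
    // Same statements as TryFinally.main, but returns the array for inspection.
    static int[] run() {
        int[] a = new int[2];
        try {
            a[16] = 2;   // always throws: index 16 is out of bounds for length 2
        } catch (ArrayIndexOutOfBoundsException e) {
            a[0] = 2;    // the handler runs next
        } finally {
            a[1] = 2;    // the finally clause runs last
        }
        return a;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run())); // prints [2, 2]
    }
}
```

Both elements end up set, which means the handler and the finally clause each ran once.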

The disassembled JVM instructions for this method are as follows:

public static void main(java.lang.String[]);
  Code:
     0: iconst_2          // Create the array
     1: newarray int
     3: astore_1          // Store it in local 1
     4: aload_1           // Set element 16 to 2 (throws)
     5: bipush 16
     7: iconst_2
     8: iastore
     9: aload_1           // Begin 'success' finally code
    10: iconst_1
    11: iconst_2
    12: iastore
    13: goto 35           // End 'success' finally code
    16: astore_2          // Catch block, save the exception...
    17: aload_1           // and set a[0] = 2
    18: iconst_0
    19: iconst_2
    20: iastore
    21: aload_1           // Catch copy of finally code
    22: iconst_1
    23: iconst_2
    24: iastore
    25: goto 35
    28: astore_3          // A third copy of finally code!
    29: aload_1
    30: iconst_1
    31: iconst_2
    32: iastore
    33: aload_3
    34: athrow
    35: return
  Exception table:
    from    to  target  type
       4     9      16  Class java/lang/ArrayIndexOutOfBoundsException
       4     9      28  any
      16    21      28  any
      28    29      28  any


As you can see in my annotations, we have three copies of the finally code! Why three? The answer is in the three code paths I discussed above, and in the exception table.

In the table we see four exception handlers defined -- but, of course, we only defined one! Why so many?

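The first is the one we defined as a catch. The second is an invisible additional catch on the try block for type 'any' -- so any unexpected exceptions are sent to the third copy of the finally code, at offset 28. The third guards the catch block itself (offsets 16 to 21), so an exception thrown inside our handler still runs the finally code on its way out. The fourth covers only the astore_3 at offset 28, the first instruction of the generated handler, and points right back at that same handler.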
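The third entry is the easiest one to observe from Java source: throw from inside the catch block and the finally code still runs. Here's a rough sketch; CatchGuard is an invented name, and the IllegalStateException and the outer try are only there so the result can be inspected.

```java
import java.util.Arrays;

class CatchGuard {
    static int[] run() {
        int[] a = new int[2];
        try {
            try {
                a[16] = 2;   // throws ArrayIndexOutOfBoundsException
            } catch (ArrayIndexOutOfBoundsException e) {
                a[0] = 2;
                // A new failure raised while handling the first one.
                throw new IllegalStateException("thrown from the catch block");
            } finally {
                a[1] = 2;    // still runs before the new exception propagates
            }
        } catch (IllegalStateException e) {
            // Swallowed here only so the array can be returned.
        }
        return a;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run())); // prints [2, 2]
    }
}
```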

In other words, the compiler has rewritten the Java code into something resembling:

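public static void main(String[] args) {
    int[] a = new int[2];
    try {
        a[16] = 2;
        a[1] = 2;      // copy of finally: success path
    } catch (ArrayIndexOutOfBoundsException e) {
        a[0] = 2;
        a[1] = 2;      // copy of finally: handled-failure path
    } catch (* e) {    // pseudocode for the 'any' handler; not legal Java
        a[1] = 2;      // copy of finally: unhandled-failure path
        throw e;
    }
}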

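As you can see, the finally block has disappeared -- instead, its contents have been duplicated along each code path. (The sketch is only approximate: in the real bytecode the 'any' handler also guards the catch block, and each copy of the finally code sits outside the protected ranges.)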
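The rewritten form can't be compiled as shown, since Java has no catch-anything syntax. A compilable approximation, assuming Java 7 or later (rethrowing a caught Throwable without a throws clause relies on that version's more precise rethrow analysis), uses Throwable in place of the 'any' type. The class name, method names, and the index parameter are mine, added so the two forms can be compared on more than one path.

```java
import java.util.Arrays;

class DuplicatedFinally {
    // The original shape: try / catch / finally.
    static int[] original(int index) {
        int[] a = new int[2];
        try {
            a[index] = 2;
        } catch (ArrayIndexOutOfBoundsException e) {
            a[0] = 2;
        } finally {
            a[1] = 2;
        }
        return a;
    }

    // Hand-written approximation of the compiler's rewrite:
    // no finally block, its body copied onto each path instead.
    static int[] duplicated(int index) {
        int[] a = new int[2];
        try {
            a[index] = 2;
            a[1] = 2;                 // copy for the success path
        } catch (ArrayIndexOutOfBoundsException e) {
            a[0] = 2;
            a[1] = 2;                 // copy for the handled-failure path
        } catch (Throwable e) {       // stands in for the 'any' handler
            a[1] = 2;                 // copy for the unhandled-failure path
            throw e;                  // rethrow after cleanup
        }
        return a;
    }

    public static void main(String[] args) {
        for (int index : new int[] {0, 1, 16}) {
            System.out.println(index + ": " + Arrays.toString(original(index))
                    + " vs " + Arrays.toString(duplicated(index)));
        }
    }
}
```

For these inputs the two methods should produce the same arrays. That is a spot check, not a proof of equivalence.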
