My day job requires me to be a bit of a JVM geek, so I was poring over the Java Virtual Machine spec recently when I remembered something: In Java 6 and later, the old jsr and ret instructions are effectively deprecated. These instructions were used to build mini-subroutines inside methods. While Java doesn't support nested functions or anything fun like that, it does have the try/finally construct, and these instructions were quite handy for implementing it.
I saw no "official" instructions for compiling finally without jsr, so I investigated it, and thought I'd post the results -- mostly in case I forget them later.
For non-Java folks, some background: a chunk of code that might fail at runtime can be wrapped in a try block. You can then attach handlers for specific types of exceptions/errors to the try block using catch clauses, if you want to respond to specific failure cases. You can also attach a finally block, which will be run at the end, failure or no. You can think of finally as a sort of cleanup block -- which, in practice, is how it's used.
As a result, you wind up with multiple control flow paths that can execute the finally code:
- try block, successful completion, finally block executes before moving on
- try block, failure, one or more catch blocks, finally block executes before moving on
- try block, unhandled failure, finally block executes before unwinding the stack and throwing the error out to a higher level
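To make those three paths concrete, here's a small sketch that records which blocks run on each one. The class name FinallyPaths and the trace helper are made up for illustration; the outer try exists only so the "unhandled" exception doesn't kill the demo.

```java
import java.util.ArrayList;
import java.util.List;

class FinallyPaths {
    // Runs one try/catch/finally and records which blocks executed.
    // index 0 succeeds, index 16 is out of bounds (handled),
    // and a negative index triggers an exception the inner catch ignores.
    static List<String> trace(int index) {
        List<String> log = new ArrayList<String>();
        int[] a = new int[2];
        try {
            try {
                log.add("try");
                if (index < 0) {
                    throw new IllegalStateException("unhandled by the inner catch");
                }
                a[index] = 2;
            } catch (ArrayIndexOutOfBoundsException e) {
                log.add("catch");
            } finally {
                log.add("finally");
            }
        } catch (IllegalStateException e) {
            // Stands in for "a higher level" receiving the exception.
            log.add("propagated");
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(trace(0));   // successful completion
        System.out.println(trace(16));  // failure handled by the catch
        System.out.println(trace(-1));  // unhandled failure
    }
}
```

In every case "finally" shows up in the log, and on the unhandled path it appears before "propagated", i.e. before the exception reaches the outer level.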
Java originally compiled this the way most folks would write it by hand: code that gets used in multiple places goes in a subroutine. In this case, it's a nested subroutine accessed using jsr.
This unfortunately makes dataflow analysis and type inference of the Java code considerably more complex, for reasons I won't go into here.
So while Java 6 and later JVMs can still understand jsr, tools no longer generate it. Instead, they duplicate the code of the finally block along each path (a transform I've always called tail duplication, but there may be other names). Let's look at a quick example.
This totally contrived Java class plays with an array:
class TryFinally {
    public static void main(String[] args) {
        int[] a = new int[2];
        try {
            a[16] = 2;
        } catch (ArrayIndexOutOfBoundsException e) {
            a[0] = 2;
        } finally {
            a[1] = 2;
        }
    }
}
It will always follow the longest code path, because it's written to fail: the try block will execute, followed by the handler, followed by the finally clause.
The disassembled JVM instructions for this method are as follows:
public static void main(java.lang.String[]);
  Code:
     0: iconst_2        // Create the array
     1: newarray int
     3: astore_1        // Store it in local 1
     4: aload_1         // Set element 16 to 2 (throws)
     5: bipush 16
     7: iconst_2
     8: iastore
     9: aload_1         // Begin 'success' finally code
    10: iconst_1
    11: iconst_2
    12: iastore
    13: goto 35         // End 'success' finally code
    16: astore_2        // Catch block, save the exception...
    17: aload_1         // and set a[0] = 2
    18: iconst_0
    19: iconst_2
    20: iastore
    21: aload_1         // Catch copy of finally code
    22: iconst_1
    23: iconst_2
    24: iastore
    25: goto 35
    28: astore_3        // A third copy of finally code!
    29: aload_1
    30: iconst_1
    31: iconst_2
    32: iastore
    33: aload_3
    34: athrow
    35: return
  Exception table:
     from    to  target  type
        4     9      16  Class java/lang/ArrayIndexOutOfBoundsException
        4     9      28  any
       16    21      28  any
       28    29      28  any
As you can see in my annotations, we have three copies of the finally code! Why three? The answer is in the three code paths I discussed above, and in the exception table.
In the table we see four exception handlers defined -- but, of course, we only defined one! Why so many?
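The first is the one we defined as a catch. The second is an invisible additional catch on the try block for type 'any' -- so any unexpected exceptions are sent to the third copy of the finally code. The third guards the catch block itself (offsets 16 through 20), so a failure inside the handler still runs the finally code. The fourth guards the first instruction of the generated exception handler (the astore_3 at offset 28; the 'to' column is exclusive).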
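If it helps to see how a table like this gets used, here's a simplified model of the lookup: when an exception is thrown at some offset, the JVM walks the table in order and takes the first entry whose range covers that offset and whose type matches. The HandlerLookup class, its Entry type, and findHandler are my own hypothetical names, not anything from the JVM; a null type stands in for 'any', and the rows are copied from the table above.

```java
import java.util.Arrays;
import java.util.List;

class HandlerLookup {
    // One row of the exception table; a null type stands for 'any'.
    static final class Entry {
        final int from;
        final int to;      // exclusive
        final int target;
        final Class<? extends Throwable> type;

        Entry(int from, int to, int target, Class<? extends Throwable> type) {
            this.from = from;
            this.to = to;
            this.target = target;
            this.type = type;
        }
    }

    // The four rows from the disassembly, in the same order.
    static final List<Entry> TABLE = Arrays.asList(
        new Entry(4, 9, 16, ArrayIndexOutOfBoundsException.class),
        new Entry(4, 9, 28, null),
        new Entry(16, 21, 28, null),
        new Entry(28, 29, 28, null));

    // Returns the handler offset for an exception thrown at pc,
    // or -1 if no row matches and the exception leaves the method.
    static int findHandler(int pc, Throwable thrown) {
        for (Entry e : TABLE) {
            boolean inRange = pc >= e.from && pc < e.to;
            boolean typeMatches = e.type == null || e.type.isInstance(thrown);
            if (inRange && typeMatches) {
                return e.target;   // first match wins, so row order matters
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // The iastore at offset 8 throws ArrayIndexOutOfBoundsException.
        System.out.println(findHandler(8, new ArrayIndexOutOfBoundsException()));
        // Any other exception from the try body goes to the 'any' handler.
        System.out.println(findHandler(8, new NullPointerException()));
        // The 'success' copy of the finally code (offsets 9-13) is unguarded.
        System.out.println(findHandler(12, new RuntimeException()));
    }
}
```

Under this model the out-of-bounds store at offset 8 lands on the catch block at 16, anything else thrown from the try body or the catch block lands on the third finally copy at 28, and an exception inside one of the inlined finally copies simply leaves the method.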
In other words, the compiler has rewritten the Java code into something resembling:
public static void main(String[] args) {
    int[] a = new int[2];
    try {
        a[16] = 2;
        a[1] = 2;
    } catch (ArrayIndexOutOfBoundsException e) {
        a[0] = 2;
        a[1] = 2;
    } catch (Throwable e) {  // stands in for type 'any' in the exception table
        a[1] = 2;
        throw e;
    }
}
As you can see, the finally block has disappeared -- instead, its contents have been duplicated along each code path.
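For anyone who wants to poke at this, here's a rough, compilable comparison of the two forms. It's a sketch under a few assumptions: the class and method names are mine, it relies on Java 7 or later so that rethrowing the caught Throwable compiles without a throws clause, and the nesting is my attempt to mirror the exception table (the 'any' handler covers the try body and the catch block, but not the inlined finally copies). Java source can't express a separate unguarded copy per path, so the two normal-exit copies are merged into one statement here.

```java
import java.util.Arrays;

class DuplicatedFinally {
    // The original shape: a real finally block.
    static int[] withFinally(int index) {
        int[] a = new int[2];
        try {
            a[index] = 2;
        } catch (ArrayIndexOutOfBoundsException e) {
            a[0] = 2;
        } finally {
            a[1] = 2;
        }
        return a;
    }

    // Hand-duplicated version with no finally block at all.
    static int[] handDuplicated(int index) {
        int[] a = new int[2];
        try {
            try {
                a[index] = 2;
            } catch (ArrayIndexOutOfBoundsException e) {
                a[0] = 2;
            }
        } catch (Throwable t) {
            a[1] = 2;   // exceptional-path copy of the finally code
            throw t;    // rethrow after cleanup (needs Java 7+ as written)
        }
        a[1] = 2;       // normal-path copy of the finally code
        return a;
    }

    public static void main(String[] args) {
        for (int index : new int[] {0, 1, 16}) {
            System.out.println(index + ": "
                + Arrays.toString(withFinally(index)) + " vs "
                + Arrays.toString(handDuplicated(index)));
        }
    }
}
```

Both methods should leave the array in the same state for any index, which is the point: the duplication changes the shape of the code, not what it does.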