Thursday, November 20, 2008

Call frame introspection on the JVM

One of the pain points of implementing Python on top of the JVM, and in my opinion the worst pain point, is the call frame introspection of Python, and how it is abused throughout Python frameworks. There is in fact a library function sys._getframe() that returns the frame of the current function, or a deeper frame in the call stack if passed an integer representing the depth.

For Jython this mean that we always have to keep a heap allocated call frame object around for each function invocation, and do all access to local variables through this object. The performance implications of this is absolutely terrible. So any improvement to call frame management would greatly improve performance.

Thinking about how to better implement call frame objects in Jython I thought “the JVM already manages call frames, and using a debugger I am able to get access to all the aspects of the JVM call frame that I am interested in to construct a Python frame.”

My first idea on how to implement such a thing was to get under the hood of the JVM and add these capabilities there. Said and done, I checked out OpenJDK and started to patch it to my needs. This actually got me to a working prototype, apart from the fact that I failed to tell the garbage collector that I created extra references to some objects, so they ended up not being equal to null but in every other sense behaving like null. I never tracked down where the problem in my code was, since I found another solution. While I was working on this I found code in the JVM that did things that were surprisingly similar to what I was doing. All of this code was located in a subsystem called JVMTI. I looked further into the documentation of the JVM Tooling Interface, and found that it had been around since JVMv1.5 (commonly referred to as Java 5). This was great news! If I could create a JVM frame introspection library using JVMTI I would have a library that works for all the JVM versions that we target for the next release of Jython.

It took a while before I actually started implementing the frame introspection library, there were other tasks with higher priority, I read up the documentation of JVMTI and there was also the issue with the build process for JNI-libraries being much more painful than for the nice Java stuff I'm used to, since JNI is C code. The last problem I solved over a weekend two weeks ago by implementing a generic make file and an ant task that feeds make with the required parameters of that makefile. This took me a full weekend, since my makefile skills were never that great to begin with, and even rustier than my C skills. I got it to work for my Mac though, and have yet to test it on other platforms (Linux will probably be tested and working before the weekend). This ant task is hosted on Kenai: https://kenai.com/svn/jvm-frame-introspect~svn/jnitask/.

Armed with a good way of building JNI libraries I met up with Charles Nutter in Malmö, where he was speaking at the Øredev conference, for three days. We managed to get the library working while hacking at various coffee shops in Malmö, it should still be polished a bit but I've published it in my personal svn repository on Kenai: https://kenai.com/svn/jvm-frame-introspect~svn/frame/ for everyone who wants to take a look. Trying it out should be as simple as:
svn co https://kenai.com/svn/jvm-frame-introspect~svn/frame/ javaframe
cd javaframe
ant test

The expected output is a few printouts of stack frame content.

The call frame introspection library gives you access to:

  • Access to one call frame at a given depth.
  • Access to the entire stack of call frames for the current thread.
  • Access the stack of call frames (without local variables) for any set of threads, or all threads, in the JVM.
  • Get the reflected method object that the frame is a representation for.
  • Local variables (if this information is added to the class file by javac)
    • The number of local variables.
    • The names of the local variables.
    • The signatures of the local variables.
    • The values of the (currently live) local variables.
    • Locals may be accessed either by name or by index.
      There is also a getThis() method as a shorthand for getting the first local variable, or null if the method is static.
  • Get the current value of the instruction pointer.
  • Get the current line number (if this information is added to the class file by javac).

Update 2008-11-23: I've added support for getting stack traces of call frame snapshots from multiple threads in the JVM. The JVMTI guarantees that the stack traces of all threads are captured at the exact same point of execution. There are method calls for getting traces for a given set of threads or for all JVM threads. These methods do not provide access to the local variables in the frames, since there is no way (apart from suspending the threads) to guarantee that the frame depth is the same at the point of capturing the stack trace as at the point of acquiring the local variables.
I've also made sure that the build scripts work under Linux as well as Mac OS X, and added licensing information (I use the MIT License for this). Also, by request, cleaned up the paragraphs of this entry.

It would be great to get input from my peers on what more information you would like to access from the call frames. The Java bytecode of the method perhaps?
Charles and I also talked about what else we can use JVMTI for, and he was quite enthusiastic about the ability to walk the entire Java object graph, for implementing the objectspace feature of Ruby. One idea would be to write a library that brings the entire functionality of JVMTI to the Java level. The only problem with this would be that a lot of the JVMTI operations don't make much sense unless they are invoked in conjunction with other operations, and that many of the operations are not callback safe, meaning that we cannot allow the execution of arbitrary Java code in between the JVMTI operations. But it should be possible to create an abstraction with a more Java-esque API that performs multiple JVMTI operations at the JNI level.

Another interesting aspect of using JVMTI from Java code is that we can prototype a lot of the functionality that we want the JVM to expose to us directly, and thereby vote with code on what we want the JVM of the future to look like.

I hope you will find this useful!

Update: I have move this project to Kenai, the links have been updated accordingly.