Monday, December 22, 2008

On Iterator composition in Java

In my last blog post I mentioned that I would like to see library support for manipulating iterators included in the JDK. I, like many others, have a set of iterator manipulation classes that I bring along from project to project, usually by copy/paste. This post is an elaboration on what I think should be included in the JDK, all communicated through code (the actual implementation is not as interesting as the API, and is thus left as an exercise for the reader):

package java.util;

import java.util.iter.*;

public final class Iterators {
    private Iterators() {} // only static methods, no instances
    public static <E> Iterable<E> empty() { ... }
    public static <S,T> Iterable<T> convert(Iterable<S> iterable, Converter<S,T> converter) { ... }
    public static <E> Iterable<E> filter(Iterable<E> iterable, Filter<E> filter) { ... }
    public static <T> Iterable<T> upCast(Iterable<? extends T> iterable) { ... }
    public static <T> Iterable<T> downCast(Class<T> target, Iterable<?> iterable) { ... }
    public static <T> Iterable<T> append(Iterable<T>... iterables) { ... }
}

// These classes should preferably be reused from somewhere.
package java.util.iter;

public interface Converter<S,T> {
    T convert(S source);
}

public interface Filter<E> {
    boolean accept(E element);
}

And some sample uses:

package org.thobe.example;

import java.util.ArrayList;
import java.util.Iterators;
import java.util.iter.*;

class UsesIteratorComposition {
    private class Something {}
    private class Source extends Something {}
    private class Target { Target(Source s) {} }

    Iterable<Source> source = Iterators.empty();

    void conversions() {
        Iterable<Target> target = Iterators.convert(source, new Converter<Source, Target>() {
            public Target convert(Source source) {
                return new Target(source);
            }
        });
    }

    void casts() {
        Iterable<Something> something = Iterators.upCast(source);
        Iterable<Source> back = Iterators.downCast(Source.class, something);
    }

    void append() {
        Iterable<Source> appended = Iterators.append(source, new ArrayList<Source>() {
            {
                add(new Source());
                add(new Source());
            }
        });
    }
}
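Since the implementation is left as an exercise, here is one possible sketch of how convert could be implemented: a lazy wrapper that converts elements only as the resulting iterator is consumed. This is just my own illustration under the API proposed above, not a suggested JDK implementation.

package org.thobe.example;

import java.util.Iterator;
import java.util.iter.Converter; // the proposed interface from above

// A lazy Iterable wrapper: elements are converted one at a time as the
// resulting iterator is consumed.
final class ConvertedIterable<S, T> implements Iterable<T> {
    private final Iterable<S> source;
    private final Converter<S, T> converter;

    ConvertedIterable(Iterable<S> source, Converter<S, T> converter) {
        this.source = source;
        this.converter = converter;
    }

    public Iterator<T> iterator() {
        final Iterator<S> iter = source.iterator();
        return new Iterator<T>() {
            public boolean hasNext() { return iter.hasNext(); }
            public T next() { return converter.convert(iter.next()); }
            public void remove() { iter.remove(); }
        };
    }
}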

Wednesday, December 17, 2008

On Double Checked Locking in Java

You have probably all heard or seen the “Double-Checked Locking is Broken” declaration. And if you haven't, I'll just tell you that before the improved memory model of Java 5, double-checked locking didn't work as expected. Even in Java 5 and later there are things you need to consider to get it to work properly, and even more things to consider if you want it to perform well.

Joshua Bloch suggests in his book Effective Java, and keeps repeating in several presentations, that there is one way to implement double-checked locking, and that this code snippet should be copied to every place where it's needed. I disagree. Not with the part about there being only one way of doing it, but with the copying part. In my eyes this is a perfect use case for an abstract class:

package java.util.concurrent;

public abstract class LazyLoaded<T> {
    private volatile T value;

    // Get the value, guarding the loading with double-checked locking.
    public final T getValue() {
        T result = value;
        if (result == null) { // First check (without locking)
            synchronized (this) { // Lock
                result = value;
                if (result == null) { // Second check (with locking)
                    result = loadValue();
                    value = result;
                }
            }
        }
        return result;
    }

    protected abstract T loadValue();
}

This class would then be used like this:

package org.thobe.example;

import java.util.concurrent.LazyLoaded;

class UsesDoubleCheckedLocking {
    private static class Something {}

    // The generics even make it read well:
    private final LazyLoaded<Something> something = new LazyLoaded<Something>() {
        @Override protected Something loadValue() {
            return new Something();
        }
    };
    // ... your code will probably do something useful here,
    // calling something.getValue() wherever the value is needed ...
}

Just expressing my opinion. I hope someone important reads this and makes sure that it gets included in Java 7. And while you're at it, make sure we get some standardized ways of composing Iterables/Iterators as well...

Friday, December 05, 2008

About types in Neo4j

When I first started using Neo4j I wondered "Why is there a RelationshipType and no NodeType?", and as more people are introduced to Neo4j I find that this is quite a common question. It is of course an obvious question coming from the strongly typed, single-inheritance world of Java. This post was inspired by a discussion on the Neo4j mailing list.

The answer to the question is found in the name RelationshipType: it means no more than the name implies. In particular, it does not mean data type. What you are wishing for when you ask for a node type is a way of specifying what properties a node or relationship has, based on its type. Neo4j does not provide any mechanism for this at the core layer. There is a meta model component available that gives you data types for nodes and relationships, with verification and all of that, if you would like it, but you need to explicitly turn it on.

So what are RelationshipTypes then? A relationship type is a navigational feature of Neo4j. It is used to implement what is known in graph theory as edge-labeled multigraphs. This feature makes it a whole lot easier to navigate through a graph that represents application data. Adding similar labels to nodes would not provide any navigational benefit, which is why Neo4j does not implement such a feature.

It is well worth noticing that RelationshipType can be used to implement data types for both relationships and nodes, in a way that allows nodes to have multiple (union) data types.
The way you implement this in Neo4j is to let the data type of a node be determined by the RelationshipType of the relationship it was reached through. A node n1 reached through a relationship r1 with the RelationshipType Ta is said to have the node data type Da, while the same node n1 reached through another relationship r2 with the RelationshipType Tb is said to have the node data type Db. Although it is the same node, the application logic will treat it completely differently, accessing different properties (with or without overlap). This is a very powerful feature, and not as easy to implement using a label on the nodes to determine the data type. A sketch of what this could look like follows below.
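To make the pattern concrete, here is a small sketch. The EMPLOYS relationship type and the Company/Employee wrappers are hypothetical, and I'm assuming the org.neo4j.api.core API here; only the pattern itself matters.

import java.util.ArrayList;
import java.util.List;

import org.neo4j.api.core.Direction;
import org.neo4j.api.core.Node;
import org.neo4j.api.core.Relationship;
import org.neo4j.api.core.RelationshipType;

enum ExampleTypes implements RelationshipType { EMPLOYS }

// The wrapper class (the "node data type") is chosen based on the
// RelationshipType of the relationship the node was reached through.
class Company {
    private final Node node;
    Company(Node node) { this.node = node; }

    Iterable<Employee> employees() {
        List<Employee> result = new ArrayList<Employee>();
        for (Relationship employs : node.getRelationships(ExampleTypes.EMPLOYS, Direction.OUTGOING)) {
            // Reached through an EMPLOYS relationship, so the node on the
            // other end is treated as an Employee here, even though the
            // same node could be wrapped differently when reached through
            // another relationship type.
            result.add(new Employee(employs.getOtherNode(node)));
        }
        return result;
    }
}

class Employee {
    private final Node node;
    Employee(Node node) { this.node = node; }
    Object salary() { return node.getProperty("salary"); }
}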

The design guide for Neo4j recommends that domain model objects be implemented by wrapping nodes and relationships, and that the wrapper class be determined by the type of the traversed relationship, as mentioned above. To make this even simpler, the Neo4j community has developed a few components that automate the process. Most notable is neo-weaver, which automatically implements an interface or abstract class by means of getting and setting properties on a node/relationship, or by traversing relationships.

Happy Hacking!

Thursday, November 20, 2008

Call frame introspection on the JVM

One of the pain points of implementing Python on top of the JVM, and in my opinion the worst one, is Python's call frame introspection, and how it is abused throughout Python frameworks. There is in fact a library function, sys._getframe(), that returns the frame of the current function, or a frame deeper in the call stack if passed an integer representing the depth.

For Jython this means that we always have to keep a heap-allocated call frame object around for each function invocation, and do all access to local variables through this object. The performance implications of this are absolutely terrible, so any improvement to call frame management would greatly improve performance.

Thinking about how to better implement call frame objects in Jython I thought “the JVM already manages call frames, and using a debugger I am able to get access to all the aspects of the JVM call frame that I am interested in to construct a Python frame.”

My first idea for how to implement such a thing was to get under the hood of the JVM and add these capabilities there. Said and done: I checked out OpenJDK and started to patch it to my needs. This actually got me to a working prototype, apart from the fact that I failed to tell the garbage collector that I had created extra references to some objects, so they ended up not being equal to null, but in every other sense behaving like null. I never tracked down the problem in my code, since I found another solution. While I was working on this I found code in the JVM that did things surprisingly similar to what I was doing. All of this code was located in a subsystem called JVMTI. I looked further into the documentation of the JVM Tool Interface, and found that it had been around since JVM version 1.5 (commonly referred to as Java 5). This was great news! If I could create a JVM frame introspection library using JVMTI, I would have a library that works on all the JVM versions that we target for the next release of Jython.

It took a while before I actually started implementing the frame introspection library. There were other tasks with higher priority, I had to read up on the JVMTI documentation, and there was also the issue of the build process for JNI libraries being much more painful than for the nice Java stuff I'm used to, since JNI is C code. The last problem I solved over a weekend two weeks ago by implementing a generic makefile and an ant task that feeds make the parameters required by that makefile. It took me a full weekend, since my makefile skills were never that great to begin with, and even rustier than my C skills. I got it to work on my Mac though, and have yet to test it on other platforms (Linux will probably be tested and working before the weekend). This ant task is hosted on Kenai: https://kenai.com/svn/jvm-frame-introspect~svn/jnitask/.

Armed with a good way of building JNI libraries, I met up with Charles Nutter for three days in Malmö, where he was speaking at the Øredev conference. We managed to get the library working while hacking at various coffee shops in Malmö. It still needs a bit of polish, but I've published it in my personal svn repository on Kenai: https://kenai.com/svn/jvm-frame-introspect~svn/frame/ for everyone who wants to take a look. Trying it out should be as simple as:
svn co https://kenai.com/svn/jvm-frame-introspect~svn/frame/ javaframe
cd javaframe
ant test

The expected output is a few printouts of stack frame content.

The call frame introspection library gives you access to (a hypothetical usage sketch follows after this list):

  • One call frame at a given depth.
  • The entire stack of call frames for the current thread.
  • The stacks of call frames (without local variables) for any set of threads, or all threads, in the JVM.
  • The reflected method object that the frame is a representation of.
  • The local variables (if this information is added to the class file by javac):
    • The number of local variables.
    • The names of the local variables.
    • The signatures of the local variables.
    • The values of the (currently live) local variables.
    • Locals may be accessed either by name or by index.
      There is also a getThis() method as a shorthand for getting the first local variable, or null if the method is static.
  • The current value of the instruction pointer.
  • The current line number (if this information is added to the class file by javac).
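As promised, here is a hypothetical usage sketch mapping the capabilities above to code. Apart from getThis(), every class and method name here is an assumption of mine for illustration; see the repository for the actual API.

// Hypothetical names throughout (except getThis(), from the list above):
class FrameDemo {
    static void dump() {
        Frame frame = Frame.getFrame(1); // the caller's frame, like sys._getframe(1)
        System.out.println("method: " + frame.getMethod()); // the reflected method object
        for (int i = 0; i < frame.getLocalCount(); i++) { // needs local variable info in the class file
            System.out.println(frame.getLocalName(i) + " = " + frame.getLocalValue(i));
        }
        System.out.println("this: " + frame.getThis()); // null for static methods
        System.out.println("line: " + frame.getLineNumber());
    }
}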

Update 2008-11-23: I've added support for getting stack traces of call frame snapshots from multiple threads in the JVM. JVMTI guarantees that the stack traces of all threads are captured at the exact same point of execution. There are method calls for getting traces for a given set of threads, or for all JVM threads. These methods do not provide access to the local variables in the frames, since there is no way (apart from suspending the threads) to guarantee that the frame depth is the same at the point of capturing the stack trace as at the point of acquiring the local variables.
I've also made sure that the build scripts work under Linux as well as Mac OS X, and added licensing information (I use the MIT License for this). Also, by request, I've cleaned up the paragraphs of this entry.

It would be great to get input from my peers on what more information you would like to access from the call frames. The Java bytecode of the method, perhaps?
Charles and I also talked about what else we could use JVMTI for, and he was quite enthusiastic about the ability to walk the entire Java object graph, for implementing the ObjectSpace feature of Ruby. One idea would be to write a library that brings the entire functionality of JVMTI to the Java level. The only problem with this is that a lot of the JVMTI operations don't make much sense unless they are invoked in conjunction with other operations, and that many of the operations are not callback safe, meaning that we cannot allow the execution of arbitrary Java code in between the JVMTI operations. But it should be possible to create an abstraction with a more Java-esque API that performs multiple JVMTI operations at the JNI level.

Another interesting aspect of using JVMTI from Java code is that we can prototype a lot of the functionality that we want the JVM to expose to us directly, and thereby vote with code on what we want the JVM of the future to look like.

I hope you will find this useful!

Update: I have moved this project to Kenai; the links have been updated accordingly.

Thursday, September 11, 2008

Meta post: Querying languages versus APIs

This is a meta post about the differences between having a querying language and an API for accessing a database. I want to analyze the pros and cons of both approaches. I believe there are benefits of both models, and that they both have weaknesses.
These are my initial ideas on what to discuss in the post:
  • An API provides a type safe way of operating with the data model.
  • An API cannot expose query injection weaknesses.
  • Queries in a query language are handled completely internally by the engine, and can therefore be optimized more easily.
Please comment with references on querying languages and your opinion on the pros and cons of the two approaches.

Meta post: Introducing the concept of meta posts

I am introducing a new kind of post in this journal: meta posts. I have a list of posts that I want to write; each of these is going to be long, and take a while to write. Therefore I thought it would be easier for me, and lead to better posts, if I could get comments on them before I write them. That way I can find out what aspects of the topic to discuss, as well as get some references that might be useful for the post.

These meta posts will carry the tag meta post and I will greatly appreciate any comment on these posts. One of the key features of the meta posts is that I will keep them short, which is why this post ends here.

Monday, July 14, 2008

My JVM wishlist, pt. 1 - Interface injection

If you've been following what's going on in the JVM-languages community, you have probably stumbled across John Rose's blog. One of his entries is about interface injection. In short, interface injection is the ability to, at runtime, add an interface to a class that was not precompiled as implementing that interface.
Interfaces are injected in one of three situations:
  • When a method of the interface is invoked on an object.
  • When an instanceof check for the interface is performed on an object.
  • When the interface is queried for via reflection (I can see this working with Class#isAssignableFrom, but I have my doubts when it comes to Class#getInterfaces, although I'm sure someone smart will be able to solve this without having to know about all injectable interfaces beforehand).
When any of these occur, a class can either already implement the interface (in the regular way), or a special static injection method on the interface class is invoked. It is up to this injection method to determine whether the given class can implement the interface or not. If it determines that the class can implement the interface, any missing methods have to be supplied at that time. John suggests that these methods be supplied as method handles. Since method handles, according to the EDR of the InvokeDynamic proposal (JSR 292), can be curried, this would make it possible to attach an extra state object to the returned method handle, or to return a different implementation of the interface depending on the class it is injected into. Once an injection method of a specific interface has been invoked for a specific class, the injection method will never be invoked for that class again, meaning that once a class has been found not to implement an interface, that information is final, and once an implementation of an interface has been supplied, this implementation can never be changed. This is important since it allows the VM to perform all optimizations, such as inlining, as before. A sketch of what such an injection method might look like follows below.
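Here is a rough sketch of the shape such an injection method could take. The hook name, its signature, and the MyLangRuntime helper are all my inventions; the proposal does not fix any of this, and the java.dyn method handle API was still a draft at the time.

import java.dyn.MethodHandle; // draft JSR 292 API

interface MyLangObject {
    Object invokeMethod(String name, Object... args);
}

class MyLangObjectInjection {
    // The VM would call a hook like this at most once per class that is
    // asked for MyLangObject without implementing it.
    static MethodHandle[] injectInterface(Class<?> target) {
        if (target.isPrimitive() || target.isArray()) {
            return null; // rejected, and that answer is final for this class
        }
        // One handle per missing interface method; since handles can be
        // curried, per-class state can be bound into them here.
        return MyLangRuntime.handlesFor(target); // hypothetical helper
    }
}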

What can this be used for?
As a language implementer on the Java platform I think interface injection would be a blessing. In fact, I think it is the one thing that would simplify the implementation of languages on the JVM the most. Any language probably has a base interface (as Java has java.lang.Object and Jython has org.python.core.PyObject); let's be unbiased and call it "MyLangObject" for the sake of the continued discussion. There are two things that make the Java platform great:
  1. There are a lot of really good toolkits and libraries implemented for the Java platform.
  2. There are a lot of great languages for the Java platform in which even more great libraries and toolkits will be developed.
Therefore, if you are implementing a language for the Java platform, you will want to interact with all of these libraries and toolkits. The problem is that most of them haven't been designed with your language in mind, and they shouldn't be. If MyLangObject were an injectable interface, all you would need to do to integrate with any object from another language would be to interact with it through the MyLangObject interface, and the injection mechanism would take care of the rest.
The injection mechanism could even be used with the classes within your language. Instead of having a base class supply the default implementations of the methods of MyLangObject, you could let the injection method return the default implementations of your methods.
Or why not use interface injection to support function invocation with different argument counts? Each function in your language would implement a set of call methods, one for each argument count it can be invoked with. Your language would then have a set of injectable Callable interfaces, one for each argument count that any function in your language can be invoked with, each with only one call method with the appropriate number of arguments. These interfaces could be generated at runtime if your language supports runtime code generation. The default implementation of the call method in each Callable interface would of course raise an exception, since the function obviously doesn't support that argument count if it doesn't implement the appropriate method. A sketch of this idea follows below.
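A sketch of the per-argument-count idea, with names invented for illustration:

// Injectable per-arity interfaces:
interface Callable0 { Object call(); }
interface Callable1 { Object call(Object arg0); }
interface Callable2 { Object call(Object arg0, Object arg1); }

// A function precompiled for exactly two arguments:
class MyBinaryFunction {
    public Object call(Object a, Object b) {
        return null; // the actual function body would go here
    }
    // Injecting Callable2 finds call(Object,Object) already present, while
    // injecting Callable0 or Callable1 would supply a default handle that
    // raises the language's "wrong number of arguments" error.
}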
Interface injection really does provide a huge set of opportunities.

How could interface injection implement a Meta Object Protocol?
There is a great project initiated by Attila Szegedi of the JVM-languages community to create a common meta object protocol (MOP) for all languages on the JVM. With interface injection this would be a simple task.
  1. Let all objects in your language (that supports the MOP) implement the (probably not injectable) interface java.dyn.SupportsMetaObjectProtocol. An interface with only one method:
    java.dyn.MetaObjectProtocol getMetaObjectProtocol();
    This would return the implementation of the java.dyn.MetaObjectProtocol interface for your particular language.
  2. The java.dyn.MetaObjectProtocol interface contains methods for getting method handles for each dynamic language construct that the community has agreed to be a good common construct, such as getters and setters for subscript operations. These method handles would come from the actual implementation of them for your particular language, and would therefore benefit from every imaginable optimization you have cooked up for your language.
  3. When the main interface of my language is being injected into a class from your language, it finds that your class implements java.dyn.SupportsMetaObjectProtocol and uses that to get the method handles for all dynamic language constructs supported by my language, rebinding them to the method names used in my language.
And just like that, interface injection has been used to implement common ground for all languages on the Java platform with absolutely no overhead. I'm not saying that this is the way to implement a meta object protocol for the Java platform; I am just suggesting one way to do it. Someone a lot smarter than me might come up with a much better implementation.

To sum things up: I can't wait until the JVM supports interface injection.

Edit: this post has also been re-posted on Javalobby.

Thursday, July 10, 2008

The state of the advanced compiler

First a disclaimer:
When I say that I will blog regularly, obviously you should not trust me!
I have realized that I don't want to get better at blogging regularly, since I kind of think it's boring and it diverts my focus from the more important stuff: the code. But this does not mean that I will not blog more frequently in the future. I might do that; all I am saying is that I will never promise to do more blogging.
Even if I am not blogging about it, the advanced Jython compiler is making progress, just not as fast as I would like it to... The current state of things in the advanced branch of Jython is that I have pluggable compilers working, so that I can switch which compiler Jython should use at any given time. This enables me to test things more easily, and to evolve things more gradually.
I am still revising my ideas about the intermediate representation of code, and it is still mostly on paper. My current thinking is that perhaps the intermediate representation should be less advanced than I had first planned. I will look more at how PyPy does this, and then let the requirements of the rest of the compiler drive the needs of the IR.
An important change in direction was made this week. It came out of discussion with my mentor, Jim Baker, and from greater insight into how PyPy and Psyco work, from listening to the brilliant people behind these projects at EuroPython. I had originally intended the advanced compiler to do most of its work up front, and to get rid of PyFunctionTable from the Jython code base. The change in direction is the realization that this might not be the best approach. A better approach would be to have the advanced compiler work as a JIT optimizer, optimizing on actual observed types, which will probably give us a greater performance boost. This also goes well with the idea I have always had of having more specialized code object representations for different kinds of code.
The way responsibilities will be divided between object kinds in the call chain is (see the sketch after this list):
  • Code objects contain the actual code body that gets executed by something.
  • Function objects contain references to code objects. A function starts out with a single reference to a general code object; then, as time progresses, the function gets hit by different types, which will trigger an invocation of the advanced compiler to create a specialized version of the code. That version is also stored in the function, for use when that particular signature is encountered in the future.
    The functions also contain the environment for use in the code. This consists of the closure variables of the function, and the global scope used in the function.
  • Frame objects provide the introspection capabilities into running code needed by the locals() and globals() functions, and by pdb and similar tools. There should be a couple of different frame implementations:
    • Full frames. These contain all of the function state: closure variables, locals, the lot. These are typically used with TableCode implementations of code.
    • Generator frames. These are actually divided into two objects. One generator state object that (as the name suggests) contains the state of the generator; this is most of what a regular frame contains, except the previous frame in the call stack. The other object contains the previous frame in the call stack, and wraps the generator state object to provide the frame interface.
    • Lazy frames. These are frames that contain almost nothing. Instead they query the running code for their state. I hope to be able to have them access the actual JVM stack frames for this, in which case they will be really interesting.
    The function object should be responsible for handling the life cycle of the frame objects, but I have not entirely worked out whether the creation of the frame object should be up to the code object or the function object. The code object will know exactly which implementation to choose, but then again, we might want to have different function implementations as well, so it might make sense to move the entire responsibility for frames to functions.
  • Global scope objects. The global scope could be a regular dictionary. But if we have a special type for globals (that of course supports the dictionary interface) we can have code objects observing the globals for changes, to allow some more aggressive optimizations, such as replacing Python loops (over range) with Java loops (with an int counter), so that the JVM JIT can perform all of its loop unrolling magic on them.
  • Class objects. These are always created at run time, unlike regular JVM classes, which are created at compile time. Since classes are defined by what the locals dictionary looks like when the class definition body terminates, it is quite hard to determine what the actual class, as created at "define time", will look like. Although in most cases we can statically determine what most of the class will look like.
  • Class setup objects. These are to class objects what code objects are to functions. They contain the code body that defines a class, but also a pre-compiled JVM class that contains what the compiler has determined the interface of the class to be.
    Both class objects and class setup objects are fairly far in the future though, and will not be part of the initial release of the advanced compiler. They might in fact never be, if I find a better way of doing things before I get there.
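A sketch of the function/code split described in the list above. The names loosely echo Jython's function and code objects, but this is my illustration of the idea, not actual Jython code; a real implementation would trigger specialization based on profiling rather than on the first call.

import java.util.HashMap;
import java.util.Map;

interface Code {
    Object call(Object[] args);
}

class Function {
    private final Code generalCode; // works for any argument types
    private final Map<String, Code> specialized = new HashMap<String, Code>();

    Function(Code generalCode) { this.generalCode = generalCode; }

    Object call(Object[] args) {
        String signature = signatureOf(args);
        Code code = specialized.get(signature);
        if (code == null) {
            // Hand the observed signature to the advanced compiler and
            // cache the specialized code object it produces.
            code = compileSpecialized(signature);
            specialized.put(signature, code);
        }
        return code.call(args);
    }

    private static String signatureOf(Object[] args) {
        StringBuilder sig = new StringBuilder();
        for (Object arg : args) {
            sig.append(arg == null ? "null" : arg.getClass().getName()).append(';');
        }
        return sig.toString();
    }

    private Code compileSpecialized(String signature) {
        return generalCode; // placeholder: fall back to the general version
    }
}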
The other interesting change that the advanced compiler project will introduce, beyond these specialized call path objects, is of course the actual optimizations that they enable:
  • Type specializations.
    • Primitive type unpacking. This is the first, most simple, and most basic type specialization. When we detect that a specific object is (often) of a primitive number type and used extensively, we can generate a version where the number is unpacked from the containing object, and the operations on the number are compiled as primitive number operations.
    • Direct invocation of object types. When we detect more complicated object-oriented types, we can determine the actual type of the object, find the actual method body for the method we are invoking, and insert direct linkage to that body instead of going through the dispatch overhead.
  • Inlining of builtins. When we detect that a builtin function is used extensively, we can inline the body of that builtin, or invoke the builtin function directly without dispatch overhead.
  • Expression inlining. Some expressions, in particular generator expressions, imply a lot of overhead, since they generate hidden functions that are only invoked locally, quite often at the same place as they are defined. In this case we can immediately inline the actual action of the expression, so that for example a Python expression such as this:
    string = ", ".join(str(x) for x in range(14, 23))
    Could be transformed into the following Java snippet:
    StringBuilder _builder = new StringBuilder();
    for (int _i = 14; _i < 22; _i++) {
        _builder.append(_i);
        _builder.append(", ");
    }
    _builder.append(22);
    string = _builder.toString();

    Or in this particular case, perhaps even constant folded...
  • Loop transformation. As in the case above, a loop over the Python range iterator can be transformed into a regular JVM loop. This might just be a special case of builtin inlining though.
The combination of the abstraction and indirection between the function object and the code object, and these optimizations, is exactly the powerful tool we need to be able to do really interesting optimistic optimizations while maintaining the ability to back out of such decisions, should they prove to have been too optimistic. All in all this will provide Jython with a fair improvement in execution speed.

So that's a summary of the current work in progress. If I look into my magic 8-ball and try to predict the future, I would say that one interesting idea would be to have the compiler persist the optimized code versions, so that the next time the program is executed the previous optimizations are already there. This would in fact be a good way of supporting the "test driven optimizations" that you might have heard Jim and/or me rant about. So there is definitely cool stuff going on. I can't wait to write the code and get it out there, which is why I hereby terminate this blogging session in favour of some serious hacking!

EuroPython 2008

So I've spent three days at EuroPython 2008 in Vilnius, Lithuania with my colleagues Jim and Frank, Ted from Sun, and of course a lot of other people in the European Python community. Most notably, we've spent a fair amount of time talking to the PyPy group. Since they are mostly based in Europe, they didn't have a large presence at PyCon.
Jim and I did two talks together. The first one was a tutorial about various ways of manipulating and compiling code in Python, and how to do that while staying compatible across Python implementations. The second was a presentation about the internals of Jython, showing how similar Jython is to CPython internally, and walking through how you go about supporting some of the cool features of Python. We also managed to sneak in some slides about our ongoing work with the advanced compiler and where that will take us (more on that in a later post). From where I was sitting it seemed people were interested in what we were talking about, and I think our presentations were fairly well received.
Yesterday we had a meeting with the PyPy group resulting in a list of tasks on which our two teams are to collaborate. I think this was very interesting and I believe both sides will get substantial benefits from this effort. It is also my hope that this list will not be complete, but that we will find more interesting tasks to collaborate around after the completion of these tasks.
The most important task for us at the moment is to get ctypes ported to Jython and the JVM. This is important for the PyPy team as well, since it will make their JVM back end more complete. The way we outlined it, the implementation would be a JNI bridge to _rawffi, the part of ctypes that is implemented in C, and would then use the same Python-level implementation of the API as PyPy does. Another way of doing it would of course be to use JNA, but I actually think the JNI path might be more maintainable in this case, since PyPy still needs the C version of _rawffi for pypy-c.
Personally I am very interested in the PyPy JIT, and I think that my work on the advanced Jython compiler could be very useful for the PyPy team when they start their effort on a JVM back end of their JIT. I also think that I can use a lot of the ideas they have implemented in their JIT project in the advanced compiler.
I will not go into the entire list of collaboration points, since the PyPy blog post linked above does a good job there, but I would also like to mention the effort of sharing test cases, which I think is highly important.
At the moment Jim and I are in the PyPy sprint room here at EP, and we just have some blogging to do before we get our hands dirty with code.

Tuesday, July 08, 2008

Simple stuff with import hooks in Python

Yesterday Jim and I had a tutorial session at EuroPython about dynamic compilation in Python. We brought up the topic of import hooks (PEP 302), since we have successfully used them as an opportunity to create code dynamically. The code example we demonstrated was one of the actual import hooks that we had used in a Jython setup. Even though it was no more than 80 lines of code, it might not have been the most accessible example. A few people asked me afterwards if I had a simpler example, so by public request, here is a simple meta_path hook that prevents the user from importing certain modules.
import sys

class Restriction(object):
    __forbidden = set()

    @classmethod
    def add(cls, module_name):
        cls.__forbidden.add(module_name)

    def find_module(self, module_name, package_path):
        if package_path: return
        if module_name in self.__forbidden:
            return self

    def load_module(self, module_name):
        raise ImportError("Restricted")

sys.meta_path.append(Restriction())
add = Restriction.add
del Restriction
If we walk through this class from the top, we first have a set containing the names of the modules that we will not allow the user to import, and a method for adding modules to that set.
The next method, find_module, is the method invoked by the import subsystem. It is responsible for locating the requested module and returning an object capable of loading it. The arguments passed to find_module are the fully qualified module name and, if the module is a submodule of a package, the path where that package was found. If the module cannot be found, one should either return None or raise an ImportError; this is a handy way to delegate to the next import hook or the default import behavior. If it returns a loader object, on the other hand, no other import mechanism will be attempted, which of course is useful in this case. So in this implementation we return self if the module name is found to be one of the modules that we want to prevent the user from importing.
The loader object should implement the method load_module, which takes only one argument, the fully qualified module name, and returns the corresponding module, or raises an ImportError if the import fails. In this case we know that load_module is only ever invoked when we want the import to fail, so we always raise an ImportError.
There really isn't much more to it. It should be noted, however, that this isn't a good way to implement security restrictions in Python, since any user code can remove the import hook from sys.meta_path, but I still think it makes for a good introductory example of import hooks.

Happy hacking!

Monday, May 05, 2008

Jython call frames and Dynamic Languages Symposium

The call-for-papers deadline for the Dynamic Languages Symposium was last Friday (the 25th). I was hard at work the week leading up to the 25th, working on different implementations of how to represent call frame objects. Sadly, I didn't manage to get all the results we wanted and write a paper about it before the CFP deadline for DLS. It was still good work though; I have a few different implementations going now. I still need to do some more testing, benchmarking, and comparison of the different implementations before I publish the results here. After that I was swamped with work during the weekend, and then I went to San Francisco for JavaOne, and didn't get a reliable Internet connection until now. This sums up why I didn't publish a blog post last week. Besides these updates, there isn't more interesting news for this post. There will be more to write about on Friday, when Jim and I have done our JavaOne presentation. I will post then with an update on how it went. Over and out.

Friday, April 18, 2008

Jython thesis project

I need to get better at publishing stuff here. I also need to publish the progress of my master's thesis project.
If you know me you already know that I am doing a thesis project related to my work on Jython. If you don't know me, and/or didn't know that before, you know it now.
The tentative title of the thesis is "Supporting dynamic languages on the present and future implementations of the Java Virtual Machine". I am investigating how we could better represent dynamic languages on the JVM, with a focus on Python. As part of this I am evaluating the suggested new features of the Da Vinci Machine, and how that project would aid implementations of dynamic languages, Jython in particular. But I am also looking into how to support the benefits of the future platform on current versions of the JVM, something I know the entire JVM-languages community is interested in. Furthermore, I am trying to evaluate what the Da Vinci Machine project has missed, by compiling a wish list of features that the JVM could implement in order to better support dynamic languages. Again, this list is compiled from my findings with Python on the JVM.
I have been working on Jython for about a year now, and I've had the idea of doing my thesis on something related to Jython for quite some time as well, although it took me some time to define exactly what to do. Mostly because I've had too much fun stuff to do at work. With this post I want to mark an end to that phase. As of a week ago I have written and submitted my project plan to my examiner, and gotten it accepted. I have also made a few modifications based on the comments I got back.
In the project plan I have stated that I will post weekly updates on the progress of the project here every Friday. So here is the first of those posts.
The most recent achievement, this week, is that I've set up a computer with the MLVM so that I can start trying out the new features it has to offer. I have already ported (most of) the compiler I wrote during the Summer of Code to Java, so I have a platform to continue building this project on.
The most immediate task ahead of me now is writing a paper for the Dynamic Languages Symposium. Jim and I have some plans for what we would like to publish there, and that is what I am working on at the moment. The compiler doesn't yet support all the stuff that we want to include in this paper, though, so I will have to use a slightly slower compiler with some scheduling restrictions: myself. But hand compiling is a good way to experiment with how to represent code anyway, so I don't mind.

Sunday, March 16, 2008

The next step to increase Python compatibility in Jython

Last night at PyCon, at the Jython dinner (thanks go to Sun for feeding us), we followed up on a discussion from earlier in the day about Python and concurrency. We realized that CPython is far ahead of us in this area, and that we are lacking in compatibility. After serious brainstorming (and a few glasses of wine), we decided that introducing a Global Interpreter Lock in Jython would solve the entire issue! I hacked together an implementation while we were having coffee, and it is now in trunk. To ensure that this does not break any old code written for Jython, the GIL is added as a future feature. To get GIL support today, check out the Jython trunk, start the interpreter, then type:
from __future__ import GIL
Happy hacking!