Friday, December 05, 2008

About types in Neo4j

When I first started using Neo4j I wondered, "Why is there a RelationshipType and no NodeType?" As more people are introduced to Neo4j, I find that this is quite a common question, and of course an obvious one in the strongly typed, single-inheritance world of Java. This post was inspired by a discussion on the Neo4j mailing list.

The answer to the question is to be found in the name RelationshipType: it means no more than the name implies. In particular, it does not mean data type. What you are wishing for when you ask for a node type is a way of specifying what properties a node or relationship has, based on its type. Neo4j does not provide any mechanism for this at the core layer. There is a meta model component available that gives you data types for nodes and relationships, with verification and all of that, if you would like it, but you need to explicitly turn it on.

So what are RelationshipTypes then? A relationship type is a navigational feature of Neo4j. It is used to implement what is known in graph theory as edge-labeled multigraphs. This feature makes it a whole lot easier to navigate through a graph that represents application data. Adding similar labels to nodes would not provide any navigational benefit, which is why Neo4j does not implement such a feature.
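To make the navigational role of an edge label concrete, here is a minimal sketch of an edge-labeled multigraph. It is a hypothetical in-memory model written for this post, not the Neo4j API: the names EdgeLabeledGraphSketch, RelType, connect and follow are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model for illustration only; this is not the Neo4j API.
class EdgeLabeledGraphSketch {

    // Plays the role of a relationship type: a label on an edge, nothing more.
    enum RelType { KNOWS, WORKS_WITH }

    static final class Edge {
        final RelType type;
        final Node target;

        Edge(RelType type, Node target) {
            this.type = type;
            this.target = target;
        }
    }

    static final class Node {
        final String name;
        final List<Edge> outgoing = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }

        // A multigraph allows several edges between the same pair of nodes.
        void connect(RelType type, Node target) {
            outgoing.add(new Edge(type, target));
        }

        // Navigation: follow only the edges that carry the requested label.
        List<String> follow(RelType type) {
            List<String> names = new ArrayList<>();
            for (Edge edge : outgoing) {
                if (edge.type == type) {
                    names.add(edge.target.name);
                }
            }
            return names;
        }
    }

    public static void main(String[] args) {
        Node alice = new Node("Alice");
        Node bob = new Node("Bob");
        Node carol = new Node("Carol");

        alice.connect(RelType.KNOWS, bob);
        alice.connect(RelType.WORKS_WITH, bob); // second edge to the same node
        alice.connect(RelType.KNOWS, carol);

        System.out.println(alice.follow(RelType.KNOWS));      // [Bob, Carol]
        System.out.println(alice.follow(RelType.WORKS_WITH)); // [Bob]
    }
}
```

The label only decides which edges are followed from a node; it says nothing about which properties the nodes at either end carry.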

It is well worth noticing that RelationshipType can be used to implement data types for both relationships and nodes, in a way that allows nodes to have multiple (union) data types.
The way that you implement this in Neo4j is that the data type of a node is determined by the RelationshipType of the relationship it was reached through. A node n1 reached through a relationship r1 with the RelationshipType Ta is said to have the node data type Da, while the same node n1 reached through another relationship r2 with the RelationshipType Tb is said to have the node data type Db. Although it is the same node, the application logic will treat it completely differently, accessing different properties (with or without overlap). This is a very powerful feature, and one that is not as easy to implement when a label on the node determines its data type.
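A rough sketch of the idea, again using a hypothetical in-memory model rather than the Neo4j API: one node is reached through two differently typed relationships and is treated as two different data types. The names NodeViewSketch, EMPLOYS, SELLS_TO, Employee, Customer and viewFor are assumptions invented for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model for illustration only; this is not the Neo4j API.
class NodeViewSketch {

    enum RelType { EMPLOYS, SELLS_TO }

    // A node is just a bag of properties; it has no type of its own.
    static final class Node {
        final Map<String, Object> properties = new HashMap<>();
    }

    interface View {
        String describe();
    }

    // Data type used when the node was reached through EMPLOYS.
    static final class Employee implements View {
        private final Node node;

        Employee(Node node) {
            this.node = node;
        }

        public String describe() {
            return "Employee " + node.properties.get("name")
                    + " with salary " + node.properties.get("salary");
        }
    }

    // Data type used when the same node was reached through SELLS_TO.
    static final class Customer implements View {
        private final Node node;

        Customer(Node node) {
            this.node = node;
        }

        public String describe() {
            return "Customer " + node.properties.get("name")
                    + " with number " + node.properties.get("customerNumber");
        }
    }

    // The relationship type that led to the node selects its data type.
    static View viewFor(RelType reachedThrough, Node node) {
        switch (reachedThrough) {
            case EMPLOYS:
                return new Employee(node);
            case SELLS_TO:
                return new Customer(node);
            default:
                throw new IllegalArgumentException("Unknown type: " + reachedThrough);
        }
    }

    public static void main(String[] args) {
        Node n1 = new Node();
        n1.properties.put("name", "Alice");        // shared by both views
        n1.properties.put("salary", 1000);         // only read by Employee
        n1.properties.put("customerNumber", 42);   // only read by Customer

        System.out.println(viewFor(RelType.EMPLOYS, n1).describe());
        System.out.println(viewFor(RelType.SELLS_TO, n1).describe());
    }
}
```

Both views read the overlapping property "name", while "salary" and "customerNumber" are each only touched by one of them.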

The design guide for Neo4j recommends that domain model objects are implemented by wrapping nodes and relationships, and that the wrapper class is determined by the type of the traversed relationship, as mentioned above. To make this even simpler, the Neo4j community has developed a few components that automate this process. Most notably there is neo-weaver, which automatically implements an interface or abstract class by means of getting and setting properties on a node/relationship or by traversing relationships.
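To give a feel for what automatically implementing an interface on top of properties can look like, here is a small sketch built on the JDK's java.lang.reflect.Proxy. It is not neo-weaver's code or API; a plain Map stands in for the node's properties, and the names PropertyProxySketch, Person and wrap are hypothetical.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

// Illustration only: a Map stands in for a node's properties.
// This is not how neo-weaver is implemented, just the general idea.
class PropertyProxySketch {

    // The domain interface we would like to have implemented for us.
    interface Person {
        String getName();
        void setName(String name);
    }

    // Turns "getName" / "setName" into the property key "name".
    static String propertyKey(String methodName) {
        return Character.toLowerCase(methodName.charAt(3)) + methodName.substring(4);
    }

    @SuppressWarnings("unchecked")
    static <T> T wrap(Class<T> type, Map<String, Object> properties) {
        InvocationHandler handler = (proxy, method, args) -> {
            String name = method.getName();
            // Proxy passes null for args when the method takes no arguments.
            if (name.length() > 3 && name.startsWith("get") && args == null) {
                return properties.get(propertyKey(name));
            }
            if (name.length() > 3 && name.startsWith("set")
                    && args != null && args.length == 1) {
                properties.put(propertyKey(name), args[0]);
                return null;
            }
            throw new UnsupportedOperationException(name);
        };
        return (T) Proxy.newProxyInstance(
                type.getClassLoader(), new Class<?>[] { type }, handler);
    }

    public static void main(String[] args) {
        Map<String, Object> nodeProperties = new HashMap<>();
        Person person = wrap(Person.class, nodeProperties);

        person.setName("Alice");                  // stored as property "name"
        System.out.println(person.getName());     // read back through the proxy
        System.out.println(nodeProperties);       // the underlying properties
    }
}
```

The domain code only ever sees the Person interface, while all state lives in the underlying properties.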

Happy Hacking!
