Saving and evaluating network paths in Neo4j
A Relationship Thing
The Neo4j graph database is much better suited than relational databases for storing and quickly querying nodes and their mutual relationships. If your circle of friends is not wide enough to warrant a graph-based application, you might just want to inventory your LAN.
Modeling structures like the social graph of Facebook, connections to friends and their acquaintances, or your follower structure on Twitter is surprisingly difficult with traditional databases. Trying to map a network path – easily represented with squiggles and arrows on a whiteboard – with a relational model inevitably leads to performance-hungry join statements, the natural enemy of responsive websites.
The Neo4j [1] graph database natively stores graph models and offers fantastic performance – as long as you don't overcook the complexity of the queries. Its generic storage model consists of nodes and relationships. Both can possess attributes; for example, a node that represents a person could contain a name field for storing the name, and an is_friends_with relationship could carry its intensity (best_friend, casual_friend).
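As a minimal sketch of this storage model, the following statement in Cypher (the query language covered in the next section) creates two person nodes and one relationship that carries an attribute. The names Alice and Bob and the attribute name intensity are made up for illustration; only name, is_friends_with, and best_friend come from the description above.

```cypher
// Two nodes with a name attribute, joined by an is_friends_with
// relationship that stores its intensity as an attribute.
// "Alice", "Bob", and the key "intensity" are hypothetical.
CREATE (a {name: "Alice"})-[:is_friends_with {intensity: "best_friend"}]->(b {name: "Bob"});
```

Nodes and relationships are treated alike here: Both take their attributes as a list of key/value pairs in curly braces.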
Cypher Query Language
The Neo4j query processor accepts queries in the SQL-style Cypher language, rummages through the data in the database, and quickly returns results, which Cypher can also filter and process in SQL style (sorting, grouping, and so on).
After you install the GPL-licensed Neo4j Community Server (there's also a commercial enterprise version), it listens on port 7474 for commands received either via REST or via the newer, simple JSON processor. Clients can be written in several dozen languages; for Perl, CPAN offers the REST::Neo4p module.
The Debian package offered on the Neo4j site [1] also includes a handy command shell, neo4j-sh. Much like the interactive MySQL client, it lets you run commands that insert new data into the model and extract stored information via Cypher queries.
Declaratively Powerful
Cypher is, like SQL, declarative: You specify the results you are looking for, but you don't need to write procedural statements that describe exactly how to find them. Match statements define which data are of interest (e.g., "find all data" or "find all relations of type is_friends_with"); where clauses then reduce the number of matches. For example, the requesting user may only be interested in people who are 18 years or older.
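The interplay of match and where can be sketched with a short query. It assumes that person nodes carry a numeric attribute named age, which is a hypothetical name chosen for this example; the is_friends_with relation and the name attribute are the ones mentioned earlier.

```cypher
// MATCH: every pair of nodes linked by an is_friends_with relation.
// WHERE: keep only pairs whose target is 18 or older
//        (the "age" attribute is an assumption).
MATCH (a)-[:is_friends_with]->(b)
WHERE b.age >= 18
RETURN a.name, b.name;
```

Note that the query says nothing about how the database should walk the graph; it only describes the pattern and the condition the results must meet.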
Subsequent processing steps remodel, sort, or collate the data. Even running further match statements against the results list is permitted, as well as intermediate actions to generate new data on the fly.
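One way to chain such steps is the WITH clause, which passes an intermediate result to the next part of the query. The following sketch, which again relies on the is_friends_with relation and the name attribute, first determines the three nodes with the most outgoing friendships and then runs a second match against just those nodes. The aliases friends and names are arbitrary.

```cypher
// Step 1: count the friends of each node and keep the top three.
MATCH (a)-[:is_friends_with]->(b)
WITH a, count(b) AS friends
ORDER BY friends DESC
LIMIT 3
// Step 2: a further match against the intermediate result
// collects the names of those friends.
MATCH (a)-[:is_friends_with]->(c)
RETURN a.name, friends, collect(c.name) AS names;
```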
The graph of the home network in Figure 1 is intended to illustrate some practical queries. Networks are, in fact, a popular use case for Neo4j with its nodes and relations. To determine whether a router can reach the open Internet via other nodes, the database often needs to find an open path from A to B via craftily connected nodes. This can cause performance to collapse on relational systems but can often be tackled with ease by graph databases.
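A path search of this kind might look like the following sketch. It assumes that each hop is stored as a relation of the type gateway, that the starting router is named internal, and that the open Internet is modeled as a node named internet. The node name internet and the limit of 10 hops are assumptions made for this example, not details taken from Figure 1.

```cypher
// Follow between 1 and 10 gateway hops from the "internal" router
// and return the first path that ends at the hypothetical
// "internet" node. No rows come back if no such path exists.
MATCH p = (a)-[:gateway*1..10]->(b)
WHERE a.name = "internal" AND b.name = "internet"
RETURN p
LIMIT 1;
```

The asterisk in the relation pattern tells Cypher to follow a variable number of hops, which is the part that would require a chain of self-joins in a relational model.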
Hand-Reared
For example, to add the router named internal in Figure 1 to the database and assign it the LAN IP 192.168.2.1, you would just do this in the Neo4j shell:
neo4j-sh (?)$ CREATE (router {name:"internal", lan_ip:"192.168.2.1"});
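To check that the node actually landed in the database, a query along the following lines can read it back. This is only a sketch; it relies on nothing but the name and lan_ip attributes set in the CREATE statement above.

```cypher
// Look up the node by its name attribute and show both attributes.
MATCH (n)
WHERE n.name = "internal"
RETURN n.name, n.lan_ip;
```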
After you create another node named merger to act as the gateway of the internal router, a Cypher query locates both nodes and defines the gateway relation between them with Cypher's own ASCII art syntax:
neo4j-sh (?)$ MATCH (a), (b)
> WHERE a.name = "internal" AND b.name = "merger"
> CREATE (a)-[r:gateway]->(b);
The match operation finds two nodes, to which it assigns the aliases a and b. Because the match clause contains no further search pattern, a and b initially range over all the nodes in the database. However, the following WHERE clause restricts the results to two precisely named nodes, and the CREATE statement uses the syntax -[...]-> to draw a named arrow between the identified nodes, thus creating a relation of the gateway type.
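The new relation can then be followed in a query. The sketch below assumes that the database contains only the two nodes and the single gateway relation created in this section; in that case, it should list merger as the gateway of the internal router.

```cypher
// Follow the gateway arrow from the "internal" router
// and show the name of the node it points to.
MATCH (a)-[:gateway]->(b)
WHERE a.name = "internal"
RETURN b.name;
```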