Docker deployment of JanusGraph. To run:
docker-compose up --build
Note that a version of Docker Compose with support for version 3 schemas is required, e.g. 1.15.0 or newer.
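As a quick pre-flight check, the version requirement can be sketched in Python as follows. The helper names are hypothetical, and the floor of 1.15.0 is simply the example version named above. The sketch only compares a version string, such as the one printed by docker-compose version --short, and does not call Docker itself.

```python
import re

# Example minimum named in this README (assumption: treated here as a hard floor).
MIN_COMPOSE_VERSION = (1, 15, 0)


def parse_version(text):
    """Extract the first 'major.minor.patch' triple found in a version string."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def supports_compose_v3(version_text, minimum=MIN_COMPOSE_VERSION):
    """Return True if the given Compose version is at least the minimum."""
    return parse_version(version_text) >= minimum
```

Comparing integer tuples rather than raw strings avoids the usual pitfall that "1.9.0" sorts after "1.15.0" lexicographically.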
Afterwards, you can connect to the local Gremlin shell using
docker-compose exec janus ./bin/gremlin.sh
The python-test subdirectory contains some simple Python scripts to test communication with JanusGraph.
The Dockerfile and its surrounding files were taken more or less directly from Titan setups.
For multiple graphs in Titan (and likely also JanusGraph), follow these links:
- How many graphs can i create in one Titan DB?
- Serving multiple Titan graphs over Gremlin Server (TinkerPop 3)
- One graph in one Titan instance
As per the JanusGraph compatibility matrix, the supported Cassandra version is 3.11 and the supported Elasticsearch version is 6.6. This repository uses Scylla instead of Cassandra; according to the Scylla Cassandra compatibility matrix, Scylla 3.0 is a drop-in replacement for Cassandra 3.11.
The latest commit using Cassandra in this repo is 39c537de03a1bb7a65138b535df1ff003e8c4ec6, if you are interested in that.
This Docker example loads an airline graph and exposes it as graph g in scripts/airlines-sample.groovy.
After opening the Gremlin shell in Docker by running e.g.
docker-compose exec janus ./bin/gremlin.sh
you should be greeted by the Gremlin REPL shell:
         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----
plugin activated: tinkerpop.server
plugin activated: tinkerpop.hadoop
plugin activated: tinkerpop.utilities
plugin activated: aurelius.titan
plugin activated: tinkerpop.tinkergraph
gremlin>
From here, connect to JanusGraph with a session, then forward all commands to the remote server using :remote console (this allows skipping the :> syntax that is otherwise required):
:remote connect tinkerpop.server conf/remote.yaml session
:remote console
You will find the airlines data exposed as graph g. We can inspect the vertex count by running e.g.
g.V().count()
This should return a value of 47. Note that after restarting, the graph is imported again, resulting in data duplication. To drop all vertices and edges and then re-import from scratch, we can run
g.V().drop().iterate()
airlines.io(graphml()).readGraph('data/air-routes-small.graphml')
g.tx().commit()
To build an index over the code property, run
mgmt = airlines.openManagement()
code = mgmt.getPropertyKey('code')
mgmt.buildIndex('byCodeUnique', Vertex.class).addKey(code).unique().buildCompositeIndex()
mgmt.commit()
airlines.tx().commit()
We can then, for example, get all properties of the vertex with code JFK:
g.V().has('code', 'JFK').valueMap()
This should return:
==>{code=[JFK], type=[airport], desc=[New York John F. Kennedy International Airport], country=[US], longest=[14511], city=[New York], elev=[12], icao=[KJFK], lon=[-73.77890015], region=[US-NY], runways=[4], lat=[40.63980103]}
We could now run path queries, e.g. find a path between Honolulu International and Houston Hobby and return the airport codes and city names:
g.V().has('code', 'HNL').repeat(out().simplePath()).until(has('code', 'HOU')).path().by(valueMap('code', 'city')).limit(1)
This should return:
==>path[{code=[HNL], city=[Honolulu]}, {code=[DFW], city=[Dallas]}, {code=[HOU], city=[Houston]}]
To leave the shell, type :quit.
You have to choose the Channelizer to work with, e.g. HttpChannelizer, WebSocketChannelizer or WsAndHttpChannelizer/JanusGraphWsAndHttpChannelizer.
Using the JanusGraphWsAndHttpChannelizer
channelizer: org.janusgraph.channelizers.JanusGraphWsAndHttpChannelizer
allows HTTP access to JanusGraph, for example to evaluate 100 - 1 (hint: it's 99):
curl "http://localhost:8182/?gremlin=100-1"
or running complete queries (URL encoded):
curl "http://localhost:8182/?gremlin=g.V().has(%27code%27,%20%27JFK%27).valueMap()"
... which is a bit clearer when using a JSON POST:
curl -X POST http://localhost:8182/ \
-H 'Content-Length: 52' \
-H 'Content-Type: application/json' \
-H 'Host: localhost:8182' \
-d '{ "gremlin": "g.V().has('\''code'\'', '\''JFK'\'').valueMap()" }'
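The same two request styles can be sketched in Python using only the standard library. This is a minimal illustration, not part of the repository: the function names are hypothetical, and the endpoint is assumed to be localhost:8182 as in the curl examples. The characters (, ) and , are left unescaped only to mirror the URL shown above.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Gremlin Server listens on localhost:8182, as in the curl examples.
BASE_URL = "http://localhost:8182/"


def build_get_url(query, base_url=BASE_URL):
    """Build the URL-encoded GET form; '(', ')' and ',' stay literal as in the curl example."""
    return base_url + "?gremlin=" + urllib.parse.quote(query, safe="(),")


def build_post_request(query, base_url=BASE_URL):
    """Build a JSON POST request; urllib adds Content-Length itself when sending."""
    body = json.dumps({"gremlin": query}).encode("utf-8")
    return urllib.request.Request(
        base_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(query):
    """Send the query via POST and return the decoded JSON response (needs a running server)."""
    with urllib.request.urlopen(build_post_request(query)) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the containers running, run_query("g.V().has('code', 'JFK').valueMap()") would issue the POST. Note that the hard-coded Content-Length of 52 in the curl call matches that exact body only; curl computes the header on its own when it is omitted.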
To test connectivity with Python, try out the scripts in the python-test/ directory.
A conda environment is provided in environment.yaml:
conda env create -f environment.yaml
conda activate janusgraph
Try running the gremlinpython example:
python test_gremlin_python.py
This should output:
Hop 1: HNL - Honolulu
Hop 2: DFW - Dallas
Hop 3: HOU - Houston
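These lines correspond to the HNL to HOU path query from the Gremlin shell section. As an illustration only (the actual test_gremlin_python.py may work differently), the following sketch turns a path whose elements are valueMap-style dictionaries into that output. The function name format_hops and the hard-coded sample data are assumptions; no server connection is made.

```python
def format_hops(path):
    """Turn valueMap-style path elements into 'Hop N: CODE - City' lines."""
    lines = []
    for hop, element in enumerate(path, start=1):
        # valueMap() wraps every property value in a list, hence the [0].
        code = element["code"][0]
        city = element["city"][0]
        lines.append(f"Hop {hop}: {code} - {city}")
    return lines


# Sample data mirroring the path result shown in the Gremlin shell section.
sample_path = [
    {"code": ["HNL"], "city": ["Honolulu"]},
    {"code": ["DFW"], "city": ["Dallas"]},
    {"code": ["HOU"], "city": ["Houston"]},
]

for line in format_hops(sample_path):
    print(line)
```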
Note that the aiogremlin example is notoriously broken; that's presumably because the package lags behind the TinkerPop version quite a bit.
With Cypher for Gremlin (openCypher), you can query JanusGraph using the Cypher query language originating from Neo4j. This repo provides a configuration that installs the required plugins.
Note that while the examples in this section work out of the box, some Java drivers will fail with serialization issues such as Encountered unregistered class ID: 65536.
This happens especially in Gremlin- or Cypher-enabled applications that do not register JanusGraph's serializers, e.g. in the IntelliJ Graph Database support plugin (see this ticket).
To get Cypher support working in those situations, you will need to "undo" the JanusGraph specifics by making the following changes.
In gremlin-server.yaml, replace
- org.janusgraph.channelizers.JanusGraphWsAndHttpChannelizer with org.apache.tinkerpop.gremlin.server.channel.WsAndHttpChannelizer, and
- org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry with org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistryV3d0.
Then, use the GraphSON v3 serializer, since both Gryo and GraphBinary will require you to register the exact types. After this, you should be good to go.
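The two class-name substitutions described above can be sketched as a small Python helper. The function name and the sample YAML fragment are hypothetical; only the four class names come from this README, and the real gremlin-server.yaml contains many more settings.

```python
# Class names taken from the substitutions described above.
REPLACEMENTS = {
    "org.janusgraph.channelizers.JanusGraphWsAndHttpChannelizer":
        "org.apache.tinkerpop.gremlin.server.channel.WsAndHttpChannelizer",
    "org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry":
        "org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistryV3d0",
}


def undo_janusgraph_specifics(config_text):
    """Return the config text with the JanusGraph-specific class names swapped out."""
    for old, new in REPLACEMENTS.items():
        config_text = config_text.replace(old, new)
    return config_text


# Hypothetical fragment for illustration; not the repository's actual file.
sample = (
    "channelizer: org.janusgraph.channelizers.JanusGraphWsAndHttpChannelizer\n"
    "ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry]\n"
)
print(undo_janusgraph_specifics(sample))
```

A plain string replacement is used rather than a YAML parser so that comments and formatting in the file stay untouched.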
To use Cypher alongside Gremlin, connect to the Gremlin console and run:
:plugin use opencypher.gremlin
g = EmptyGraph.instance().traversal(CypherTraversalSource.class).withRemote('conf/remote-airlines.properties')
Next, run your Cypher command using g.cypher():
g.cypher('MATCH (p:airport) RETURN p.desc AS name')
You can also mix and match Cypher and Gremlin:
g.cypher('MATCH (p:airport) RETURN p').select('p').by(valueMap().select('desc').project('name')).dedup()
Or only use Gremlin:
g.V().hasLabel('airport').as('p').select('p').by(valueMap().select('desc').project('name')).dedup()
In the Gremlin shell, you can also run Cypher queries directly. To do so, run
:plugin use opencypher.gremlin
:remote connect opencypher.gremlin conf/remote-objects.yaml translate gremlin
Alternatively, server-side translations can be used (note the :remote config alias g airlines command!):
:plugin use opencypher.gremlin
:remote connect opencypher.gremlin conf/remote-objects.yaml
:remote config alias g airlines
You can then run Cypher commands directly on the remote source:
:> MATCH (p:airport) RETURN p.desc AS name
Note that :remote console does not work in this case.
The Cypher EXPLAIN command can be used to inspect the equivalent Gremlin query:
gremlin> :> EXPLAIN MATCH (p:airport) RETURN p.desc AS name
==>[translation:g.V().hasLabel('airport').project('name').by(__.choose(__.values('desc'), __.values('desc'), __.constant(' cypher.null'))),options:[EXPLAIN]]