- 3.1. BLOB literals
- 3.2. property extraction
- 3.3. semantic comparison
Intelligent Graph Database (migrated from GraiphDB https://github.com/grapheco/graiphdb)
- single machine
- intelligent property graph model
- cypher plus
mvn clean install
cd packaging
mvn package -Pserver-unix-dist
or
cd packaging
mvn package -Pserver-win-dist
This command creates pandadb-server-<version>.tgz or pandadb-server-<version>.zip in the target directory.
cd packaging
mvn package -Pserver-jar
This command creates pandadb-server-all-in-one-<version>.jar in the target directory.
Visit https://github.com/grapheco/pandadb-v0.1/releases to download the pandadb-v0.1 binary distributions.
Unpack pandadb-server-<version>.zip in a local directory, e.g. /usr/local/.
cd /usr/local/pandadb-server-<version>
- bin/neo4j console: start a PandaDB server
- bin/neo4j start: start a PandaDB server silently
Once PandaDB has started successfully, output like the following is shown:
2020-08-08 04:20:51.309+0000 INFO ======== PandaDB (+Neo4j-3.5.6-BLOB) ========
______ _ _____ ______
(_____ \ | | (____ \ (____ \
_____) )___ ____ _ | | ____ _ \ \ ____) )
| ____/ _ | _ \ / || |/ _ | | | | __ (
| | ( ( | | | | ( (_| ( ( | | |__/ /| |__) )
|_| \_||_|_| |_|\____|\_||_|_____/ |______/
PandaDB Server (ver 0.1.0.20200801)
2020-08-08 04:20:51.317+0000 INFO Starting...
[12:20:51:372] DEBUG ExtendedDatabaseLifecyclePluginsService :: loading database lifecycle plugin: cn.pandadb.database.SemanticOperatorPlugin@2f74900b
[12:20:51:372] DEBUG ExtendedDatabaseLifecyclePluginsService :: loading database lifecycle plugin: org.neo4j.kernel.impl.blob.BlobStoragePlugin@27be17c8
[12:20:51:373] DEBUG ExtendedDatabaseLifecyclePluginsService :: loading database lifecycle plugin: org.neo4j.kernel.impl.blob.RegsterDefaultBlobFunctionsPlugin@2c413ffc
[12:20:51:399] INFO SemanticOperatorPlugin :: loading semantic plugins: /Users/bluejoe/IdeaProjects/pandadb-v0.1/itest/testinput/cypher-plugins.xml
[12:20:51:523] INFO BlobStorage$ :: using batch blob storage: org.neo4j.kernel.impl.blob.BlobStorage$DefaultLocalFileSystemBlobValueStorage@2ecf5915
[12:20:51:650] DEBUG ConfigurationEx :: no value set for blob.storage.file.dir, using default: /Users/bluejoe/IdeaProjects/pandadb-v0.1/itest/./testoutput/testdb/data/databases/graph.db/blob
[12:20:51:650] INFO BlobStorage$DefaultLocalFileSystemBlobValueStorage :: using storage dir: /Users/bluejoe/IdeaProjects/pandadb-v0.1/itest/testoutput/testdb/data/databases/graph.db/blob
2020-08-08 04:20:52.527+0000 INFO Bolt enabled on 0.0.0.0:7687.
2020-08-08 04:20:54.926+0000 INFO Started.
2020-08-08 04:20:56.224+0000 INFO Remote interface available at http://localhost:7474/
For details about starting a PandaDB cluster, please visit Start Instruction.
To use the AI algorithm functions, start the AIPM-Web server first; details can be found in the AIPM-Web Instructions.
Clients communicate with PandaDB via Cypher over the Bolt protocol.
- bin/cypher-shell: open a PandaDB client connected to a remote server
Also, you may visit http://localhost:7474 to browse graph data in neo4j-browser.
In neo4j-browser, users may enter Cypher commands to query PandaDB:
return <https://bluejoe2008.github.io/p4.png>
or create a new node:
create (bluejoe:Person {name: 'bluejoe', mail:'bluejoe2008@gmail.com', photo: <https://bluejoe2008.github.io/p4.png>}) return bluejoe
This command creates a Person node with a BLOB property whose content is fetched from the web URL. BLOB literals of the form <file://...> or <ftp://...> work as well.
In neo4j-browser, a BLOB property is displayed as an image icon.
NOTE: if a user/password is required, try the default values: neo4j/neo4j.
PandaDB extends the Cypher grammar; the extended language is named CypherPlus. CypherPlus allows BLOB literals in queries and supports semantic operations on properties, especially BLOB properties.
For more details about BLOB and relevant functions, please visit the Blob Introduction.
A BlobLiteral is defined in the Cypher grammar in the form:
<schema://path>
The available schemas are: file, http, https, ftp, ftps, and base64.
schema | path | example |
---|---|---|
file | path of local file on pandadb server | file://etc/profile |
http | path of file on remote web server | http://s12.sinaimg.cn/mw690/005AE7Quzy7rL8kA4Nt6b&690 |
https | path of file on remote web server | https://bluejoe2008.github.io/bluejoe3.png |
ftp | path of file on remote FTP server | |
ftps | path of file on remote FTP server | |
base64 | a BASE64-encoded string | base64://dGhpcyBpcyBhbiBleGFtcGxl (decodes to the string "this is an example") |
The following query shows how to use a BLOB literal in Cypher:
return <https://bluejoe2008.github.io/bluejoe3.png>
For more details, see https://github.com/grapheco/pandadb-v0.1/blob/master/docs/blob.md
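To make the literal form concrete, here is a minimal Scala sketch of how a client-side tool might split a BLOB literal into its schema and path and decode a base64 payload. BlobLiteralSketch, BlobLiteral, parse, and decodeBase64Text are hypothetical names chosen for illustration; they are not part of PandaDB, and the server's actual parser may behave differently.

```scala
import java.nio.charset.StandardCharsets
import java.util.Base64

// Hypothetical helper for illustration only; not PandaDB's own parser.
object BlobLiteralSketch {
  final case class BlobLiteral(schema: String, path: String)

  // Schemas listed in the table above.
  val supportedSchemas: Set[String] =
    Set("file", "http", "https", "ftp", "ftps", "base64")

  // Matches the whole text against <schema://path>.
  private val Literal = """<([A-Za-z0-9]+)://(.*)>""".r

  // Returns the parsed literal, or None if the text is not a supported BLOB literal.
  def parse(text: String): Option[BlobLiteral] = text.trim match {
    case Literal(schema, path) if supportedSchemas.contains(schema.toLowerCase) =>
      Some(BlobLiteral(schema.toLowerCase, path))
    case _ => None
  }

  // Decodes the payload of a base64 literal as UTF-8 text; None for other schemas.
  def decodeBase64Text(literal: BlobLiteral): Option[String] =
    if (literal.schema == "base64")
      Some(new String(Base64.getDecoder.decode(literal.path), StandardCharsets.UTF_8))
    else None
}
```

For instance, parsing <base64://dGhpcyBpcyBhbiBleGFtcGxl> and decoding the result yields the text "this is an example", matching the table entry.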
neo4j@<default_database>> match (n {name:'bluejoe'}) return n.photo->mime, n.car->width;
+------------------------------+
| n.photo->mime | n.car->width |
+------------------------------+
| "image/png" | 640 |
+------------------------------+
Retrieving the plate number of the car:
neo4j@<default_database>> match (n {name:'bluejoe'}) return n.car->plateNumber;
+--------------------+
| n.car->plateNumber |
+--------------------+
| "苏B56789" |
+--------------------+
NOTE: some semantic operations require an AIPM service at 10.0.86.128 (this address can be changed in neo4j.conf). If the service is unavailable, an exception is thrown:
neo4j@<default_database>> match (n {name:'bluejoe'}) return n.car->plateNumber;
Failed connect to http://10.0.86.128:8081
CypherPlus allows semantic comparison of two properties.
The following example query compares pairs of text strings:
neo4j@<default_database>> return 'abc' :: 'abcd', 'abc' ::jaccard 'abcd', 'abc' ::jaro 'abcd', 'hello world' ::cosine 'bye world';
+--------------------------------------------------------------------------------------------------------+
| 'abc' :: 'abcd' | 'abc' ::jaccard 'abcd' | 'abc' ::jaro 'abcd' | 'hello world' ::cosine 'bye world' |
+--------------------------------------------------------------------------------------------------------+
| 0.9416666805744172 | 0.5 | 0.9416666805744172 | 0.5039526306789696 |
+--------------------------------------------------------------------------------------------------------+
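As an illustration of what a text similarity score such as ::jaccard measures, the sketch below computes Jaccard similarity over sets of 3-character shingles. The names JaccardSketch, shingles, and jaccard are hypothetical, and the shingle size of 3 is an assumption: it happens to reproduce the 0.5 shown above for 'abc' and 'abcd', but the source does not state how PandaDB computes the value.

```scala
// Hypothetical illustration; PandaDB's actual ::jaccard implementation may differ.
object JaccardSketch {
  // Splits a string into its set of overlapping k-character shingles.
  def shingles(s: String, k: Int): Set[String] =
    if (s.isEmpty) Set.empty else s.sliding(k).toSet

  // Jaccard similarity: |A intersect B| / |A union B| over the two shingle sets.
  // Two empty strings are treated as identical (similarity 1.0).
  def jaccard(a: String, b: String, k: Int = 3): Double = {
    val sa = shingles(a, k)
    val sb = shingles(b, k)
    val union = sa union sb
    if (union.isEmpty) 1.0
    else (sa intersect sb).size.toDouble / union.size
  }
}
```

With k = 3, 'abc' has the single shingle "abc" and 'abcd' has "abc" and "bcd", so one of two distinct shingles is shared and the score is 0.5.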
A useful application is to determine whether a person appears in another photo:
return <http://s12.sinaimg.cn/mw690/005AE7Quzy7rL8kA4Nt6b&690> ~:0.5 <http://s15.sinaimg.cn/mw690/005AE7Quzy7rL8j2jlIee&690>
Import the pandadb:connector dependency first:
<dependency>
<groupId>pandadb</groupId>
<artifactId>connector</artifactId>
<version>0.1.0-SNAPSHOT</version>
</dependency>
Use GraphDatabase.driver() to connect to a remote PandaDB server, just like using Neo4j:
val _driver = GraphDatabase.driver(url, AuthTokens.basic(user, pass));
val session = _driver.session()
val result = session...
session.close();
...
An alternative way is to use RemotePandaServer.connect():
- def connect(url: String, user: String = "", pass: String = ""): CypherService
It returns a CypherService, which provides the following methods:
- def queryObjects[T: ClassTag](queryString: String, fnMap: (Record => T)): Iterator[T]
- def queryObjects[T: ClassTag](queryString: String, params: Map[String, AnyRef], fnMap: (Record => T)): Iterator[T]
- def execute[T](f: (Session) => T): T;
- def executeQuery[T](queryString: String, fn: (StatementResult => T)): T;
- def executeQuery[T](queryString: String, params: Map[String, AnyRef], fn: (StatementResult => T)): T;
- def executeUpdate(queryString: String);
- def executeUpdate(queryString: String, params: Map[String, AnyRef])
- def querySingleObject[T](queryString: String, fnMap: (Record => T)): T
- def querySingleObject[T](queryString: String, params: Map[String, AnyRef], fnMap: (Record => T)): T
A simple example:
val conn = RemotePandaServer.connect("bolt://localhost:7687", "neo4j", "123");
val (node, name, age, photo) = conn.querySingleObject("match (n) where n.name='bob' return n, n.name, n.age, n.photo", (result: Record) => {
(result.get("n").asNode(), result.get("n.name").asString(), result.get("n.age").asInt(), result.get("n.photo").asBlob())
});
For more example code, see https://github.com/grapheco/pandadb-v0.1/blob/master/itest/src/test/scala/CypherServiceTest.scala
Import the pandadb:database dependency first:
<dependency>
<groupId>pandadb</groupId>
<artifactId>database</artifactId>
<version>0.1.0-SNAPSHOT</version>
</dependency>
Use GraphDatabaseFactory to open a local PandaDB database, just like using embedded Neo4j:
val builder = new GraphDatabaseFactory().newEmbeddedDatabaseBuilder(dbDir);
...
val db = builder.newGraphDatabase();
val tx = db.beginTx();
...
An alternative way is to use the PandaDB object:
- def openDatabase(dbDir: File, propertiesFile: File): GraphDatabaseService
An example of openDatabase:
val db = PandaDB.openDatabase(new File("./testdb"), new File("./neo4j.conf"));
val tx = db.beginTx();
//create a node
val node1 = db.createNode();
node1.setProperty("name", "bob");
node1.setProperty("age", 40);
//with a blob property
node1.setProperty("photo", Blob.fromFile(new File("./testdata/test.png")));
...
If you are used to CypherService, you may try the method LocalGraphService.connect():
val db = PandaDB.openDatabase(new File("./testdb"), new File("./neo4j.conf"));
val conn = LocalGraphService.connect(db);
//a non-blob
val (node, name, age) = conn.querySingleObject("match (n) where n.name='bob' return n, n.name, n.age", (result: Record) => {
(result.get("n").asNode(), result.get("n.name").asString(), result.get("n.age").asInt())
});
LocalGraphService.connect() also returns a CypherService, just like RemotePandaServer.connect().
For more example code, see https://github.com/grapheco/pandadb-v0.1/blob/master/itest/src/test/scala/CypherServiceTest.scala
The configuration file specifies the addresses, listening ports, and other required settings of each node, the BlobValueManager, and AIPM in the distributed environment. The following configuration sets up a PandaDB cluster of three nodes; hostnames stand in for IP addresses.
cn.pandadb.jraft.enabled=false
cn.pandadb.jraft.server.id=localhost:8081
cn.pandadb.jraft.server.group.id=panda
cn.pandadb.jraft.server.snapshot.enable=true
cn.pandadb.jraft.server.snapshot.interval.seconds=10
cn.pandadb.jraft.server.peers=host1:8081,host2:8082,host3:8083
dbms.security.auth_enabled=false
dbms.connector.bolt.listen_address=:7610
dbms.connector.http.listen_address=:7410
dbms.connector.https.listen_address=:7510
costore.factory=cn.pandadb.costore.InElasticSearchPropertyNodeStoreFactory
costore.enable=false
costore.es.host=es
costore.es.port=9200
costore.es.schema=http
costore.es.scroll.size=1000
costore.es.scroll.time.minutes=10
costore.es.index=test-costore
costore.es.type=nodes
# replace <aipm-url> with actual AIPM URL
aipm.http.host.url=<aipm-url>
blob.storage=org.neo4j.kernel.impl.blob.HBaseBlobValueStorage
blob.storage.hbase.zookeeper.quorum=http://hostname:2181
blob.storage.hbase.auto_create_table=true
blob.storage.hbase.table=PANDADB_BLOB
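When checking a cluster configuration before deployment, it can help to read the peer list programmatically. The sketch below loads properties-style text with java.util.Properties and splits cn.pandadb.jraft.server.peers into host/port pairs. ClusterConfigSketch, Peer, parsePeers, and peersFrom are hypothetical names for illustration only; treating the file as plain Java properties is an assumption, and PandaDB's own configuration loader may differ.

```scala
import java.io.StringReader
import java.util.Properties

// Hypothetical helper for illustration only; not part of PandaDB.
object ClusterConfigSketch {
  final case class Peer(host: String, port: Int)

  // Parses a comma-separated "host:port" list such as the jraft peers setting.
  def parsePeers(value: String): Seq[Peer] =
    value.split(",").toSeq.map(_.trim).filter(_.nonEmpty).map { entry =>
      val idx = entry.lastIndexOf(':')
      require(idx > 0, s"expected host:port but got '$entry'")
      Peer(entry.substring(0, idx), entry.substring(idx + 1).toInt)
    }

  // Loads properties-style text and returns the configured peers (empty if unset).
  def peersFrom(configText: String): Seq[Peer] = {
    val props = new Properties()
    props.load(new StringReader(configText))
    parsePeers(props.getProperty("cn.pandadb.jraft.server.peers", ""))
  }
}
```

Applied to the configuration above, peersFrom would return the three peers host1:8081, host2:8082, and host3:8083.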
PandaDB v0.1 is an open source product licensed under GPLv3.
We provide multiple channels to connect you with the community of PandaDB developers, users, and graph researchers in general:
- Our Slack channel
- Mailing list
Please review the Contributing Guide and CodeSpec for information on how to get started contributing to the project.