Toolset to produce SPASE RDF and explore the resulting Knowledge Graph.
The SPASE Knowledge Graph is composed of two main parts:
- The SPASE ontology, an OWL ontology generated automatically from the SPASE Base Model XSD file available here. The generation algorithm turns every entity in the SPASE XSD file into an OWL class, maps every relationship between entities to an owl:ObjectProperty, and maps every literal property of an entity to an owl:DatatypeProperty. Every property is assigned its corresponding domain and range.
- SPASE RDF individuals data, an automatically generated TTL file containing RDF that represents the SPASE resources described in the XML files provided by HPDE. This RDF complies with the SPASE ontology.
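The XSD-to-OWL mapping described above can be illustrated with a small sketch. This is not the repository's generator: the schema fragment, the `spase:` prefix, the lower-camel-case property naming, and the function name `xsd_to_owl_triples` are assumptions made for illustration, and the sketch only handles named complex types whose child elements carry an explicit `type` attribute.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Made-up schema fragment for illustration; NOT the real SPASE Base Model XSD.
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="Person">
    <xs:sequence>
      <xs:element name="PersonName" type="xs:string"/>
      <xs:element name="Affiliation" type="Organization"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="Organization">
    <xs:sequence>
      <xs:element name="OrganizationName" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>"""


def xsd_to_owl_triples(xsd_text, prefix="spase:"):
    """Return (subject, predicate, object) triples for a small XSD fragment."""
    root = ET.fromstring(xsd_text)
    named_types = [ct for ct in root.iter(XS + "complexType") if ct.get("name")]
    entity_names = {ct.get("name") for ct in named_types}
    triples = []
    for ct in named_types:
        cls = prefix + ct.get("name")
        # Every entity becomes an OWL class.
        triples.append((cls, "rdf:type", "owl:Class"))
        for el in ct.iter(XS + "element"):
            name, typ = el.get("name"), el.get("type")
            if not name or not typ:
                continue  # element refs and anonymous types are out of scope here
            prop = prefix + name[0].lower() + name[1:]  # hypothetical naming scheme
            if typ in entity_names:
                # Relationship between two entities.
                kind, rng = "owl:ObjectProperty", prefix + typ
            else:
                # Literal-valued property.
                kind, rng = "owl:DatatypeProperty", "xsd:" + typ.split(":")[-1]
            triples += [
                (prop, "rdf:type", kind),
                (prop, "rdfs:domain", cls),
                (prop, "rdfs:range", rng),
            ]
    return triples


for triple in xsd_to_owl_triples(SAMPLE_XSD):
    print(" ".join(triple))
```

In this sketch a property is an object property exactly when its declared type is another named complex type in the same schema; anything else is treated as a literal.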
- Clone this repo and its submodules:
git clone --recurse-submodules -j8 git@github.com:polyneme/topst-spase-rdf-tools.git
cd topst-spase-rdf-tools
Decompress the pre-processed data under topst-spase-rdf-tools/data/spase.ttl.zip:
cd data
unzip spase.ttl.zip
cd ..
docker compose build
docker compose up
Open:
- Graph Explorer: https://localhost/explorer
- Jupyter notebook: check the docker-compose output for a link like http://127.0.0.1:8888/?token=<token>, open that link in your browser, and then open the SPASE RDF Exploration.ipynb file.
- Fuseki: http://localhost:3030
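Finding the tokenised Jupyter link in a long log can be tedious, so here is a minimal sketch that pulls it out of captured log text. The function name `find_jupyter_url` and the sample log line are made up for illustration; the only thing taken from the text above is the shape of the link, http://127.0.0.1:8888/?token=<token>.

```python
import re

# Matches links shaped like http://127.0.0.1:8888/?token=<token>,
# optionally with a path segment before the query string.
TOKEN_URL = re.compile(r"http://127\.0\.0\.1:8888/\S*\?token=\w+")


def find_jupyter_url(log_text):
    """Return the first tokenised Jupyter link found in log_text, or None."""
    match = TOKEN_URL.search(log_text)
    return match.group(0) if match else None


# Made-up log line for illustration; real output will look different.
sample_log = "jupyter-1  |     or http://127.0.0.1:8888/?token=abc123def456\n"
print(find_jupyter_url(sample_log))
```

The log text could come from a file or from the captured output of the running containers; how it is captured is left open here.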
- Java
- Docker (ensure the Docker memory limit is well above 4 GB; more than 16 GB is recommended)
- Python 3.8+
- Install and run Fuseki:
- Download the latest version of Jena Fuseki from here. You can use:
cd ~/Applications/
curl https://archive.apache.org/dist/jena/binaries/apache-jena-fuseki-<fuseki_version>.zip -o apache-jena-fuseki-<fuseki_version>.zip
Replace <fuseki_version> with the latest version available, for example:
curl https://archive.apache.org/dist/jena/binaries/apache-jena-fuseki-4.9.0.zip -o apache-jena-fuseki-4.9.0.zip
- Unzip the Jena Fuseki package:
unzip apache-jena-fuseki-4.9.0.zip
- Run the Fuseki server:
cd apache-jena-fuseki-4.9.0
./fuseki-server
- Open your browser to check that Fuseki is up and running: http://localhost:3030
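The download and health-check steps can also be scripted. The sketch below builds the archive URL for a given Fuseki version and probes a running server. The function names are hypothetical, and the probe assumes the server exposes Fuseki's `/$/ping` administration endpoint at the default address without authentication, which may not hold for every configuration.

```python
import urllib.error
import urllib.request


def fuseki_archive_url(version):
    """Build the Apache archive URL for a given Fuseki version string."""
    return (
        "https://archive.apache.org/dist/jena/binaries/"
        f"apache-jena-fuseki-{version}.zip"
    )


def fuseki_is_up(base_url="http://localhost:3030", timeout=2.0):
    """Return True if the server answers the ping endpoint with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/$/ping", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


print(fuseki_archive_url("4.9.0"))
print("Fuseki reachable:", fuseki_is_up())
```

If the ping endpoint is unavailable in your setup, requesting the base URL itself is a reasonable fallback check.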
- Load the RDF data into a Fuseki dataset:
- Create a dataset by opening Fuseki in a browser and clicking on the add one link.
- Name the dataset spase, select the dataset type (choose persistent if you plan to reuse this dataset in future runs), and then click create dataset.
- Upload the pre-processed data: click on add data > select files, select the spase.owl and spase.ttl files under data, and then click on upload all.
- Your new Fuseki dataset should be available under http://localhost:3030/spase
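Once the dataset is loaded, a quick query is a useful sanity check. The following sketch sends a SPARQL SELECT to the dataset URL using only the standard library. It assumes Fuseki accepts SPARQL Protocol requests directly at http://localhost:3030/spase (some configurations expose queries at a service path such as `/spase/sparql` instead); the function names and the example query are illustrative, not part of this repo.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://localhost:3030/spase"  # dataset URL from the step above

# Illustrative query: the ten classes with the most instances.
CLASS_COUNTS = """
SELECT ?class (COUNT(?s) AS ?n)
WHERE { ?s a ?class }
GROUP BY ?class
ORDER BY DESC(?n)
LIMIT 10
"""


def build_query_request(query, endpoint=ENDPOINT):
    """Build a SPARQL Protocol POST request with a form-encoded query."""
    data = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=data,
        headers={
            "Accept": "application/sparql-results+json",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def run_select(query, endpoint=ENDPOINT, timeout=30):
    """Run a SELECT query and return rows as {variable: value} dicts."""
    request = build_query_request(query, endpoint)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        results = json.load(resp)
    return [
        {var: cell["value"] for var, cell in row.items()}
        for row in results["results"]["bindings"]
    ]


if __name__ == "__main__":
    try:
        for row in run_select(CLASS_COUNTS):
            print(row)
    except OSError as exc:  # server not running or endpoint not reachable
        print("Query failed:", exc)
```

A non-empty result indicates that the upload succeeded and that the endpoint is ready for the notebook and graph-explorer steps.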
- Install and run the RDF Exploration Jupyter notebook:
- Go to this repo directory:
cd ~/git/spase-rdf-tools/ # replace with the right location
- Get into the python package for the RDF Tools:
cd spase_rdf_tools
- [Optional] Create and activate a virtual environment:
python3 -m venv venv
source venv/bin/activate
- Install python requirements:
pip install -r requirements.txt
- Set up the Jupyter extensions for KG exploration:
jupyter nbextension enable --py --sys-prefix graph_notebook.widgets
python -m graph_notebook.static_resources.install
python -m graph_notebook.nbextensions.install
python -m graph_notebook.ipython_profile.configure_ipython_profile
- Run the Jupyter notebook with the extensions:
jupyter notebook --NotebookApp.kernel_manager_class=notebook.services.kernels.kernelmanager.AsyncMappingKernelManager --ip 0.0.0.0 ./
- Open the Jupyter notebook URL and navigate to the SPASE RDF Exploration notebook.
- For more information on the graph-notebook please check their repository.
- Install and run graph-explorer:
- Clone the graph-explorer repo (a copy of the repo is included here as a git submodule):
git clone https://github.com/aws/graph-explorer/
- Navigate to the graph-explorer directory:
cd graph-explorer
- Build the Docker image:
docker build -t graph-explorer .
- Start the Docker container:
docker run -p 80:80 -p 443:443 --env HOST=localhost graph-explorer
- Go to graph-explorer in your browser by opening https://localhost/explorer (click on Advanced > Proceed to localhost if prompted).
- Add a new connection by clicking on the plus sign in the top right corner, name the connection spase, choose RDF (Resource Description Framework) - SPARQL as the graph type, and set the endpoint value to http://localhost:3030/spase.
- Synchronise your connection and navigate your graph.
- For more information on graph-explorer, please check their repository.
- Python 3.9.0+
- Install python dependencies:
pip install -r requirements.txt
python3 spase_rdf_tools.py --help
Usage: spase_rdf_tools.py [OPTIONS] COMMAND [ARGS]...
Options:
--help Show this message and exit.
Commands:
create-owl Creates OWL Ontology using python module
create-python-model Creates Python model from XSD file using xsdata
download-hpde Downloads and decompress HPDE files from GitHub...
download-spase-schema Downloads SPASE XSD schema file from spase-group...
generate-rdf Creates TTL RDf File using python module to lo...
This is also available as a Jupyter notebook under spase_rdf_tools/SPASE RDF Generation.ipynb.
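The help output above lists the commands but not their options. A minimal sketch for discovering them is to ask each command for its own help text. It assumes that every command accepts `--help`, which the usage format suggests but the output above does not confirm, and that the script is run from the directory containing spase_rdf_tools.py. The helper names and the order of the list are illustrative and do not imply a required execution order.

```python
import subprocess
import sys

# Command names copied from the help output above.
COMMANDS = [
    "download-spase-schema",
    "create-python-model",
    "create-owl",
    "download-hpde",
    "generate-rdf",
]


def help_argv(command, script="spase_rdf_tools.py"):
    """Build the argument list that asks one command for its help text."""
    return [sys.executable, script, command, "--help"]


def show_help_for_all(script="spase_rdf_tools.py"):
    """Print the help text of every command; stops at the first failure."""
    for command in COMMANDS:
        subprocess.run(help_argv(command, script), check=True)


if __name__ == "__main__":
    for command in COMMANDS:
        # Show the command line that would be run for each command.
        print(" ".join(help_argv(command)))
```

Calling `show_help_for_all()` from the spase_rdf_tools directory runs the commands shown by the loop.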