Parallel session support #41

Open
stuartgreen4j opened this issue Oct 15, 2024 · 0 comments

Currently, only a single store session can write to the Neo4j database at any one time. This is a major obstacle when loading large triple stores (>1 TB) into Neo4j.

Neo4j builds a graph and has to perform a MERGE for every triple: it looks up the node to check whether it exists, then attaches the property or relationship to it. Because the triples in an RDF file arrive in no particular order, the Neo4j transaction must lock the respective node labels and relationships in order to maintain ACID compliance. This prevents parallel import, because the process depends on the Resource label that is shared by all nodes in the graph.

Investigations are ongoing into the use of multiple Resource labels, one to cover each RDF concept. This may allow the RDF data load to be parallelised across separate subgraphs that can then be knitted together via node merging.
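
To make the idea concrete, here is a rough sketch of how triples might be split into per-concept batches before loading. It is not how the library works today, only an illustration under some assumptions: the "concept" of a resource is taken to be its first-seen `rdf:type`, triples are plain `(subject, predicate, object)` string tuples, and the names `partition_by_concept` and `"untyped"` are hypothetical. Triples that link resources of different concepts are set aside for the later knitting step.

```python
from collections import defaultdict

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def partition_by_concept(triples):
    """Split triples into per-concept batches plus deferred cross-concept links.

    Hypothetical helper: assumes each subject's concept is its first-seen
    rdf:type, and that subjects without a type go into an "untyped" batch.
    """
    triples = list(triples)

    # First pass: record the concept (rdf:type) of every typed subject.
    concept_of = {}
    for s, p, o in triples:
        if p == RDF_TYPE:
            concept_of.setdefault(s, o)

    # Second pass: route each triple to its subject's batch, unless it
    # points at a resource that belongs to a different concept.
    batches = defaultdict(list)
    deferred = []
    for s, p, o in triples:
        concept = concept_of.get(s, "untyped")
        if p != RDF_TYPE and o in concept_of and concept_of[o] != concept:
            deferred.append((s, p, o))  # relationship crosses subgraphs
        else:
            batches[concept].append((s, p, o))
    return dict(batches), deferred


triples = [
    ("ex:alice", RDF_TYPE, "ex:Person"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:acme", RDF_TYPE, "ex:Company"),
    ("ex:alice", "ex:worksFor", "ex:acme"),
    ("ex:bob", RDF_TYPE, "ex:Person"),
    ("ex:alice", "ex:knows", "ex:bob"),
]
batches, deferred = partition_by_concept(triples)
```

In this sketch each batch could, in principle, be written by its own session under its own label, with the `deferred` triples applied once all batches are loaded. Whether that actually avoids the lock contention described above is exactly what still needs to be investigated.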
