Merge pull request #4692 from ntisseyre/inline_index_props
Inlining vertex properties into a CompositeIndex structure
ntisseyre authored Oct 27, 2024
2 parents 872a475 + 5aa68f5 commit 213b754
Showing 31 changed files with 1,003 additions and 113 deletions.
17 changes: 17 additions & 0 deletions docs/changelog.md
@@ -98,6 +98,23 @@ For more information on features and bug fixes in 1.1.0, see the GitHub mileston
* [JanusGraph zip](https://github.com/JanusGraph/janusgraph/releases/download/v1.1.0/janusgraph-1.1.0.zip)
* [JanusGraph zip with embedded Cassandra and ElasticSearch](https://github.com/JanusGraph/janusgraph/releases/download/v1.1.0/janusgraph-full-1.1.0.zip)

##### Upgrade Instructions

##### Inlining vertex properties into a Composite Index

Inlining vertex properties into a Composite Index structure can offer significant performance and efficiency benefits.
See the [documentation](./schema/index-management/index-performance.md#inlining-vertex-properties-into-a-composite-index) for how to inline vertex properties into a composite index.

**Important Notes on Compatibility**

1. **Backward Incompatibility**
Once a JanusGraph instance adopts this new schema feature, it cannot be rolled back to a prior version of JanusGraph.
The changes in the schema structure are not compatible with earlier versions of the system.

2. **Migration Considerations**
It is critical that users carefully plan their migration to this new version, as there is no automated or manual rollback process
to revert to an older version of JanusGraph once this feature is used.

### Version 1.0.1 (Release Date: ???)

/// tab | Maven
38 changes: 38 additions & 0 deletions docs/schema/index-management/index-performance.md
@@ -323,6 +323,44 @@ index with label restriction is defined as unique, the uniqueness
constraint only applies to properties on vertices or edges for the
specified label.

### Inlining vertex properties into a Composite Index

Inlining vertex properties into a Composite Index structure can offer significant performance and efficiency benefits.

1. **Performance Improvements**
Faster querying: inlining vertex properties directly in the index lets a query read the inlined values from the index entry itself.
Queries then don't need additional calls to the data store to fetch those properties, which reduces lookup time.

2. **Data Locality**
In distributed storage backends, inlined properties keep more of the data a query needs within a single partition or shard.
This reduces cross-node network calls and improves overall query performance by keeping data local to the request being processed.

3. **Cost of Indexing vs. Storage Trade-off**
Inlining properties increases the size of the index and therefore its storage requirements,
but this is often a worthwhile trade-off, particularly when query speed is critical.
This is a typical pattern in systems optimized for read-heavy workloads.
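
The read-path saving can be illustrated with a small toy model. This is only a sketch under stated assumptions: `InlineIndexSketch`, `IndexEntry`, and `namesInCity` are hypothetical names invented for illustration, and the maps stand in for the index and the vertex store; none of it mirrors JanusGraph's actual storage layout. It counts how many extra reads hit the vertex store after an index lookup, with and without the `name` value copied into the index entry.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: every class and method here is hypothetical and does not
// mirror JanusGraph's actual index or storage layout.
final class InlineIndexSketch {

    /** One index entry: the matching vertex id plus property values copied into the index. */
    static final class IndexEntry {
        final long vertexId;
        final Map<String, String> inlined;

        IndexEntry(long vertexId, Map<String, String> inlined) {
            this.vertexId = vertexId;
            this.inlined = inlined;
        }
    }

    private final boolean inlineName;
    private final Map<String, List<IndexEntry>> indexByCity = new HashMap<>();
    private final Map<Long, Map<String, String>> vertexStore = new HashMap<>();
    private long nextVertexId = 1;

    /** Number of reads that had to go to the vertex store after the index lookup. */
    int vertexStoreReads = 0;

    InlineIndexSketch(boolean inlineName) {
        this.inlineName = inlineName;
    }

    void addVertex(String name, String city) {
        long vertexId = nextVertexId++;
        Map<String, String> properties = new HashMap<>();
        properties.put("name", name);
        properties.put("city", city);
        vertexStore.put(vertexId, properties);

        Map<String, String> inlined = new HashMap<>();
        if (inlineName) {
            inlined.put("name", name); // copy the value into the index entry
        }
        indexByCity.computeIfAbsent(city, c -> new ArrayList<>())
                   .add(new IndexEntry(vertexId, inlined));
    }

    /** Returns the names of up to {@code limit} vertices indexed under the given city. */
    List<String> namesInCity(String city, int limit) {
        List<String> names = new ArrayList<>();
        for (IndexEntry entry : indexByCity.getOrDefault(city, new ArrayList<>())) {
            if (names.size() == limit) {
                break;
            }
            String name = entry.inlined.get("name");
            if (name == null) {
                // Not inlined: one extra read against the vertex store per result.
                vertexStoreReads++;
                name = vertexStore.get(entry.vertexId).get("name");
            }
            names.add(name);
        }
        return names;
    }

    public static void main(String[] args) {
        for (boolean inline : new boolean[] {false, true}) {
            InlineIndexSketch graph = new InlineIndexSketch(inline);
            for (int i = 0; i < 10; i++) {
                graph.addVertex("name_" + i, "Chicago");
            }
            List<String> names = graph.namesInCity("Chicago", 3);
            System.out.println("inline=" + inline + " results=" + names.size()
                    + " vertexStoreReads=" + graph.vertexStoreReads);
        }
    }
}
```

In this model the extra reads grow with the number of results when the value is not inlined and stay at zero when it is, while the index entries themselves become larger. That is the storage-for-speed trade described in the list above.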

#### Usage
To take advantage of inlined properties, start the JanusGraph transaction with `.propertyPrefetching(false)`.

Example:

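```groovy
// Define the property keys (assumes `graph` is an open JanusGraph instance)
mgmt = graph.openManagement()
idKey = mgmt.makePropertyKey("id").dataType(Integer.class).make()
nameKey = mgmt.makePropertyKey("name").dataType(String.class).make()

// Build the index: "id" is the indexed key, "name" is inlined into the index
mgmt.buildIndex("composite", Vertex.class)
    .addKey(idKey)
    .addInlinePropertyKey(nameKey)
    .buildCompositeIndex()
mgmt.commit()

// Query with property prefetching disabled
tx = graph.buildTransaction()
    .propertyPrefetching(false) // this is important
    .start()
tx.traversal().V().has("id", 100).next().value("name")
```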
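
One plausible reason the prefetching flag matters can be sketched with a toy model. The assumption here, based on how the `query.fast-property` option is documented, is that prefetching loads all of a vertex's properties from the backend on the first property access, so the read goes to the backend even when the requested value was already delivered by the index. `PrefetchSketch` and its fields are hypothetical and do not reproduce JanusGraph's implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: names are hypothetical and this is not JanusGraph's implementation.
final class PrefetchSketch {
    private final Map<String, String> storedProperties = new HashMap<>(); // backend row
    private final Map<String, String> vertexCache = new HashMap<>();      // per-transaction cache
    private final boolean propertyPrefetching;
    private boolean allPropertiesLoaded = false;

    /** Number of simulated backend reads. */
    int backendReads = 0;

    PrefetchSketch(boolean propertyPrefetching) {
        this.propertyPrefetching = propertyPrefetching;
        storedProperties.put("name", "Mizar");
        storedProperties.put("city", "Chicago");
        // The index lookup already delivered the inlined "name" value.
        vertexCache.put("name", "Mizar");
    }

    String value(String key) {
        if (propertyPrefetching) {
            // Assumed behaviour: the first property access loads every property at once.
            if (!allPropertiesLoaded) {
                backendReads++;
                vertexCache.putAll(storedProperties);
                allPropertiesLoaded = true;
            }
            return vertexCache.get(key);
        }
        if (!vertexCache.containsKey(key)) {
            // Only properties missing from the cache cost a backend read.
            backendReads++;
            vertexCache.put(key, storedProperties.get(key));
        }
        return vertexCache.get(key);
    }

    public static void main(String[] args) {
        for (boolean prefetch : new boolean[] {true, false}) {
            PrefetchSketch vertex = new PrefetchSketch(prefetch);
            String name = vertex.value("name");
            System.out.println("prefetching=" + prefetch + " name=" + name
                    + " backendReads=" + vertex.backendReads);
        }
    }
}
```

Under this assumption, reading the inlined `name` costs one backend read with prefetching enabled and none with it disabled, while a non-inlined property such as `city` still needs a backend read either way.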

### Composite versus Mixed Indexes

1. Use a composite index for exact match index retrievals. Composite
@@ -70,6 +70,7 @@
import org.janusgraph.diskstorage.indexing.IndexInformation;
import org.janusgraph.diskstorage.indexing.IndexProvider;
import org.janusgraph.diskstorage.indexing.IndexTransaction;
import org.janusgraph.diskstorage.keycolumnvalue.scan.ScanJobFuture;
import org.janusgraph.diskstorage.log.kcvs.KCVSLog;
import org.janusgraph.diskstorage.util.time.TimestampProvider;
import org.janusgraph.example.GraphOfTheGodsFactory;
@@ -83,11 +84,14 @@
import org.janusgraph.graphdb.internal.ElementCategory;
import org.janusgraph.graphdb.internal.ElementLifeCycle;
import org.janusgraph.graphdb.internal.Order;
import org.janusgraph.graphdb.internal.RelationCategory;
import org.janusgraph.graphdb.log.StandardTransactionLogProcessor;
import org.janusgraph.graphdb.query.index.ApproximateIndexSelectionStrategy;
import org.janusgraph.graphdb.query.index.BruteForceIndexSelectionStrategy;
import org.janusgraph.graphdb.query.index.ThresholdBasedIndexSelectionStrategy;
import org.janusgraph.graphdb.query.profile.QueryProfiler;
import org.janusgraph.graphdb.query.vertex.BaseVertexCentricQuery;
import org.janusgraph.graphdb.query.vertex.VertexCentricQueryBuilder;
import org.janusgraph.graphdb.tinkerpop.optimize.step.JanusGraphMixedIndexAggStep;
import org.janusgraph.graphdb.tinkerpop.optimize.step.JanusGraphStep;
import org.janusgraph.graphdb.tinkerpop.optimize.strategy.JanusGraphMixedIndexCountStrategy;
@@ -106,6 +110,8 @@
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.time.Duration;
import java.time.Instant;
import java.time.temporal.ChronoUnit;
@@ -1445,6 +1451,199 @@ public void testCompositeVsMixedIndexing() {
        assertTrue(tx.traversal().V().has("intId2", 234).hasNext());
    }

    @Test
    public void testIndexInlineProperties() throws NoSuchMethodException {

        clopen(option(FORCE_INDEX_USAGE), true);

        final PropertyKey idKey = makeKey("id", Integer.class);
        final PropertyKey nameKey = makeKey("name", String.class);
        final PropertyKey cityKey = makeKey("city", String.class);

        mgmt.buildIndex("composite", Vertex.class)
            .addKey(idKey)
            .addInlinePropertyKey(nameKey)
            .buildCompositeIndex();

        finishSchema();

        String name = "Mizar";
        String city = "Chicago";
        tx.addVertex("id", 100, "name", name, "city", city);
        tx.commit();

        tx = graph.buildTransaction()
            .propertyPrefetching(false) //this is important
            .start();

        Method m = VertexCentricQueryBuilder.class.getSuperclass().getDeclaredMethod("constructQuery", RelationCategory.class);
        m.setAccessible(true);

        CacheVertex v = (CacheVertex) (tx.traversal().V().has("id", 100).next());

        verifyPropertyLoaded(v, "name", true, m);
        verifyPropertyLoaded(v, "city", false, m);

        assertEquals(name, v.value("name"));
        assertEquals(city, v.value("city"));
    }

    @Test
    public void testIndexInlinePropertiesReindex() throws NoSuchMethodException, InterruptedException {
        clopen(option(FORCE_INDEX_USAGE), true);

        PropertyKey idKey = makeKey("id", Integer.class);
        PropertyKey nameKey = makeKey("name", String.class);
        PropertyKey cityKey = makeKey("city", String.class);

        mgmt.buildIndex("composite", Vertex.class)
            .addKey(cityKey)
            .buildCompositeIndex();

        finishSchema();

        String city = "Chicago";
        for (int i = 0; i < 3; i++) {
            tx.addVertex("id", i, "name", "name" + i, "city", city);
        }

        tx.commit();

        tx = graph.buildTransaction()
            .propertyPrefetching(false) //this is important
            .start();

        Method m = VertexCentricQueryBuilder.class.getSuperclass().getDeclaredMethod("constructQuery", RelationCategory.class);
        m.setAccessible(true);

        List<Vertex> vertices = tx.traversal().V().has("city", city).toList();
        vertices.stream()
            .map(v -> (CacheVertex) v)
            .forEach(v -> verifyPropertyLoaded(v, "name", false, m));

        tx.commit();

        //Include inlined property
        JanusGraphIndex index = mgmt.getGraphIndex("composite");
        nameKey = mgmt.getPropertyKey("name");
        mgmt.addInlinePropertyKey(index, nameKey);
        finishSchema();

        //Reindex
        index = mgmt.getGraphIndex("composite");
        ScanJobFuture scanJobFuture = mgmt.updateIndex(index, SchemaAction.REINDEX);
        finishSchema();

        while (!scanJobFuture.isDone()) {
            Thread.sleep(1000);
        }

        //Try query now
        tx = graph.buildTransaction()
            .propertyPrefetching(false) //this is important
            .start();

        List<Vertex> vertices2 = tx.traversal().V().has("city", city).toList();
        vertices2.stream()
            .map(v -> (CacheVertex) v)
            .forEach(v -> verifyPropertyLoaded(v, "name", true, m));

        tx.commit();
    }

    @Test
    public void testIndexInlinePropertiesUpdate() {

        clopen(option(FORCE_INDEX_USAGE), true);

        final PropertyKey idKey = makeKey("id", Integer.class);
        final PropertyKey nameKey = makeKey("name", String.class);
        final PropertyKey cityKey = makeKey("city", String.class);

        mgmt.buildIndex("composite", Vertex.class)
            .addKey(idKey)
            .addInlinePropertyKey(nameKey)
            .buildCompositeIndex();

        finishSchema();

        String name1 = "Mizar";
        String name2 = "Alcor";

        String city = "Chicago";
        tx.addVertex("id", 100, "name", name1, "city", city);
        tx.addVertex("id", 200, "name", name2, "city", city);
        tx.commit();

        tx = graph.buildTransaction()
            .propertyPrefetching(false) //this is important
            .start();

        Vertex v = (tx.traversal().V().has("id", 100).next());
        assertEquals(name1, v.value("name"));

        //Update inlined property
        v.property("name", "newName");
        tx.commit();

        tx = graph.buildTransaction()
            .propertyPrefetching(false) //this is important
            .start();

        v = (tx.traversal().V().has("id", 100).next());
        assertEquals("newName", v.value("name"));
    }

    @Test
    public void testIndexInlinePropertiesLimit() throws NoSuchMethodException {

        clopen(option(FORCE_INDEX_USAGE), true);

        final PropertyKey nameKey = makeKey("name", String.class);
        final PropertyKey cityKey = makeKey("city", String.class);

        mgmt.buildIndex("composite", Vertex.class)
            .addKey(cityKey)
            .addInlinePropertyKey(nameKey)
            .buildCompositeIndex();

        finishSchema();

        String city = "Chicago";
        for (int i = 0; i < 10; i++) {
            String name = "name_" + i;
            tx.addVertex("name", name, "city", city);
        }
        tx.commit();

        tx = graph.buildTransaction()
            .propertyPrefetching(false) //this is important
            .start();

        Method m = VertexCentricQueryBuilder.class.getSuperclass().getDeclaredMethod("constructQuery", RelationCategory.class);
        m.setAccessible(true);

        List<Vertex> vertices = tx.traversal().V().has("city", city).limit(3).toList();
        assertEquals(3, vertices.size());
        vertices.stream().map(v -> (CacheVertex) v).forEach(v -> {
            verifyPropertyLoaded(v, "name", true, m);
            verifyPropertyLoaded(v, "city", false, m);
        });
    }

    private void verifyPropertyLoaded(CacheVertex v, String propertyName, Boolean isPresent, Method m) {
        VertexCentricQueryBuilder queryBuilder = v.query().direction(Direction.OUT);
        //Build the property query via reflection, then check whether the vertex cache already holds its result
        BaseVertexCentricQuery nameQuery = null;
        try {
            nameQuery = (BaseVertexCentricQuery) m.invoke(queryBuilder.keys(propertyName), RelationCategory.PROPERTY);
        } catch (IllegalAccessException | InvocationTargetException e) {
            throw new RuntimeException(e);
        }
        Boolean result = v.hasLoadedRelations(nameQuery.getSubQuery(0).getBackendQuery());
        assertEquals(isPresent, result);
    }

    @Test
    public void testCompositeAndMixedIndexing() {
        final PropertyKey name = makeKey("name", String.class);