Expose BerkeleyJE configs
Related to #1623 and #4425

Signed-off-by: Oleksandr Porunov <alexandr.porunov@gmail.com>
porunov committed Oct 27, 2024
1 parent 213b754 commit e849077
Showing 9 changed files with 191 additions and 34 deletions.
11 changes: 10 additions & 1 deletion docs/changelog.md
@@ -98,7 +98,7 @@ For more information on features and bug fixes in 1.1.0, see the GitHub mileston
* [JanusGraph zip](https://github.com/JanusGraph/janusgraph/releases/download/v1.1.0/janusgraph-1.1.0.zip)
* [JanusGraph zip with embedded Cassandra and ElasticSearch](https://github.com/JanusGraph/janusgraph/releases/download/v1.1.0/janusgraph-full-1.1.0.zip)

##### Upgrade Instructions
#### Upgrade Instructions

##### Inlining vertex properties into a Composite Index

@@ -115,6 +115,15 @@ See [documentation](./schema/index-management/index-performance.md#inlining-vert
It is critical that users carefully plan their migration to this new version, as there is no automated or manual rollback process
to revert to an older version of JanusGraph once this feature is used.

##### BerkeleyJE ability to overwrite arbitrary settings applied at `EnvironmentConfig` creation

The new namespace `storage.berkeleyje.ext` allows setting BerkeleyDB JE configuration options that JanusGraph does not
expose directly.
The full list of available settings is defined in the Java class `com.sleepycat.je.EnvironmentConfig`.
All configuration values must be specified as `String` and formatted as described in the official Sleepycat
[documentation](https://docs.oracle.com/cd/E17277_02/html/java/com/sleepycat/je/EnvironmentConfig.html).
Example: `storage.berkeleyje.ext.je.lock.timeout=5000 ms`

### Version 1.0.1 (Release Date: ???)

/// tab | Maven
12 changes: 12 additions & 0 deletions docs/configs/janusgraph-cfg.md
@@ -423,6 +423,18 @@ BerkeleyDB JE configuration options
| storage.berkeleyje.lock-mode | The BDB record lock mode used for read operations | String | LockMode.DEFAULT | MASKABLE |
| storage.berkeleyje.shared-cache | If true, the shared cache is used for all graph instances | Boolean | true | MASKABLE |

### storage.berkeleyje.ext
Overrides for arbitrary settings applied at `EnvironmentConfig` creation.
The full list of available settings is defined in the Java class `com.sleepycat.je.EnvironmentConfig`. All configuration values should be specified as `String` and formatted as described in the following [documentation](https://docs.oracle.com/cd/E17277_02/html/java/com/sleepycat/je/EnvironmentConfig.html).
Note that, for compatibility reasons, the `-` character may be used instead of `.` in config keys. All dashes are replaced by dots when the keys are passed to `EnvironmentConfig`.


| Name | Description | Datatype | Default Value | Mutability |
| ---- | ---- | ---- | ---- | ---- |
| storage.berkeleyje.ext.je-lock-timeout | Lock timeout configuration. `0` disables the lock timeout completely. To set the lock timeout via this option, a String-formatted time representation is required. For example: `500 ms`, `5 min`, etc.
See information about value constraints in the official [sleepycat documentation](https://docs.oracle.com/cd/E17277_02/html/java/com/sleepycat/je/EnvironmentConfig.html#LOCK_TIMEOUT).
Note that this option can also be specified as `storage.berkeleyje.ext.je.lock.timeout`, which is treated the same as this configuration option. | String | (no default value) | MASKABLE |

### storage.cql
CQL storage backend options

46 changes: 46 additions & 0 deletions docs/storage-backend/bdb.md
@@ -85,3 +85,49 @@ In order to not run out of memory, it is advised to disable transactions
transactions enabled requires BerkeleyDB to acquire read locks on the
data it is reading. When iterating over the entire graph, these read
locks can easily require more memory than is available.

## Additional BerkeleyDB JE configuration options

Additional BerkeleyDB JE configuration options that are not directly
exposed by JanusGraph can be set through the `storage.berkeleyje.ext`
namespace.

JanusGraph iterates over all properties prefixed with
`storage.berkeleyje.ext.`. It strips the prefix from each property key.
Any dash character (`-`) in the remainder of the stripped key is replaced
by a dot character (`.`). The final string is interpreted as a parameter
key for `com.sleepycat.je.EnvironmentConfig`.
Thus, both options `storage.berkeleyje.ext.je.lock.timeout` and
`storage.berkeleyje.ext.je-lock-timeout` are treated the same: each
resolves to the BerkeleyDB JE parameter `je.lock.timeout`.
The value associated with the key is not modified.
This allows embedding arbitrary settings in JanusGraph’s properties. Here’s an
example configuration fragment that customizes four BerkeleyDB settings
using the `storage.berkeleyje.ext.` config mechanism:

```properties
storage.backend=berkeleyje
storage.berkeleyje.ext.je.lock.timeout=5000 ms
storage.berkeleyje.ext.je.lock.deadlockDetect=false
storage.berkeleyje.ext.je.txn.timeout=5000 ms
storage.berkeleyje.ext.je.log.fileMax=100000000
```
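
The key handling described above can be sketched in plain Java. This is a minimal illustration rather than the JanusGraph implementation: the class `ExtKeySketch` and its methods are hypothetical names, and the sketch works on a plain `Map` instead of a JanusGraph `Configuration`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: mirrors the documented key handling with hypothetical names.
final class ExtKeySketch {

    // Prefix taken from the documentation above.
    static final String EXT_PREFIX = "storage.berkeleyje.ext.";

    // Every '-' in the stripped key becomes '.', as described above.
    static String toBerkeleyKey(String strippedKey) {
        return strippedKey.replace('-', '.');
    }

    // Collects entries under the prefix, strips the prefix and normalizes the key.
    // Values are copied unchanged.
    static Map<String, String> extractExtSettings(Map<String, String> janusGraphProps) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : janusGraphProps.entrySet()) {
            if (entry.getKey().startsWith(EXT_PREFIX)) {
                String stripped = entry.getKey().substring(EXT_PREFIX.length());
                result.put(toBerkeleyKey(stripped), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("storage.backend", "berkeleyje");
        props.put("storage.berkeleyje.ext.je-lock-timeout", "5000 ms");
        props.put("storage.berkeleyje.ext.je.txn.timeout", "5000 ms");
        // Only the two ext.* entries survive, keyed by their BerkeleyDB JE names.
        System.out.println(extractExtSettings(props));
    }
}
```

Because the dashed and dotted spellings normalize to the same key, supplying both for one setting in this sketch means the later entry overwrites the earlier one. How JanusGraph itself resolves such a conflict is not covered here.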

## Deadlock troubleshooting

In concurrent environments, deadlocks are possible when using the BerkeleyDB JE
storage backend.
Deadlocks can be difficult to handle in use cases where multiple threads
modify the same vertices (including creating edges between the affected vertices).
More insights on this topic can be found in the GitHub issue
[#1623](https://github.com/JanusGraph/janusgraph/issues/1623).

Some users suggest the following configuration to deal with deadlocks:
```properties
storage.berkeleyje.isolation-level=READ_UNCOMMITTED
storage.berkeleyje.lock-mode=LockMode.READ_UNCOMMITTED
storage.berkeleyje.ext.je.lock.timeout=0
storage.lock.wait-time=5000
ids.authority.wait-time=2000
tx.max-commit-time=30000
```
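
The fragment above sets `je.lock.timeout` to `0`, while other examples use values such as `5000 ms`. The sketch below shows one way to produce such value strings from a `java.time.Duration`. It is an assumption-laden illustration: `JeTimeoutSketch` and `toJeDuration` are hypothetical names, and the sketch only emits millisecond values or `0`, based on the examples given in this documentation.

```java
import java.time.Duration;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative helper; the class and method names are hypothetical, not JanusGraph API.
final class JeTimeoutSketch {

    // Renders a Duration as "<millis> ms"; a zero duration is rendered as "0".
    static String toJeDuration(Duration duration) {
        if (duration.isNegative()) {
            throw new IllegalArgumentException("duration must not be negative");
        }
        if (duration.isZero()) {
            return "0";
        }
        long millis = duration.toMillis();
        if (millis == 0) {
            // Avoid silently turning a tiny non-zero duration into "0 ms".
            throw new IllegalArgumentException("sub-millisecond durations are not handled by this sketch");
        }
        return millis + " ms";
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("storage.berkeleyje.ext.je.lock.timeout", toJeDuration(Duration.ZERO));
        props.put("storage.berkeleyje.ext.je.txn.timeout", toJeDuration(Duration.ofSeconds(5)));
        // Print the entries in properties-file form.
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```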
Original file line number Diff line number Diff line change
@@ -14,7 +14,6 @@

package org.janusgraph.diskstorage.berkeleyje;


import com.google.common.base.Preconditions;
import com.sleepycat.je.CacheMode;
import com.sleepycat.je.Database;
@@ -44,6 +43,7 @@
import org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration;
import org.janusgraph.graphdb.configuration.PreInitializeConfigOptions;
import org.janusgraph.graphdb.transaction.TransactionConfiguration;
import org.janusgraph.util.system.ConfigurationUtil;
import org.janusgraph.util.system.IOUtils;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
@@ -88,6 +88,27 @@ public class BerkeleyJEStoreManager extends LocalStoreManager implements Ordered
ConfigOption.Type.MASKABLE, String.class,
IsolationLevel.REPEATABLE_READ.toString(), disallowEmpty(String.class));

public static final ConfigNamespace BERKELEY_EXTRAS_NS =
new ConfigNamespace(BERKELEY_NS, "ext", "Overrides for arbitrary settings applied at `EnvironmentConfig` creation.\n" +
"The full list of available settings is defined in the Java class `com.sleepycat.je.EnvironmentConfig`. " +
"All configuration values should be specified as `String` and formatted as described in the following " +
"[documentation](https://docs.oracle.com/cd/E17277_02/html/java/com/sleepycat/je/EnvironmentConfig.html).\n" +
"Note that, for compatibility reasons, the `-` character may be used instead of `.` in config keys. All dashes are " +
"replaced by dots when the keys are passed to `EnvironmentConfig`.");

// This setting isn't used directly in Java, but this setting will be picked up indirectly during parsing of the
// subset configuration of `BERKELEY_EXTRAS_NS` namespace
public static final ConfigOption<String> EXT_LOCK_TIMEOUT =
new ConfigOption<>(BERKELEY_EXTRAS_NS, toJanusGraphConfigKey(EnvironmentConfig.LOCK_TIMEOUT),
String.format("Lock timeout configuration. `0` disables the lock timeout completely. " +
"To set the lock timeout via this option, a " +
"String-formatted time representation is required. For example: `500 ms`, `5 min`, etc. \nSee information about value " +
"constraints in the official " +
"[sleepycat documentation](https://docs.oracle.com/cd/E17277_02/html/java/com/sleepycat/je/EnvironmentConfig.html#LOCK_TIMEOUT).\n" +
"Note that this option can also be specified as `%s`, which is treated the same as this configuration option.",
BERKELEY_EXTRAS_NS.toStringWithoutRoot() + "." + EnvironmentConfig.LOCK_TIMEOUT
), ConfigOption.Type.MASKABLE, String.class);

private final Map<String, BerkeleyJEKeyValueStore> stores;

protected Environment environment;
@@ -132,12 +153,32 @@ private void initialize(int cachePercent, final boolean sharedCache, final Cache
}

//Open the environment
Map<String, String> extraSettings = getSettingsFromJanusGraphConf(storageConfig);
extraSettings.forEach((key, value) -> envConfig.setConfigParam(toBerkeleyConfigKey(key), value));

// Open the environment
environment = new Environment(directory, envConfig);

} catch (DatabaseException e) {
throw new PermanentBackendException("Error during BerkeleyJE initialization: ", e);
}
}

public static String toBerkeleyConfigKey(String janusGraphConfigKey){
return janusGraphConfigKey.replace("-", ".");
}

public static String toJanusGraphConfigKey(String berkeleyConfigKey){
return berkeleyConfigKey.replace(".", "-");
}

static Map<String, String> getSettingsFromJanusGraphConf(Configuration config) {
final Map<String, String> settings = ConfigurationUtil.getSettingsFromJanusGraphConf(config, BERKELEY_EXTRAS_NS);
if(log.isDebugEnabled()){
settings.forEach((key, val) -> log.debug("[BERKELEY ext.* cfg] Set {}: {}", key, val));
log.debug("Loaded {} settings from the {} JanusGraph config namespace", settings.size(), BERKELEY_EXTRAS_NS);
}
return settings;
}

@Override
@@ -335,4 +376,8 @@ private TransactionBegin(String msg) {
super(msg);
}
}

public Environment getEnvironment(){
return environment;
}
}
Original file line number Diff line number Diff line change
@@ -15,6 +15,8 @@
package org.janusgraph.graphdb.berkeleyje;

import com.google.common.base.Preconditions;
import com.sleepycat.je.Environment;
import com.sleepycat.je.EnvironmentConfig;
import com.sleepycat.je.LockMode;
import org.janusgraph.BerkeleyStorageSetup;
import org.janusgraph.core.JanusGraphException;
@@ -27,6 +29,7 @@
import org.janusgraph.diskstorage.configuration.ConfigOption;
import org.janusgraph.diskstorage.configuration.ModifiableConfiguration;
import org.janusgraph.diskstorage.configuration.WriteConfiguration;
import org.janusgraph.diskstorage.keycolumnvalue.keyvalue.OrderedKeyValueStoreManagerAdapter;
import org.janusgraph.graphdb.JanusGraphTest;
import org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration;
import org.junit.jupiter.api.Disabled;
@@ -36,6 +39,7 @@

import java.time.Duration;
import java.time.temporal.ChronoUnit;
import java.util.concurrent.TimeUnit;

import static org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration.ALLOW_SETTING_VERTEX_ID;
import static org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration.ALLOW_CUSTOM_VERTEX_ID_TYPES;
@@ -49,6 +53,12 @@ public class BerkeleyGraphTest extends JanusGraphTest {
private static final Logger log =
LoggerFactory.getLogger(BerkeleyGraphTest.class);

public EnvironmentConfig getCurrentEnvironmentConfig() {
BerkeleyJEStoreManager storeManager = (BerkeleyJEStoreManager) ((OrderedKeyValueStoreManagerAdapter) graph.getBackend().getStoreManager()).getManager();
Environment environment = storeManager.getEnvironment();
return environment.getConfig();
}

@Override
public WriteConfiguration getConfiguration() {
ModifiableConfiguration modifiableConfiguration = BerkeleyStorageSetup.getBerkeleyJEConfiguration();
@@ -162,4 +172,24 @@ public void testCannotUseCustomStringId() {
() -> clopen(option(ALLOW_SETTING_VERTEX_ID), true, option(ALLOW_CUSTOM_VERTEX_ID_TYPES), true));
assertEquals("allow-custom-vid-types is not supported for OrderedKeyValueStore", ex.getMessage());
}

@Test
public void testExposedConfigurations() throws BackendException {
clopen(option(BerkeleyJEStoreManager.EXT_LOCK_TIMEOUT), "4321 ms");
assertEquals(4321, getCurrentEnvironmentConfig().getLockTimeout(TimeUnit.MILLISECONDS));
close();
WriteConfiguration configuration = getConfiguration();
clearGraph(configuration);
configuration.set(BerkeleyJEStoreManager.BERKELEY_EXTRAS_NS.toStringWithoutRoot()+"."+EnvironmentConfig.LOCK_TIMEOUT, "12345 ms");
open(configuration);
assertEquals(12345, getCurrentEnvironmentConfig().getLockTimeout(TimeUnit.MILLISECONDS));
close();
clearGraph(configuration);
configuration.set(BerkeleyJEStoreManager.BERKELEY_EXTRAS_NS.toStringWithoutRoot()+"."+EnvironmentConfig.ENV_IS_TRANSACTIONAL, "true");
open(configuration);
assertTrue(getCurrentEnvironmentConfig().getTransactional());
close();
clearGraph(configuration);
}

}
Original file line number Diff line number Diff line change
@@ -275,5 +275,8 @@ public static Predicate<Long> positiveLong() {
return num -> num!=null && num>0;
}

public static Predicate<Long> nonnegativeLong() {
return num -> num!=null && num>=0;
}

}
Original file line number Diff line number Diff line change
@@ -155,4 +155,8 @@ public List<KeyRange> getLocalKeyPartition() throws BackendException {
public String getName() {
return manager.getName();
}

public OrderedKeyValueStoreManager getManager() {
return manager;
}
}
Original file line number Diff line number Diff line change
@@ -14,6 +14,7 @@

package org.janusgraph.util.system;

import com.google.common.base.Joiner;
import com.google.common.base.Preconditions;
import org.apache.commons.configuration2.BaseConfiguration;
import org.apache.commons.configuration2.Configuration;
@@ -24,11 +25,14 @@
import org.apache.commons.configuration2.builder.fluent.PropertiesBuilderParameters;
import org.apache.commons.configuration2.convert.DefaultListDelimiterHandler;
import org.apache.commons.configuration2.ex.ConfigurationException;
import org.janusgraph.diskstorage.configuration.ConfigNamespace;

import java.io.File;
import java.lang.reflect.Array;
import java.lang.reflect.Constructor;
import java.lang.reflect.InvocationTargetException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
@@ -169,4 +173,34 @@ private static PropertiesConfiguration loadPropertiesConfig(PropertiesBuilderPar
}
return builder.configure(newParams).getConfiguration();
}

public static Map<String, String> getSettingsFromJanusGraphConf(org.janusgraph.diskstorage.configuration.Configuration config, ConfigNamespace namespace) {

final Map<String, String> settings = new HashMap<>();

final Map<String,Object> configSub = config.getSubset(namespace);
for (Map.Entry<String,Object> entry : configSub.entrySet()) {
String key = entry.getKey();
Object val = entry.getValue();
if (null == val) continue;
if (List.class.isAssignableFrom(val.getClass())) {
// Pretty print lists using comma-separated values and no surrounding square braces for ES
List l = (List) val;
settings.put(key, Joiner.on(",").join(l));
} else if (val.getClass().isArray()) {
// As with Lists, but now for arrays
// The Object copy[] business lets us avoid repetitive primitive array type checking and casting
Object[] copy = new Object[Array.getLength(val)];
for (int i= 0; i < copy.length; i++) {
copy[i] = Array.get(val, i);
}
settings.put(key, Joiner.on(",").join(copy));
} else {
// Copy anything else unmodified
settings.put(key, val.toString());
}
}

return settings;
}
}
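
The value handling in `getSettingsFromJanusGraphConf` above (lists and arrays flattened to comma-separated strings, everything else passed through `toString()`, `null` values skipped) can be tried in isolation with the following sketch. It is not the committed code: `FlattenSketch` is a hypothetical name, and it uses standard-library joining instead of Guava's `Joiner`, so its treatment of `null` elements inside a list or array may differ from the original.

```java
import java.lang.reflect.Array;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Stand-alone illustration of the flattening rules; names are hypothetical.
final class FlattenSketch {

    // Converts a single non-null configuration value to its String form.
    static String flatten(Object val) {
        if (val instanceof List) {
            // Lists become comma-separated values without surrounding brackets.
            return ((List<?>) val).stream().map(String::valueOf).collect(Collectors.joining(","));
        }
        if (val.getClass().isArray()) {
            // Reflection handles primitive and object arrays uniformly.
            List<String> parts = new ArrayList<>();
            for (int i = 0; i < Array.getLength(val); i++) {
                parts.add(String.valueOf(Array.get(val, i)));
            }
            return String.join(",", parts);
        }
        // Anything else is copied via toString().
        return val.toString();
    }

    // Applies flatten to every entry, skipping entries whose value is null.
    static Map<String, String> flattenAll(Map<String, Object> input) {
        Map<String, String> settings = new HashMap<>();
        for (Map.Entry<String, Object> entry : input.entrySet()) {
            if (entry.getValue() != null) {
                settings.put(entry.getKey(), flatten(entry.getValue()));
            }
        }
        return settings;
    }

    public static void main(String[] args) {
        Map<String, Object> input = new HashMap<>();
        input.put("je.lock.timeout", "5000 ms");
        input.put("example.list", List.of(1, 2, 3));
        input.put("example.array", new int[] {4, 5});
        System.out.println(flattenAll(input));
    }
}
```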
Original file line number Diff line number Diff line change
@@ -14,17 +14,15 @@

package org.janusgraph.diskstorage.es;

import com.google.common.base.Joiner;
import com.google.common.base.Preconditions;
import org.janusgraph.diskstorage.configuration.Configuration;
import org.janusgraph.diskstorage.es.rest.RestClientSetup;
import org.janusgraph.util.system.ConfigurationUtil;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.io.IOException;
import java.lang.reflect.Array;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
@@ -53,36 +51,12 @@ public Connection connect(Configuration config) throws IOException {
};

static Map<String, Object> getSettingsFromJanusGraphConf(Configuration config) {

final Map<String, Object> settings = new HashMap<>();

int keysLoaded = 0;
final Map<String,Object> configSub = config.getSubset(ElasticSearchIndex.ES_CREATE_EXTRAS_NS);
for (Map.Entry<String,Object> entry : configSub.entrySet()) {
String key = entry.getKey();
Object val = entry.getValue();
if (null == val) continue;
if (List.class.isAssignableFrom(val.getClass())) {
// Pretty print lists using comma-separated values and no surrounding square braces for ES
List l = (List) val;
settings.put(key, Joiner.on(",").join(l));
} else if (val.getClass().isArray()) {
// As with Lists, but now for arrays
// The Object copy[] business lets us avoid repetitive primitive array type checking and casting
Object[] copy = new Object[Array.getLength(val)];
for (int i= 0; i < copy.length; i++) {
copy[i] = Array.get(val, i);
}
settings.put(key, Joiner.on(",").join(copy));
} else {
// Copy anything else unmodified
settings.put(key, val.toString());
}
log.debug("[ES ext.* cfg] Set {}: {}", key, val);
keysLoaded++;
final Map<String, String> settings = ConfigurationUtil.getSettingsFromJanusGraphConf(config, ElasticSearchIndex.ES_CREATE_EXTRAS_NS);
if(log.isDebugEnabled()){
settings.forEach((key, val) -> log.debug("[ES ext.* cfg] Set {}: {}", key, val));
log.debug("Loaded {} settings from the {} JanusGraph config namespace", settings.size(), ElasticSearchIndex.ES_CREATE_EXTRAS_NS);
}
log.debug("Loaded {} settings from the {} JanusGraph config namespace", keysLoaded, ElasticSearchIndex.ES_CREATE_EXTRAS_NS);
return settings;
return new HashMap<>(settings);
}

private static final Logger log = LoggerFactory.getLogger(ElasticSearchSetup.class);

1 comment on commit e849077

@github-actions

Benchmark

| Benchmark suite | Current: e849077 | Previous: 213b754 | Ratio |
| ---- | ---- | ---- | ---- |
| org.janusgraph.JanusGraphSpeedBenchmark.basicAddAndDelete | 12695.817297678543 ms/op | 12994.438964091325 ms/op | 0.98 |
| org.janusgraph.GraphCentricQueryBenchmark.getVertices | 969.7370391849593 ms/op | 957.3251909284766 ms/op | 1.01 |
| org.janusgraph.MgmtOlapJobBenchmark.runClearIndex | 215.41615324166668 ms/op | 216.45303196086957 ms/op | 1.00 |
| org.janusgraph.MgmtOlapJobBenchmark.runReindex | 340.33103963380955 ms/op | 342.81005004892853 ms/op | 0.99 |
| org.janusgraph.JanusGraphSpeedBenchmark.basicCount | 195.97803145353998 ms/op | 207.33680618088454 ms/op | 0.95 |
| org.janusgraph.CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection | 5094.166169623522 ms/op | 4953.295327365606 ms/op | 1.03 |
| org.janusgraph.CQLMultiQueryBenchmark.getElementsWithUsingEmitRepeatSteps | 16545.594969109246 ms/op | 16917.057558105356 ms/op | 0.98 |
| org.janusgraph.CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithSmallBatch | 19282.200540264646 ms/op | 18983.13907385985 ms/op | 1.02 |
| org.janusgraph.CQLMultiQueryMultiSlicesBenchmark.vertexCentricPropertiesFetching | 59327.58756403333 ms/op | 56527.85002600001 ms/op | 1.05 |
| org.janusgraph.CQLMultiQueryDropBenchmark.dropVertices | 1669.4958923473687 ms/op | 1570.8428983417461 ms/op | 1.06 |
| org.janusgraph.CQLMultiQueryBenchmark.getAllElementsTraversedFromOuterVertex | 8381.442515477169 ms/op | 8433.13502817794 ms/op | 0.99 |
| org.janusgraph.CQLMultiQueryBenchmark.getVerticesWithDoubleUnion | 393.9160047990152 ms/op | 384.2152506805113 ms/op | 1.03 |
| org.janusgraph.CQLMultiQueryMultiSlicesBenchmark.getValuesAllPropertiesWithUnlimitedBatch | 4641.871545085265 ms/op | 4227.1771161974975 ms/op | 1.10 |
| org.janusgraph.CQLMultiQueryBenchmark.getNames | 8207.434059236663 ms/op | 8339.221853925019 ms/op | 0.98 |
| org.janusgraph.CQLMultiQueryMultiSlicesBenchmark.getValuesThreePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection | 5751.128616707412 ms/op | 5604.356576582386 ms/op | 1.03 |
| org.janusgraph.CQLMultiQueryBenchmark.getLabels | 6848.481548023952 ms/op | 7082.884761983721 ms/op | 0.97 |
| org.janusgraph.CQLMultiQueryBenchmark.getVerticesFilteredByAndStep | 421.84189468630683 ms/op | 430.31039337061094 ms/op | 0.98 |
| org.janusgraph.CQLMultiQueryBenchmark.getVerticesFromMultiNestedRepeatStepStartingFromSingleVertex | 13006.633710193382 ms/op | 12459.636105572155 ms/op | 1.04 |
| org.janusgraph.CQLMultiQueryBenchmark.getVerticesWithCoalesceUsage | 358.3279515953258 ms/op | 357.5981502840734 ms/op | 1.00 |
| org.janusgraph.CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithAllMultiQuerySlicesUnderMaxRequestsPerConnection | 14227.201555317857 ms/op | 14793.559446997619 ms/op | 0.96 |
| org.janusgraph.CQLMultiQueryBenchmark.getIdToOutVerticesProjection | 245.21740515790424 ms/op | 245.84974412075837 ms/op | 1.00 |
| org.janusgraph.CQLMultiQueryMultiSlicesBenchmark.getValuesMultiplePropertiesWithUnlimitedBatch | 14519.640140464173 ms/op | 13806.414282860256 ms/op | 1.05 |
| org.janusgraph.CQLCompositeIndexInlinePropBenchmark.searchVertices | 1502.6868476209847 ms/op | 1511.142514571489 ms/op | 0.99 |
| org.janusgraph.CQLMultiQueryBenchmark.getNeighborNames | 8066.546144612916 ms/op | 8411.967305495045 ms/op | 0.96 |
| org.janusgraph.CQLMultiQueryBenchmark.getElementsWithUsingRepeatUntilSteps | 8819.547094282061 ms/op | 9104.974810254043 ms/op | 0.97 |
| org.janusgraph.CQLMultiQueryBenchmark.getAdjacentVerticesLocalCounts | 8583.240885545258 ms/op | 8793.398072298722 ms/op | 0.98 |

This comment was automatically generated by workflow using github-action-benchmark.
