This microservice publishes additions and modifications of resources to an LDES backend. The /publish
endpoint of this service can be added to a delta notifier configuration (https://github.com/mu-semtech/delta-notifier). This microservice has been verified to work with the ldes-producer NPM library (https://github.com/lblod/ldes-producer).
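As a rough illustration of wiring the /publish endpoint into a delta notifier, the sketch below builds one rule in the shape the delta-notifier project documents for its config/rules.js (match, callback, options). The service hostname ldes-delta-pusher, the empty match (forward all deltas), and the option values are assumptions to adapt to your stack; the DeltaRule type is hypothetical and exists only to make the example self-contained.

```typescript
// Hypothetical types describing the shape of a delta-notifier rule.
interface Term {
  type: string;
  value: string;
}

interface DeltaRule {
  match: { subject?: Term; predicate?: Term; object?: Term };
  callback: { url: string; method: string };
  options: { resourceFormat: string; gracePeriod: number; ignoreFromSelf: boolean };
}

const rules: DeltaRule[] = [
  {
    // An empty match pattern forwards every delta to the callback.
    match: {},
    callback: {
      // Assumption: the service is named "ldes-delta-pusher" in docker-compose.yml.
      url: "http://ldes-delta-pusher/publish",
      method: "POST",
    },
    options: {
      // Assumption: these values mirror common delta-notifier examples.
      resourceFormat: "v0.0.1",
      gracePeriod: 1000,
      ignoreFromSelf: true,
    },
  },
];
```

In a real stack this array would be the default export of the delta notifier's config/rules.js (written as plain JavaScript, without the type annotations).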
The following environment variables can be provided:
- LDES_FOLDER: the subfolder to store the LDES streams in.
- DATA_FOLDER: the parent folder to store the LDES streams in (defaults to /data).
- LDES_FRAGMENTER (optional): the fragmenter which should be applied when adding new resources. For time-based streams, this will typically be time-fragmenter.
- WRITE_INITIAL_STATE: if set to "true", this writes the current state of the database to the LDES stream as one large page. Default: "false". Writes to /data/${streamname}/${firstfilebynumericalsort}.ttl or creates the file 1.ttl if no file exists. Streams are configured in config/initialization.ts.
- LDES_BASE: base URL to be used for the LDES stream that is published. Defaults to http://lmb.lblod.info/streams/ldes. Only used if WRITE_INITIAL_STATE is true.
- MAX_PAGE_SIZE_BYTES: the maximum size of every initial LDES page file in bytes. Defaults to 10000000. Only used if WRITE_INITIAL_STATE is true.
- INITIAL_STATE_LIMIT: the limit to use when writing batches to the initial state file. Note: every batch will have its own prefixes, which means prefixes are redefined (allowed by the Turtle spec and Virtuoso). Defaults to 10000. Only used if WRITE_INITIAL_STATE is true.
- DIRECT_DB_ENDPOINT: writing the initial state requires a direct connection to the database (we use ttl directly). This is the URL of the database. Defaults to http://virtuoso:8890/sparql. Only used if WRITE_INITIAL_STATE is true.
- AUTO_HEALING: whether or not to use the auto-healing functionality for the LDES stream. Set to "true" to activate auto healing. Defaults to "false".
- CRON_HEALING: the cron config for how often to trigger auto healing. Defaults to 0 * * * * (so every hour).
- HEALING_LIMIT: the number of instances to heal in one iteration of the auto healing. Defaults to 1000. Only used if AUTO_HEALING is true.
- HEALING_DUMP_GRAPH: the (temporary) graph that is used to receive the raw triples posted on the LDES. Defaults to http://mu.semte.ch/graphs/ldes-dump. This graph is cleared every time the healing process is run.
- HEALING_TRANSFORMED_GRAPH: the (temporary) graph where the processed LDES data is stored. This holds the latest version of the LDES instances so they can be compared with what is currently in the database. Defaults to http://mu.semte.ch/graphs/transformed-ldes-data.
- HEALING_BATCH_SIZE: the number of triples that are written to the dump graph at a time. Defaults to 100.
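To make the defaults above concrete, here is a minimal sketch of how these variables could be read into a typed configuration object. The names PusherConfig, loadConfig, and intFromEnv are hypothetical and are not taken from the service's code; only the variable names and default values come from the list above. Variables without a documented default are left undefined.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical config shape; field names are illustrative only.
interface PusherConfig {
  ldesFolder: string | undefined;
  dataFolder: string;
  ldesFragmenter: string | undefined;
  writeInitialState: boolean;
  ldesBase: string;
  maxPageSizeBytes: number;
  initialStateLimit: number;
  directDbEndpoint: string;
  autoHealing: boolean;
  cronHealing: string;
  healingLimit: number;
  healingDumpGraph: string;
  healingTransformedGraph: string;
  healingBatchSize: number;
}

// Parse an integer variable, falling back to the documented default when unset.
function intFromEnv(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number.parseInt(raw, 10);
  if (Number.isNaN(parsed)) {
    throw new Error(`${name} must be an integer, got "${raw}"`);
  }
  return parsed;
}

function loadConfig(env: Env): PusherConfig {
  return {
    ldesFolder: env["LDES_FOLDER"],
    dataFolder: env["DATA_FOLDER"] ?? "/data",
    ldesFragmenter: env["LDES_FRAGMENTER"],
    // Boolean flags are only active when set to the literal string "true".
    writeInitialState: env["WRITE_INITIAL_STATE"] === "true",
    ldesBase: env["LDES_BASE"] ?? "http://lmb.lblod.info/streams/ldes",
    maxPageSizeBytes: intFromEnv(env, "MAX_PAGE_SIZE_BYTES", 10000000),
    initialStateLimit: intFromEnv(env, "INITIAL_STATE_LIMIT", 10000),
    directDbEndpoint: env["DIRECT_DB_ENDPOINT"] ?? "http://virtuoso:8890/sparql",
    autoHealing: env["AUTO_HEALING"] === "true",
    cronHealing: env["CRON_HEALING"] ?? "0 * * * *",
    healingLimit: intFromEnv(env, "HEALING_LIMIT", 1000),
    healingDumpGraph: env["HEALING_DUMP_GRAPH"] ?? "http://mu.semte.ch/graphs/ldes-dump",
    healingTransformedGraph:
      env["HEALING_TRANSFORMED_GRAPH"] ?? "http://mu.semte.ch/graphs/transformed-ldes-data",
    healingBatchSize: intFromEnv(env, "HEALING_BATCH_SIZE", 100),
  };
}
```

In a Node.js service this would typically be called once at startup as loadConfig(process.env). Taking the environment as a parameter keeps the function easy to test.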
The LDES delta pusher can fetch its own stream(s) and compare the final result with what is currently in the database. If it discovers changes, it will trigger a new dispatch of the affected instance to the stream.
The stream is read directly from the backend service (using the internal docker compose network) and stored in a temporary graph in the database. The default implementation only looks at the dct:modified time of the instances. The assumption here is that if the modified time is the same, then all other data on the stream is also up to date. However, by adding other predicates to the healingPredicates array in the config, you can also have the healing process check for values of other predicates that are not on the LDES stream. Have a look at the example config in config/healing.ts, which clarifies the meaning of each value.
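The default dct:modified comparison described above can be pictured with the following in-memory sketch. It is not the service's implementation, which works against the temporary graphs in the database; the InstanceState type and findInstancesToHeal function are hypothetical. The sketch also assumes that an instance missing from the stream counts as needing healing, and that the limit plays the role of HEALING_LIMIT.

```typescript
// Hypothetical snapshot of one instance: its URI and dct:modified as an ISO 8601 string.
interface InstanceState {
  uri: string;
  modified: string;
}

// Return the URIs of database instances whose modified time differs from the
// latest version on the stream, capped at `limit` instances per iteration.
function findInstancesToHeal(
  database: InstanceState[],
  stream: InstanceState[],
  limit: number,
): string[] {
  const streamModified = new Map<string, number>();
  for (const instance of stream) {
    streamModified.set(instance.uri, Date.parse(instance.modified));
  }

  const toHeal: string[] = [];
  for (const instance of database) {
    if (toHeal.length >= limit) break;
    const onStream = streamModified.get(instance.uri);
    // Assumption: an instance that is absent from the stream also needs a dispatch.
    if (onStream === undefined || onStream !== Date.parse(instance.modified)) {
      toHeal.push(instance.uri);
    }
  }
  return toHeal;
}
```

Each returned URI would then be dispatched to the stream again. Checking additional healingPredicates would amount to comparing more fields than the modified time in the same loop.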