Datawave Ingest Service is designed to layer on top of the existing ingest code to run ingest as a service in kubernetes. The service is built on Spring Boot to supply easy access to queues which will provide message chunks to be processed by ingest.
The datawave ingest microservices are broken down into these three systems
- Feeder service
- Ingest service
- Bundler service
These services will replace the existing flag maker, and ingest jobs.
Feeder service - monitors a directory for new files, file thresholds may be set to the number of files, size in bytes, or age of files. Files will be moved to a target directory and posted to a message topic. See README
Ingest Service - listens to a message queue, reads files from the location specified in the message and writes output to a directory along with a matching manifest file which references the original file. See README
Bundler Service - monitors a directory for sequence/manifest file pairs, based on file thresholds, size in bytes, or file age. Files are aggregated locally to a working directory, then pushed to a final target directory. All referenced manifest files are moved from their original location to a final destination by applying a find/replace based on configured properties. See README
- Docker (20.10.12+)
- Datawave (3.40.4)
Recommended to use an existing K8s environment. See README
git clone git@github.com:NationalSecurityAgency/datawave.git
cd datawave
git checkout 3.40.4
mvn clean install -DskipTests
mvn clean install -Pdocker
Feeder - See README
Ingest - See README
Bundler - See README
helm install ingest charts/
See RELASING