This is a dockerized demo setup containing:
- a Kafka setup
  - a mock data loader that loads R4 FHIR resources from mock-data-kdb.ndjson into the Kafka topic "fhir.post-gateway-kdb"
  - the AKHQ GUI on http://localhost:8082
- a Spark setup
  - master on http://localhost:8083/
- a Pathling container built from a Dockerfile (which installs the Pathling Python API and defines the necessary PySpark submit arguments)
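
For orientation, the following is a minimal sketch of what a mock data loader of this kind might look like. It is not the loader shipped in this setup. It assumes the `kafka-python` package, a broker reachable at `localhost:9092`, and a message key of the form `resourceType/id`; all three are illustrative assumptions. Only the file name and the topic name come from this README.

```python
import json


def ndjson_to_records(lines):
    """Yield (key, value) byte pairs, one per non-empty NDJSON line.

    The key format "resourceType/id" is an assumption made for this sketch.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between resources
        resource = json.loads(line)
        key = f"{resource.get('resourceType', 'Unknown')}/{resource.get('id', '')}"
        yield key.encode("utf-8"), json.dumps(resource).encode("utf-8")


def load_file(path="mock-data-kdb.ndjson",
              topic="fhir.post-gateway-kdb",
              bootstrap_servers="localhost:9092"):  # broker address is assumed
    """Send every resource in the NDJSON file to the Kafka topic."""
    # Imported lazily so the parsing helper works without kafka-python installed.
    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers=bootstrap_servers)
    with open(path, encoding="utf-8") as handle:
        for key, value in ndjson_to_records(handle):
            producer.send(topic, key=key, value=value)
    producer.flush()  # make sure all buffered messages are delivered
    producer.close()
```

Calling `load_file()` with a running broker would publish one message per line of the file. The parsing step is kept separate from the producer so it can be checked without Kafka.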
To start the Kafka containers, the mock data loader, and the Pathling container (which includes JupyterLab), run the following command:
# if not executable, first run "chmod +x start.sh"
./start.sh
This script runs kafka_stream_con.py inside the container, which:
- starts the SparkSession
- reads the Kafka topic into Spark
- prints a key-value table containing the R4 FHIR resources
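
The steps above can be sketched roughly as follows. This is an illustration and not the contents of kafka_stream_con.py. It assumes that the Spark Kafka connector is already on the classpath (for example via the PySpark submit arguments defined in the Dockerfile) and that the broker is reachable as `kafka:9092`, which is a hypothetical address. The function names are made up for this example.

```python
def kafka_source_options(topic, bootstrap_servers="kafka:9092"):
    """Build the options for Spark's Kafka source.

    The default broker address is an assumption made for this sketch.
    """
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        "startingOffsets": "earliest",  # also read messages already in the topic
    }


def stream_topic_to_console(topic="fhir.post-gateway-kdb"):
    """Start a SparkSession, read the topic, and print key-value rows."""
    # Imported lazily so the options helper can be used without PySpark.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("kafka-fhir-demo").getOrCreate()

    # Kafka delivers key and value as binary columns.
    raw = (
        spark.readStream.format("kafka")
        .options(**kafka_source_options(topic))
        .load()
    )

    # Cast both columns to strings so the FHIR JSON is readable.
    table = raw.selectExpr(
        "CAST(key AS STRING) AS key",
        "CAST(value AS STRING) AS value",
    )

    # Print each micro-batch to the console without truncating the JSON.
    query = (
        table.writeStream.format("console")
        .option("truncate", "false")
        .start()
    )
    query.awaitTermination()
```

Calling `stream_topic_to_console()` inside a container with Spark and the Kafka connector available would block and print new rows as they arrive on the topic.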
To stop the containers, run the following command:
# if not executable, first run "chmod +x stop.sh"
./stop.sh
To use JupyterLab, run the following command and open the URL printed in the logs in a browser:
docker logs -f jupyter-pathling