Rucio Kubernetes Tutorial

Preliminaries

  • Clone this repo to your local machine
git clone https://github.com/rucio/k8s-tutorial/

NOTE: All following commands should be run from the top-level directory of this repository.

Set up a Kubernetes cluster

You can skip this step if you have already set up a Kubernetes cluster.

  • Run the minikube setup script:
./scripts/setup-minikube.sh

Deploy Rucio, FTS and storage

You can perform either an automatic deployment or a manual deployment, as documented below.

Automatic deployment

  • Run the Rucio deployment script:
./scripts/deploy-rucio.sh

Manual deployment

Add repositories to Helm

helm repo add stable https://charts.helm.sh/stable
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo add rucio https://rucio.github.io/helm-charts

Apply secrets

kubectl apply -k ./secrets

(Optional) Delete existing Postgres volume claim

If a previous tutorial deployment on this cluster already installed Postgres, its PersistentVolumeClaim (PVC) still exists and must be deleted before you install Postgres again.

  1. Check whether the PVC exists:
kubectl get pvc data-postgres-postgresql-0

If the PVC exists, the command will print output similar to the following:

NAME                         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
data-postgres-postgresql-0   Bound    ...   8Gi        RWO            standard       <unset>                 4s

If the PVC does not exist, the command will return this message:

Error from server (NotFound): persistentvolumeclaims "data-postgres-postgresql-0" not found

You can skip to the next section if the PVC does not exist.

  2. If the PVC exists, patch it to allow deletion:
kubectl patch pvc data-postgres-postgresql-0 -p '{"metadata":{"finalizers":null}}'
  3. Delete the PVC:
kubectl delete pvc data-postgres-postgresql-0
  4. You might also need to uninstall Postgres if it is installed:
helm uninstall postgres
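
The check, patch and delete steps above can be combined into one guarded function. This is a minimal sketch, not part of the tutorial scripts: the function name `cleanup_postgres_pvc` is hypothetical, and the PVC name defaults to the one used in the steps above.

```shell
# Sketch: delete a leftover Postgres PVC only if it exists.
# cleanup_postgres_pvc is a hypothetical helper name, not part of this repo.
cleanup_postgres_pvc() {
  pvc="${1:-data-postgres-postgresql-0}"
  if kubectl get pvc "$pvc" >/dev/null 2>&1; then
    # Clear the finalizers first so the delete is not blocked, then delete.
    kubectl patch pvc "$pvc" -p '{"metadata":{"finalizers":null}}'
    kubectl delete pvc "$pvc"
  else
    echo "PVC $pvc not found, nothing to delete"
  fi
}

# Usage (run from a shell with kubectl configured for the tutorial cluster):
#   cleanup_postgres_pvc
```

The function leaves the Helm release alone; run `helm uninstall postgres` separately if the release is still installed.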

Install Postgres

helm install postgres bitnami/postgresql -f manifests/values-postgres.yaml

Verify that Postgres is running

kubectl get pod postgres-postgresql-0

Once the Postgres setup is complete, you should see STATUS: Running.

Start init container pod

  • Once Postgres is running, start the init container pod to set up the Rucio database:
kubectl apply -f manifests/init-pod.yaml
  • The init pod will take some time to complete. You can follow its logs via:
kubectl logs -f init

Verify that the init container pod setup is complete

kubectl get pod init

Once the init container pod setup is complete, you should see STATUS: Completed.
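
Instead of re-running `kubectl get pod` by hand, the status checks above can be polled in a loop. The sketch below is an illustration only: `wait_for_pod_phase` is a hypothetical helper, and the timeout and interval defaults are arbitrary. Note that the `Completed` status shown by `kubectl get pod` corresponds to the pod phase `Succeeded`.

```shell
# Sketch: poll a pod until it reaches a given phase or a timeout expires.
# wait_for_pod_phase POD PHASE [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# (hypothetical helper; the defaults are arbitrary. Use an interval of at least 1.)
wait_for_pod_phase() {
  pod="$1"; want="$2"; timeout="${3:-600}"; interval="${4:-5}"
  elapsed=0
  while [ "$elapsed" -le "$timeout" ]; do
    # .status.phase is one of Pending, Running, Succeeded, Failed, Unknown.
    phase="$(kubectl get pod "$pod" -o jsonpath='{.status.phase}' 2>/dev/null)"
    if [ "$phase" = "$want" ]; then
      echo "$pod reached phase $want"
      return 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "timed out waiting for $pod to reach $want (last phase: ${phase:-unknown})" >&2
  return 1
}

# Usage:
#   wait_for_pod_phase postgres-postgresql-0 Running
#   wait_for_pod_phase init Succeeded
```

A pod in phase `Running` is not necessarily ready to serve requests, so treat this as a convenience check rather than a readiness guarantee.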

Deploy the Rucio server

helm install server rucio/rucio-server -f manifests/values-server.yaml
  • You can check the deployment status via:
kubectl rollout status deployment server-rucio-server

Start the XRootD (XRD) storage container pods

  • This command will deploy three XRD storage container pods.
kubectl apply -f manifests/xrd.yaml

Deploy the FTS database (MySQL)

kubectl apply -f manifests/ftsdb.yaml
  • You can check the deployment status via:
kubectl rollout status deployment fts-mysql

Deploy the FTS server

  • Once the FTS database deployment is complete, install the FTS server:
kubectl apply -f manifests/fts.yaml
  • You can check the deployment status via:
kubectl rollout status deployment fts-server
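
Because the FTS server should only be installed after the FTS database is up, the apply and rollout-status steps can be chained. A minimal sketch, assuming the manifest paths and deployment names used above; `deploy_and_wait` is a hypothetical helper and the 300s default timeout is arbitrary.

```shell
# Sketch: apply a manifest, then block until its deployment has rolled out.
# deploy_and_wait MANIFEST DEPLOYMENT [TIMEOUT]   (hypothetical helper)
deploy_and_wait() {
  manifest="$1"; deployment="$2"; timeout="${3:-300s}"
  kubectl apply -f "$manifest" || return 1
  # rollout status exits non-zero if the rollout does not finish in time.
  kubectl rollout status deployment "$deployment" --timeout="$timeout"
}

# Usage: the FTS server is only applied if the database rollout succeeded.
#   deploy_and_wait manifests/ftsdb.yaml fts-mysql &&
#     deploy_and_wait manifests/fts.yaml fts-server
```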

Deploy the Rucio daemons

helm install daemons rucio/rucio-daemons -f manifests/values-daemons.yaml

This command might take a few minutes.

Troubleshooting

  • If a helm install fails at any point, remove the failed installation before re-installing:
helm list # list all helm installations
helm delete $installation
  • You might also get an error that a job already exists. You can remove the job via:
kubectl get jobs # get all jobs
kubectl delete jobs/$jobname
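
The remove-then-reinstall pattern above can be wrapped in a small function. This is a sketch under the assumption that a leftover release should always be discarded; `reinstall_release` is a hypothetical name, and it relies only on `helm status`, `helm uninstall` and `helm install`.

```shell
# Sketch: uninstall a release if one is present, then install it again.
# reinstall_release RELEASE CHART VALUES_FILE   (hypothetical helper)
reinstall_release() {
  release="$1"; chart="$2"; values="$3"
  # helm status exits non-zero when the release does not exist.
  if helm status "$release" >/dev/null 2>&1; then
    helm uninstall "$release"
  fi
  helm install "$release" "$chart" -f "$values"
}

# Usage:
#   reinstall_release daemons rucio/rucio-daemons manifests/values-daemons.yaml
```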

Use Rucio

Once the setup is complete, you can use Rucio by interacting with it via a client.

You can either run the provided script to showcase the usage of Rucio, or you can manually run the Rucio commands described in the Manual client usage section.

Client usage showcase script

  • Run the Rucio usage script:
./scripts/use-rucio.sh

Manual client usage

Start client container pod for interactive use

kubectl apply -f manifests/client.yaml
  • You can verify that the client container is running via:
kubectl get pod client

Once the client container pod setup is complete, you should see STATUS: Running.

Enter interactive shell in the client container

kubectl exec -it client -- /bin/bash

Create the Rucio Storage Elements (RSEs)

rucio-admin rse add XRD1
rucio-admin rse add XRD2
rucio-admin rse add XRD3

Add the protocol definitions for the storage servers

rucio-admin rse add-protocol --hostname xrd1 --scheme root --prefix //rucio --port 1094 --impl rucio.rse.protocols.gfal.Default --domain-json '{"wan": {"read": 1, "write": 1, "delete": 1, "third_party_copy_read": 1, "third_party_copy_write": 1}, "lan": {"read": 1, "write": 1, "delete": 1}}' XRD1
rucio-admin rse add-protocol --hostname xrd2 --scheme root --prefix //rucio --port 1094 --impl rucio.rse.protocols.gfal.Default --domain-json '{"wan": {"read": 1, "write": 1, "delete": 1, "third_party_copy_read": 1, "third_party_copy_write": 1}, "lan": {"read": 1, "write": 1, "delete": 1}}' XRD2
rucio-admin rse add-protocol --hostname xrd3 --scheme root --prefix //rucio --port 1094 --impl rucio.rse.protocols.gfal.Default --domain-json '{"wan": {"read": 1, "write": 1, "delete": 1, "third_party_copy_read": 1, "third_party_copy_write": 1}, "lan": {"read": 1, "write": 1, "delete": 1}}' XRD3
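
The three commands above differ only in the RSE name and its lowercase hostname, so they can be generated in a loop. A minimal sketch: the function name and the `RUCIO_ADMIN` variable are hypothetical conveniences (setting `RUCIO_ADMIN=echo` prints the commands instead of running them); the flags and JSON are copied from the commands above.

```shell
# Sketch: add the same protocol definition to XRD1..XRD3 in a loop.
# RUCIO_ADMIN is a hypothetical override; set RUCIO_ADMIN=echo for a dry run.
RUCIO_ADMIN="${RUCIO_ADMIN:-rucio-admin}"
DOMAINS='{"wan": {"read": 1, "write": 1, "delete": 1, "third_party_copy_read": 1, "third_party_copy_write": 1}, "lan": {"read": 1, "write": 1, "delete": 1}}'

add_xrd_protocols() {
  for rse in XRD1 XRD2 XRD3; do
    # The storage hostnames are the lowercase RSE names (xrd1, xrd2, xrd3).
    host="$(printf '%s' "$rse" | tr '[:upper:]' '[:lower:]')"
    $RUCIO_ADMIN rse add-protocol --hostname "$host" --scheme root \
      --prefix //rucio --port 1094 \
      --impl rucio.rse.protocols.gfal.Default \
      --domain-json "$DOMAINS" "$rse"
  done
}

# Usage (inside the client container):
#   add_xrd_protocols
```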

Enable FTS

rucio-admin rse set-attribute --rse XRD1 --key fts --value https://fts:8446
rucio-admin rse set-attribute --rse XRD2 --key fts --value https://fts:8446
rucio-admin rse set-attribute --rse XRD3 --key fts --value https://fts:8446

Note that 8446 is the port exposed by the fts-server pod. You can view the ports opened by a pod by running kubectl describe pod PODNAME.
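
If only the port numbers are of interest, a JSONPath query is a more compact alternative to reading the full `kubectl describe` output. A minimal sketch: `pod_ports` is a hypothetical helper, and it prints only the container ports declared in the pod spec.

```shell
# Sketch: print the container ports declared in a pod's spec.
# pod_ports PODNAME   (hypothetical helper)
pod_ports() {
  kubectl get pod "$1" -o jsonpath='{.spec.containers[*].ports[*].containerPort}'
}

# Usage (run on the host, not inside the client container):
#   pod_ports PODNAME
```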

Fake a full mesh network

rucio-admin rse add-distance --distance 1 --ranking 1 XRD1 XRD2
rucio-admin rse add-distance --distance 1 --ranking 1 XRD1 XRD3
rucio-admin rse add-distance --distance 1 --ranking 1 XRD2 XRD1
rucio-admin rse add-distance --distance 1 --ranking 1 XRD2 XRD3
rucio-admin rse add-distance --distance 1 --ranking 1 XRD3 XRD1
rucio-admin rse add-distance --distance 1 --ranking 1 XRD3 XRD2
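
The six commands above cover every ordered pair of distinct RSEs, which is what makes the mesh "full". The same set can be produced with a nested loop. A minimal sketch: `add_full_mesh` and the `RUCIO_ADMIN` dry-run variable are hypothetical; the `add-distance` flags are copied from the commands above.

```shell
# Sketch: add a distance entry for every ordered pair of distinct RSEs.
# RUCIO_ADMIN is a hypothetical override; set RUCIO_ADMIN=echo for a dry run.
RUCIO_ADMIN="${RUCIO_ADMIN:-rucio-admin}"

add_full_mesh() {
  for src in "$@"; do
    for dst in "$@"; do
      # Skip the pair of an RSE with itself.
      if [ "$src" != "$dst" ]; then
        $RUCIO_ADMIN rse add-distance --distance 1 --ranking 1 "$src" "$dst"
      fi
    done
  done
}

# Usage (inside the client container):
#   add_full_mesh XRD1 XRD2 XRD3
```

For n RSEs this issues n × (n − 1) commands, so three RSEs give the six entries listed above.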

Set an unlimited storage quota for root

rucio-admin account set-limits root XRD1 -1
rucio-admin account set-limits root XRD2 -1
rucio-admin account set-limits root XRD3 -1

Create a default scope for testing

rucio-admin scope add --account root --scope test

Create initial transfer testing data

dd if=/dev/urandom of=file1 bs=10M count=1
dd if=/dev/urandom of=file2 bs=10M count=1
dd if=/dev/urandom of=file3 bs=10M count=1
dd if=/dev/urandom of=file4 bs=10M count=1
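
The four `dd` commands above can also be written as a loop, which makes it easy to change the number or size of the test files. A minimal sketch: `make_test_files` is a hypothetical helper, and the `10M` size suffix assumes a `dd` that accepts it (GNU coreutils does).

```shell
# Sketch: create COUNT files of random data named file1..fileN in DIR.
# make_test_files DIR [COUNT] [SIZE]   (hypothetical helper)
make_test_files() {
  dir="$1"; count="${2:-4}"; size="${3:-10M}"
  mkdir -p "$dir" || return 1
  i=1
  while [ "$i" -le "$count" ]; do
    # One block of SIZE bytes from /dev/urandom per file.
    dd if=/dev/urandom of="$dir/file$i" bs="$size" count=1 2>/dev/null || return 1
    i=$((i + 1))
  done
}

# Usage (inside the client container, creating file1..file4 in the current directory):
#   make_test_files . 4 10M
```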

Upload the files

rucio upload --rse XRD1 --scope test file1
rucio upload --rse XRD1 --scope test file2
rucio upload --rse XRD2 --scope test file3
rucio upload --rse XRD2 --scope test file4

Create a few datasets and containers

rucio add-dataset test:dataset1
rucio attach test:dataset1 test:file1 test:file2

rucio add-dataset test:dataset2
rucio attach test:dataset2 test:file3 test:file4

rucio add-container test:container
rucio attach test:container test:dataset1 test:dataset2

rucio add-dataset test:dataset3
rucio attach test:dataset3 test:file4

Create a rule

rucio add-rule test:container 1 XRD3

This command will output a rule ID, which can also be obtained via:

rucio list-rules test:container

Check rule info

  • You can check the information of the rule that has been created:
rucio rule-info <rule_id>

As the daemons run with long sleep cycles (e.g. 30 or 60 seconds) by default, it can take a while for the transfers requested by the rule to complete. You can monitor the logs of the daemon pods to see what they are doing.
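
To see what the daemons are doing, their pods first have to be found. The sketch below assumes that the daemon pod names start with `daemons-`, following the Helm release name `daemons` used during deployment; that prefix is an assumption, so check it against the output of `kubectl get pods`. Both function names are hypothetical.

```shell
# Sketch: list the daemon pods and print the last few log lines of each.
# Assumes pod names start with "daemons-" (from the Helm release name);
# verify this against `kubectl get pods` on your cluster.
daemon_pods() {
  kubectl get pods -o name | grep '^pod/daemons-'
}

tail_daemon_logs() {
  for pod in $(daemon_pods); do
    echo "== $pod"
    kubectl logs --tail=20 "$pod"
  done
}

# Usage (run on the host, not inside the client container):
#   tail_daemon_logs
```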

Some helpful commands

  • Activate kubectl completion:

Bash:

source <(kubectl completion bash)

Zsh:

source <(kubectl completion zsh)
  • View all pods (in the current namespace, or in all namespaces):
kubectl get pods
kubectl get pods --all-namespaces
  • View logfiles of a pod:
kubectl logs <NAME>
  • Tail logfiles of a pod:
kubectl logs -f <NAME>
  • Update helm repositories:
helm repo update
  • Shut down minikube:
minikube stop
  • Command references:
  1. kubectl : https://kubernetes.io/docs/reference/kubectl/cheatsheet/
  2. helm : https://helm.sh/docs/helm/
  3. minikube : https://cheatsheet.dennyzhang.com/cheatsheet-minikube-a4