Merge pull request #69 from simplyblock-io/document_storage-node

added docs for storage-node

schmidt-scaled authored Sep 11, 2024
2 parents a2cfb69 + b525439 commit bc3ede1

Showing 4 changed files with 118 additions and 9 deletions.
30 changes: 28 additions & 2 deletions charts/README.md
@@ -99,11 +99,37 @@ The following table lists the configurable parameters of the latest Simplyblock
| `logicalVolume.qos_rw_mbytes` | the value of lvol parameter qos_rw_mbytes | `0` | |
| `logicalVolume.qos_r_mbytes` | the value of lvol parameter qos_r_mbytes | `0` | |
| `logicalVolume.qos_w_mbytes` | the value of lvol parameter qos_w_mbytes | `0` | |
| `logicalVolume.compression` | set to `True` if compression needs to be enabled on lvols | `False` | |
| `logicalVolume.encryption` | set to `True` if encryption needs to be enabled on lvols | `False` | |
| `logicalVolume.distr_ndcs` | the value of distr_ndcs | `1` | |
| `logicalVolume.distr_npcs` | the value of distr_npcs | `1` | |
| `cachingnode.ifname` | the default interface to be used for binding the caching node to host interface | `eth0` | |
| `benchmarks` | the number of benchmarks to run | `0` | |
| `cachingnode.tolerations.create` | Whether to create tolerations for the caching node | `false` | |
| `cachingnode.tolerations.effect` | The effect of tolerations on the caching node | `NoSchedule` | |
| `cachingnode.tolerations.key` | The key of tolerations for the caching node | `dedicated` | |
| `cachingnode.tolerations.operator` | The operator for the caching node tolerations | `Equal` | |
| `cachingnode.tolerations.value` | The value of tolerations for the caching node | `simplyblock-cache` | |
| `cachingnode.cpuMask` | the cpu mask for the spdk app to use for caching node | `<empty>` | |
| `cachingnode.spdkMem` | the amount of hugepage memory to allocate for caching node | `<empty>` | |
| `cachingnode.spdkImage` | SPDK image uri for caching node | `<empty>` | |
| `cachingnode.multipathing` | Enable multipathing for lvol connection | `true` | |
| `storagenode.tolerations.create` | Whether to create tolerations for the storage node | `false` | |
| `storagenode.tolerations.effect` | the effect of tolerations on the storage node | `NoSchedule` | |
| `storagenode.tolerations.key` | the key of tolerations for the storage node | `dedicated` | |
| `storagenode.tolerations.operator` | the operator for the storage node tolerations | `Equal` | |
| `storagenode.tolerations.value` | the value of tolerations for the storage node | `simplyblock-cache` | |
| `storagenode.ifname` | the default interface to be used for binding the storage node to host interface | `eth0` | |
| `storagenode.cpuMask` | the cpu mask for the spdk app to use for storage node | `<empty>` | |
| `storagenode.spdkImage` | SPDK image uri for storage node | `<empty>` | |
| `storagenode.maxLvol` | the default max lvol per storage node | `10` | |
| `storagenode.maxSnap` | the default max snapshot per storage node | `10` | |
| `storagenode.maxProv` | the max provisioning size of all storage nodes | `150g` | |
| `storagenode.jmPercent` | the number in percent to use for JM from each device | `3` | |
| `storagenode.numPartitions` | the number of partitions to create per device | `0` | |
| `storagenode.numDevices` | the number of devices per storage node | `1` | |
| `storagenode.iobufSmallPoolCount` | bdev_set_options param | `<empty>` | |
| `storagenode.iobufLargePoolCount` | bdev_set_options param | `<empty>` | |
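
As an illustration, the storage-node parameters above could be set via a `values.yaml` override like the following (the values shown are examples taken from the table's defaults, not recommendations):

```
storagenode:
  ifname: eth0
  maxLvol: 10
  maxSnap: 10
  maxProv: 150g
  numPartitions: 0
  numDevices: 1
  tolerations:
    create: true
    key: dedicated
    operator: Equal
    value: simplyblock-cache
    effect: NoSchedule
```

and passed to helm with `helm install -f values.yaml ...`.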


## Troubleshooting
- Add `--wait -v=5 --debug` to the `helm install` command to get detailed error output
13 changes: 7 additions & 6 deletions docs/caching-nodes.md
@@ -10,21 +10,22 @@ Caching nodes are a special kind of node that works as a cache with a local NVMe

Make sure that the Kubernetes worker nodes to be used for caching have access to the simplyblock storage cluster. If you are using terraform to deploy the cluster, please attach the `container-instance-sg` security group to all the instances.

#### Step 1: Install nvme cli tools and nbd

To attach an NVMe device to the host machine, the CSI driver uses [nvme-cli](https://github.com/linux-nvme/nvme-cli), so let's install it and load the required kernel modules:
```
sudo yum install -y nvme-cli
sudo modprobe nvme-tcp
sudo modprobe nbd
```

#### Step 2: Set up hugepages

Before you prepare the caching nodes, please decide the amount of huge page memory that you would like to allocate for simplyblock and set the hugepages accordingly.
It is recommended to use a minimum of 1 GiB + 0.5% of the size of the local SSD that you want to use as a cache. For example, if your local SSD has a size of 1.9 TiB and you want to use it entirely as a write-through cache, you need to assign 10.5 GiB of RAM. If you only want to utilize 1 TiB (52.9% of the SSD), you assign 6 GiB of RAM, and the cache will be automatically resized to fit the available (assigned) memory.
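
The sizing rule above can be sketched as a quick calculation (a sketch only; the 1 TiB cache size is the worked example from this section):

```
# Cache capacity you intend to use, in GiB (1 TiB in the example above).
CACHE_GIB=1024

# Rule of thumb: 1 GiB base + 0.5% of the cache capacity, in MiB.
HUGEPAGE_MIB=$((1024 + CACHE_GIB * 1024 / 200))

# One huge page is 2 MiB, so halve the MiB figure to get vm.nr_hugepages.
NR_HUGEPAGES=$((HUGEPAGE_MIB / 2))

echo "set vm.nr_hugepages=$NR_HUGEPAGES"   # roughly 6 GiB of huge pages for a 1 TiB cache
```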

>[!IMPORTANT]
>One huge page contains 2 MiB of memory. A value of e.g. 4096 is therefore equal to 8 GiB of huge page memory.
```
sudo sysctl -w vm.nr_hugepages=4096
```

@@ -58,13 +59,13 @@ lspci

After the nodes are prepared, label the kubernetes nodes
```
kubectl label nodes ip-10-0-4-118.us-east-2.compute.internal ip-10-0-4-176.us-east-2.compute.internal type=simplyblock-cache
```
Now the nodes are ready to deploy caching nodes.

### StorageClass

If the user wants to create a PVC that uses the NVMe cache, a new storage class can be used with the additional volume parameter `type: cache`.
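
Such a StorageClass might look like the following sketch (the `name` and `provisioner` here are placeholders; copy the `provisioner` and any other required parameters from the default simplyblock StorageClass installed by the helm chart, and add `type: cache`):

```
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: simplyblock-cached        # placeholder name
provisioner: csi.simplyblock.io   # placeholder; copy from your existing simplyblock StorageClass
parameters:
  type: cache                     # routes volumes of this class through the NVMe cache
```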


### Usage and Implementation
66 changes: 66 additions & 0 deletions docs/storage-nodes.md
@@ -0,0 +1,66 @@
### Storage nodes volume provisioning

Apart from a disaggregated storage cluster deployment, storage-plane pods can now also be deployed onto k8s workers, where they may coexist with any compute workload (storage consumers).
Depending on the type of the storage node, it either has to come with at least one locally attached nvme drive, or ebs block storage volumes are auto-attached during the deployment (aws only).

### Preparing nodes

#### Step 0: Networking & tools

Make sure that the Kubernetes worker nodes running storage-plane pods have nvme-oF access to each other and, if needed, to external storage nodes in the simplyblock cluster. They also need connectivity to/from the simplyblock control plane. If you are using terraform to deploy the cluster, please attach the `container-instance-sg` security group to all the instances.

#### Step 1: Install nvme cli tools and nbd

To attach an NVMe device to the host machine, the CSI driver uses [nvme-cli](https://github.com/linux-nvme/nvme-cli), so let's install it and load the required kernel modules:
```
sudo yum install -y nvme-cli
sudo modprobe nvme-tcp
sudo modprobe nbd
```

#### Step 2: Set up hugepages

Simplyblock uses huge page memory. It is necessary to reserve an amount of huge page memory early on.
The simplyblock storage-plane pod allocates huge page memory from the reserved pool when the pod is added or restarted.
The amount reserved is based on parameters provided when the storage node is added, such as the maximum number of logical volumes and snapshots and the max. provisioning size of the node (see helm chart parameters).
The minimum amount to reserve is 2 GiB, but try to reserve at least 25% of the node's total RAM.
It is fine to reserve more than needed, as simplyblock will allocate only the amount required from that pool and the rest can be used by the system.
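
A sketch of that reservation rule, i.e. the larger of 2 GiB and 25% of total RAM, expressed in 2 MiB pages:

```
# Total RAM in KiB, from /proc/meminfo.
TOTAL_KIB=$(grep MemTotal /proc/meminfo | awk '{print $2}')

# 25% of total RAM, converted to MiB.
QUARTER_MIB=$((TOTAL_KIB / 4 / 1024))

# Reserve the larger of 2 GiB (2048 MiB) and 25% of RAM.
MIN_MIB=2048
if [ "$QUARTER_MIB" -gt "$MIN_MIB" ]; then
    RESERVE_MIB=$QUARTER_MIB
else
    RESERVE_MIB=$MIN_MIB
fi

# One huge page is 2 MiB.
NR_HUGEPAGES=$((RESERVE_MIB / 2))

echo "set vm.nr_hugepages=$NR_HUGEPAGES"
```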

>[!IMPORTANT]
>One huge page is 2 MiB. So e.g. a value of 4096 reserves 8 GiB of huge page memory.
```
sudo sysctl -w vm.nr_hugepages=4096
```

Confirm the hugepage changes by running
```
cat /proc/meminfo | grep -i hug
```

and restart the kubelet
```
sudo systemctl restart kubelet
```

Confirm whether the huge pages were added to the cluster.
```
kubectl describe node ip-10-0-2-184.us-east-2.compute.internal | grep hugepages-2Mi
```
This output should show 8 GiB. The worker node can then allocate 8 GiB of hugepages to pods, which is required for SPDK pods.

#### Step 3: Mount the SSD or EBS volumes to be used by the storage node
If the instance comes with a default NVMe disk, it can be used with a minimum of 2 partitions and 2 devices, where one is used for the journal manager and the other for the storage node. Alternatively, attach 2 additional EBS volumes, one for the journal manager and the other for the storage. The disks can be viewed by running:

```
sudo yum install pciutils
lspci
```


#### Step 4: Tag the kubernetes nodes

After the nodes are prepared, label the kubernetes nodes
```
kubectl label nodes ip-10-0-4-118.us-east-2.compute.internal ip-10-0-4-176.us-east-2.compute.internal type=simplyblock-storage-plane
```
Now the nodes are ready to deploy storage nodes.
18 changes: 17 additions & 1 deletion docs/support-ports.md
@@ -1,4 +1,4 @@
# Supported ports for EKS or K3s for the caching node

| Port | Protocol | Description
| -------------- | ------------- | -------------
@@ -8,3 +8,19 @@
| 2375 | TCP | Docker Engine API. Allows the management node to communicate with Docker engines running on other nodes.
| - | ICMP | Allows ICMP Echo requests. Used for ping operations to check the availability and responsiveness of management nodes.
| 5000 | TCP | Caching node. Enables communication with caching services running on the node.


# Supported ports for EKS or K3s for the storage node

| Port | Protocol | Description
| -------------- | ------------- | -------------
| 6443 | TCP | Kubernetes API server. Required for communication between the Kubernetes control plane and the nodes in the cluster.
| 22 | TCP | SSH access to the instances. Necessary for administrative access and management.
| 8080 | TCP | SPDK Proxy for the storage node. Facilitates communication between the storage nodes and the management node.
| 2375 | TCP | Docker Engine API. Allows the management node to communicate with Docker engines running on other nodes.
| - | ICMP | Allows ICMP Echo requests. Used for ping operations to check the availability and responsiveness of management nodes.
| 5000 | TCP | Storage node. Enables communication with storage-node services running on the node.
| 4420 | TCP | Storage node logical volume (lvol) connection. This port must be open (1) between all of the workers hosting storage-plane pods and (2) from all workers with pods connecting to storage to any workers hosting storage-plane pods and any external storage nodes.
| 53 | UDP | DNS resolution from worker nodes. Necessary for resolving internal DNS queries within the cluster.
| 10250-10255 | TCP | Kubernetes node communication. Used for kubelet API communication between the nodes.
| 1025-65535 | UDP | Ephemeral ports for UDP traffic. Required for certain network protocols and services.
