CKE deploys and maintains a Kubernetes cluster and an etcd cluster solely for the Kubernetes cluster. The configurations of the clusters can be defined by a YAML or JSON object with these fields:

Name | Required | Type | Description |
---|---|---|---|
`name` | true | string | The k8s cluster name. |
`nodes` | true | array | `Node` list. |
`taint_control_plane` | false | bool | If true, taint control plane nodes. |
`control_plane_tolerations` | false | array | List of tolerated taint keys for control plane. |
`service_subnet` | true | string | CIDR subnet for k8s `Service`. |
`dns_servers` | false | array | List of upstream DNS server IP addresses. |
`dns_service` | false | string | Upstream DNS service name with namespace as `namespace/service`. |
`reboot` | false | `Reboot` | See Reboot. |
`repair` | false | `Repair` | See Repair. |
`options` | false | `Options` | See Options. |

- `control_plane_tolerations` is used in sabakan integration.
- Upstream DNS servers can be specified in one of the following ways:
    - List server IP addresses in `dns_servers`.
    - Specify a Kubernetes `Service` name in `dns_service` (e.g. `"kube-system/dns"`).
      The service type must be `ClusterIP`.

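
As a rough illustration of the top-level fields, a minimal configuration might look like the sketch below. Every value (cluster name, addresses, SSH user, subnet, DNS servers) is a hypothetical placeholder; the per-node fields under `nodes` are the `Node` fields described in the next table.

```yaml
# Minimal sketch; all values are placeholders, only the field names come from this document.
name: example-cluster
nodes:
  - address: 10.0.0.11      # hypothetical control plane node
    user: cke               # hypothetical SSH user
    control_plane: true
  - address: 10.0.0.21      # hypothetical worker node
    user: cke
service_subnet: 10.68.0.0/16
# Upstream DNS given as a list of server IP addresses ...
dns_servers: ["8.8.8.8", "1.1.1.1"]
# ... or, alternatively, as a ClusterIP Service in namespace/service form:
# dns_service: "kube-system/dns"
```
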
A `Node` has these fields:

Name | Required | Type | Description |
---|---|---|---|
`address` | true | string | IP address of the node. |
`hostname` | false | string | Override the real hostname of the node in k8s. |
`user` | true | string | SSH user name. |
`control_plane` | false | bool | If true, the node will be used for k8s control plane and etcd. |
`annotations` | false | object | Node annotations. |
`labels` | false | object | Node labels. |
`taints` | false | `[]Taint` | Node taints. |

`annotations`, `labels`, and `taints` are added or updated, but not removed.
This is because other applications may edit their own annotations, labels, or taints.

Note that annotations, labels, and taints whose names contain `cke.cybozu.com/` or start with `node-role.kubernetes.io/` are reserved for CKE internal use and therefore should not be used.

A `Taint` has these fields:

Name | Required | Type | Description |
---|---|---|---|
`key` | true | string | The taint key to be applied to a node. |
`value` | false | string | The taint value corresponding to the taint key. |
`effect` | true | string | `NoSchedule`, `PreferNoSchedule` or `NoExecute`. |

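
To show how the `Node` and `Taint` fields fit together, here is a sketch of a single worker node entry. The address, hostname, user, and all `example.com/` keys are hypothetical; they are chosen only to stay clear of the reserved `cke.cybozu.com/` and `node-role.kubernetes.io/` names.

```yaml
nodes:
  - address: 10.0.0.21            # hypothetical IP address
    hostname: worker-1            # hypothetical override of the real hostname
    user: cke                     # hypothetical SSH user
    labels:
      example.com/rack: "1"       # hypothetical label
    annotations:
      example.com/note: "sample"  # hypothetical annotation
    taints:
      - key: example.com/dedicated  # hypothetical taint key
        value: batch
        effect: NoSchedule
```

Because these entries are only added or updated, deleting a label from this file later would not remove it from the Node resource.
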
`Reboot` has these fields:

Name | Required | Type | Description |
---|---|---|---|
`reboot_command` | true | array | A command to reboot. List of strings. |
`boot_check_command` | true | array | A command to check nodes booted. List of strings. |
`eviction_timeout_seconds` | false | *int | Deadline for eviction. Must be positive. Default: 600 (10 minutes). |
`command_timeout_seconds` | false | *int | Deadline for rebooting. Zero means infinity. Default: wait indefinitely. |
`command_retries` | false | *int | Number of reboot retries, not including initial attempt. Default: 0 |
`command_interval` | false | *int | Interval of time between reboot retries in seconds. Default: 0 |
`evict_retries` | false | *int | Number of eviction retries, not including initial attempt. Default: 0 |
`evict_interval` | false | *int | Interval of time between eviction retries in seconds. Default: 0 |
`max_concurrent_reboots` | false | *int | Maximum number of nodes to be rebooted concurrently. Default: 1 |
`protected_namespaces` | false | `LabelSelector` | A label selector to protect namespaces. |

`reboot_command` is the command to reboot a node. The node is passed as a command argument.
The command should return zero if the reboot is successfully started.
If `command_timeout_seconds` is specified, the reboot command should return within `command_timeout_seconds` seconds, or it is considered failed.
If the reboot command fails, CKE retries it up to `command_retries` times at `command_interval`-second intervals.

`boot_check_command` is the command to check whether a node has booted. The node and the Unix time at which the reboot command was run are passed as command arguments.
If the node has booted successfully, this command should output `true` to stdout and exit with status zero.
If the node has not booted yet, this command should output `false` to stdout and exit with status zero.
If `command_timeout_seconds` is specified, the check command should return within `command_timeout_seconds` seconds, or it is considered failed.

CKE tries to delete Pods in the `protected_namespaces` gracefully with the Kubernetes eviction API.
If the eviction API call fails, CKE retries it up to `evict_retries` times at `evict_interval`-second intervals.
If any of the Pods cannot be deleted, CKE aborts the operation.

CKE also tries to delete Pods in non-protected namespaces gracefully with the Kubernetes eviction API, but simply deletes them if eviction is denied.

If `protected_namespaces` is not given, all namespaces are protected.

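
The reboot settings described above could be combined as in the following sketch. The two script paths are hypothetical and must be supplied by the operator, the numeric values are arbitrary, and the `matchLabels` form assumes that `LabelSelector` follows the usual Kubernetes label selector layout.

```yaml
reboot:
  # Hypothetical script; the node is passed to it as a command argument.
  reboot_command: ["/usr/local/bin/reboot-node"]
  # Hypothetical script; it should print "true" or "false" and exit with status zero.
  boot_check_command: ["/usr/local/bin/check-booted"]
  eviction_timeout_seconds: 600
  command_timeout_seconds: 300   # fail a command that runs longer than 5 minutes
  command_retries: 2             # up to 2 retries after the initial attempt
  command_interval: 30           # seconds between reboot retries
  evict_retries: 3
  evict_interval: 10             # seconds between eviction retries
  max_concurrent_reboots: 1
  protected_namespaces:          # assumed Kubernetes-style label selector
    matchLabels:
      app: sample                # hypothetical namespace label
```

With this selector, only namespaces labeled `app: sample` would be protected; omitting `protected_namespaces` protects all namespaces.
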
`Repair` has these fields:

Name | Required | Type | Description |
---|---|---|---|
`repair_procedures` | true | `[]RepairProcedure` | List of repair procedures. |
`max_concurrent_repairs` | false | *int | Maximum number of machines to be repaired concurrently. Default: 1 |
`protected_namespaces` | false | `LabelSelector` | A label selector to protect namespaces. |
`evict_retries` | false | *int | Number of eviction retries, not including initial attempt. Default: 0 |
`evict_interval` | false | *int | Interval of time between eviction retries in seconds. Default: 0 |
`eviction_timeout_seconds` | false | *int | Deadline for eviction. Must be positive. Default: 600 (10 minutes). |

The repair configurations control the repair functionality.

A `RepairProcedure` has these fields:

Name | Required | Type | Description |
---|---|---|---|
`machine_types` | true | array | Type names of the target machines to be repaired by this procedure. List of strings. |
`repair_operations` | true | `[]RepairOperation` | List of repair operations. |

A `RepairOperation` has these fields:

Name | Required | Type | Description |
---|---|---|---|
`operation` | true | string | Name of repair operation. |
`repair_steps` | true | `[]RepairStep` | Sequences of repair steps. |
`health_check_command` | true | array | A command to check repaired machine's health. List of strings. |
`command_timeout_seconds` | false | *int | Deadline for health retrieval. Zero means infinity. Default: 30 |
`success_command` | false | array | A command executed when repair succeeded. List of strings. |
`success_command_timeout` | false | *int | Deadline for execution of `success_command`. Zero means infinity. Default: 30 |

A `RepairStep` has these fields:

Name | Required | Type | Description |
---|---|---|---|
`repair_command` | true | array | A command and its arguments to repair the target machine. List of strings. |
`command_timeout_seconds` | false | *int | Deadline for repairing. Zero means infinity. Default: 30 |
`command_retries` | false | *int | Number of repair retries, not including initial attempt. Default: 0 |
`command_interval` | false | *int | Interval of time between repair retries in seconds. Default: 0 |
`need_drain` | false | bool | If true, perform drain of Pods on the target machine prior to the execution of the repair command. Default: false |
`watch_seconds` | false | *int | Follow-up duration in seconds to watch whether the machine becomes healthy after the execution of the repair command. Default: 0 |

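
The nesting of `Repair`, `RepairProcedure`, `RepairOperation`, and `RepairStep` may be easier to see in a sketch. The machine type, operation name, and command paths below are hypothetical placeholders, and the numeric values are arbitrary.

```yaml
repair:
  repair_procedures:
    - machine_types: ["example-type"]       # hypothetical machine type name
      repair_operations:
        - operation: "unreachable"          # hypothetical operation name
          repair_steps:
            # Steps are listed in sequence; this one drains Pods first.
            - repair_command: ["/usr/local/bin/reset-machine"]   # hypothetical script
              command_timeout_seconds: 60
              command_retries: 1
              command_interval: 10
              need_drain: true
              watch_seconds: 300
          health_check_command: ["/usr/local/bin/check-health"]  # hypothetical script
          command_timeout_seconds: 30
  max_concurrent_repairs: 1
  evict_retries: 3
  evict_interval: 10
  eviction_timeout_seconds: 600
```
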
`Options` is a set of optional parameters for k8s components.

Name | Required | Type | Description |
---|---|---|---|
`etcd` | false | `EtcdParams` | Extra arguments for etcd. |
`etcd-rivers` | false | `ServiceParams` | Extra arguments for EtcdRivers. |
`rivers` | false | `ServiceParams` | Extra arguments for Rivers. |
`kube-api` | false | `APIServerParams` | Extra arguments for API server. |
`kube-controller-manager` | false | `ServiceParams` | Extra arguments for controller manager. |
`kube-scheduler` | false | `SchedulerParams` | Extra arguments for scheduler. |
`kube-proxy` | false | `ProxyParams` | Extra arguments for kube-proxy. |
`kubelet` | false | `KubeletParams` | Extra arguments for kubelet. |

`ServiceParams` has these fields:

Name | Required | Type | Description |
---|---|---|---|
`extra_args` | false | array | Extra command-line arguments. List of strings. |
`extra_binds` | false | array | Extra bind mounts. List of `Mount`. |
`extra_env` | false | object | Extra environment variables. |

A `Mount` has these fields:

Name | Required | Type | Description |
---|---|---|---|
`source` | true | string | Path in a host to a directory or a file. |
`destination` | true | string | Path in the container filesystem. |
`read_only` | false | bool | True to mount the directory or file as read-only. |
`propagation` | false | string | Whether mounts can be propagated to replicas. |
`selinux_label` | false | string | Relabel the SELinux label of the host directory. |

`selinux_label`:

- If SELinux is not in enforcing mode, the SELinux label will not be set.
- `"z"`: The mount content is shared among multiple containers.
- `"Z"`: The mount content is private and unshared.
- This label should not be specified for system directories.

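
As an illustration of `ServiceParams` and `Mount`, the sketch below adds an argument, a bind mount, and an environment variable to the controller manager. The host path and the environment variable are hypothetical, and `--v=2` is used only as a familiar example of an extra argument.

```yaml
options:
  kube-controller-manager:
    extra_args: ["--v=2"]            # illustrative extra command-line argument
    extra_binds:
      - source: /etc/example         # hypothetical path on the host
        destination: /etc/example    # path inside the container
        read_only: true
        selinux_label: "z"           # content shared among multiple containers
    extra_env:
      EXAMPLE_VAR: "value"           # hypothetical environment variable
```
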
`EtcdParams` has these fields:

Name | Required | Type | Description |
---|---|---|---|
`volume_name` | false | string | Docker volume name for data. Default: `etcd-cke`. |
`extra_args` | false | array | Extra command-line arguments. List of strings. |
`extra_binds` | false | array | Extra bind mounts. List of `Mount`. |
`extra_env` | false | object | Extra environment variables. |

`APIServerParams` has these fields:

Name | Required | Type | Description |
---|---|---|---|
`audit_log_enabled` | false | bool | If true, audit log will be logged to the specified path. |
`audit_log_policy` | false | string | Audit policy configuration in YAML format. |
`audit_log_path` | false | string | Audit log output path. Default is standard output. |
`extra_args` | false | array | Extra command-line arguments. List of strings. |
`extra_binds` | false | array | Extra bind mounts. List of `Mount`. |
`extra_env` | false | object | Extra environment variables. |

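
A sketch of the audit log settings is shown below. The policy is a deliberately small Kubernetes audit `Policy` that records request metadata for everything, and the log path is a hypothetical example; leaving `audit_log_path` unset would send the audit log to standard output.

```yaml
options:
  kube-api:
    audit_log_enabled: true
    audit_log_path: /var/log/audit/audit.log   # hypothetical output path
    # The policy is passed as a string, hence the YAML block scalar.
    audit_log_policy: |
      apiVersion: audit.k8s.io/v1
      kind: Policy
      rules:
        - level: Metadata
```
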

`ProxyParams` has these fields:

Name | Required | Type | Description |
---|---|---|---|
`config` | false | `*v1alpha1.KubeProxyConfiguration` | See below. |
`disable` | false | bool | If true, CKE will skip installing kube-proxy. |
`extra_args` | false | array | Extra command-line arguments. List of strings. |
`extra_binds` | false | array | Extra bind mounts. List of `Mount`. |
`extra_env` | false | object | Extra environment variables. |

`config` must be a partial `v1alpha1.KubeProxyConfiguration`.

Fields in the table below have default values:

Name | Value |
---|---|
`HostnameOverride` | Host name or address if the host name is empty |
`MetricsBindAddress` | `0.0.0.0` |
`KubeProxyConntrackConfiguration.TCPEstablishedTimeout` | `24h` |
`KubeProxyConntrackConfiguration.TCPCloseWaitTimeout` | `1h` |

`ClientConnection.Kubeconfig` is managed by CKE and is not configurable.
Changing `KubeProxyConfiguration.Mode` requires full node restarts.

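
A partial kube-proxy configuration might be written as in this sketch. It assumes that the `apiVersion` and `kind` of the upstream configuration type are given explicitly, and it uses the serialized key `mode`, which corresponds to the Go field `KubeProxyConfiguration.Mode` mentioned above.

```yaml
options:
  kube-proxy:
    config:
      apiVersion: kubeproxy.config.k8s.io/v1alpha1   # assumed to be required
      kind: KubeProxyConfiguration
      mode: ipvs    # changing this later requires full node restarts
```
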

`KubeletParams` has these fields:

Name | Required | Type | Description |
---|---|---|---|
`boot_taints` | false | `[]Taint` | Bootstrap node taints. |
`cni_conf_file` | false | `CNIConfFile` | CNI configuration file. |
`config` | false | `*v1beta1.KubeletConfiguration` | See below. |
`cri_endpoint` | false | string | Path of the runtime socket. Default: `/run/containerd/containerd.sock`. |
`extra_args` | false | array | Extra command-line arguments. List of strings. |
`extra_binds` | false | array | Extra bind mounts. List of `Mount`. |
`extra_env` | false | object | Extra environment variables. |

Taints in `boot_taints` are added to a Node in the following cases:

- when that Node is registered with Kubernetes by `kubelet`, or
- when `kubelet` on that Node is being booted while the Node resource is already registered.

Those taints can be removed manually when they are no longer needed.
Note that the second case happens when the physical node is rebooted without resource manipulation.
If you want to add taints only at Node registration, use the `RegisterWithTaints` field in `KubeletConfiguration`.

`config` must be a partial `v1beta1.KubeletConfiguration`.

Fields that are described as "This field should not be updated without a full node reboot." won't be updated on the running node for safety. Such fields include `CgroupDriver` and `QOSReserved`.

Fields in the table below have default values:

Name | Value |
---|---|
`ClusterDomain` | `cluster.local` |
`RuntimeRequestTimeout` | `15m` |
`HealthzBindAddress` | `0.0.0.0` |
`VolumePluginDir` | `/opt/volume/bin` |

`TLSCertFile`, `TLSPrivateKeyFile`, `Authentication`, `Authorization`, and `ClusterDNS` are managed by CKE and are not configurable.

`RegisterWithTaints` is managed by CKE when `boot_taints` exists in `KubeletParams`.
When taints with the same key are specified in both `boot_taints` (`KubeletParams`) and `RegisterWithTaints` (`KubeletConfiguration`), CKE respects `boot_taints`.

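
The kubelet settings discussed so far could be combined roughly as follows. The taint key is hypothetical, `apiVersion` and `kind` are assumed to be required in the partial configuration, and `containerLogMaxSize` is included only as an example of an ordinary `KubeletConfiguration` field that CKE does not manage.

```yaml
options:
  kubelet:
    boot_taints:
      - key: example.com/booting   # hypothetical key; remove the taint manually when no longer needed
        effect: NoSchedule
    cri_endpoint: /run/containerd/containerd.sock   # the documented default
    config:
      apiVersion: kubelet.config.k8s.io/v1beta1     # assumed to be required
      kind: KubeletConfiguration
      clusterDomain: cluster.local                  # the documented default
      containerLogMaxSize: 10Mi
```
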
The CNI configuration file specified by `cni_conf_file` will be put in the `/etc/cni/net.d` directory on all nodes. The file is created only when `kubelet` starts on the node; it will not be updated later on.

`CNIConfFile` has these fields:

Name | Required | Type | Description |
---|---|---|---|
`name` | true | string | File name. |
`content` | true | string | File content in JSON format. |

`name` is the filename of the CNI configuration file.
It should end with either `.conf` or `.conflist`.

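
For example, a `cni_conf_file` entry could embed a configuration for the standard CNI `bridge` plugin, as sketched below. The file name, bridge name, and subnet are hypothetical; the point is that `content` is a JSON document passed as a string.

```yaml
options:
  kubelet:
    cni_conf_file:
      name: 98-bridge.conf        # hypothetical name; ends with .conf as required
      content: |
        {
          "cniVersion": "0.3.1",
          "name": "bridge",
          "type": "bridge",
          "bridge": "cnio0",
          "isGateway": true,
          "ipMasq": true,
          "ipam": {
            "type": "host-local",
            "subnet": "10.1.0.0/16"
          }
        }
```

Since the file is written only when `kubelet` starts, editing `content` afterwards would not change the file already placed on a node.
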
`SchedulerParams` has these fields:

Name | Required | Type | Description |
---|---|---|---|
`config` | false | `*v1.KubeSchedulerConfiguration` | See below. |
`extra_args` | false | array | Extra command-line arguments. List of strings. |
`extra_binds` | false | array | Extra bind mounts. List of `Mount`. |
`extra_env` | false | object | Extra environment variables. |

`config` must be a partial `v1.KubeSchedulerConfiguration`.

Fields in `config` may have default values. Some fields are overwritten by CKE.
Please see the source code for more details.
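
A partial scheduler configuration might look like the following sketch. It assumes that `apiVersion` and `kind` are given explicitly; `parallelism` is used only as an example of an upstream `KubeSchedulerConfiguration` field, and whether CKE overwrites it is not stated here.

```yaml
options:
  kube-scheduler:
    config:
      apiVersion: kubescheduler.config.k8s.io/v1   # assumed to be required
      kind: KubeSchedulerConfiguration
      parallelism: 32    # illustrative value
```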