Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement KubernetesExecutor #311

Merged
merged 54 commits into from
Aug 23, 2023
Merged
Show file tree
Hide file tree
Changes from 51 commits
Commits
Show all changes
54 commits
Select commit Hold shift + click to select a range
9762dfa
wip: using template mounted as config map
adwk67 Jul 28, 2023
0172890
wip: build config map
adwk67 Aug 1, 2023
3ffbb2f
wip: initial working configmap
adwk67 Aug 1, 2023
94065da
wip: cleanup
adwk67 Aug 1, 2023
6f8b14e
wip: clippy allow
adwk67 Aug 1, 2023
4f91708
rebuild charts
adwk67 Aug 2, 2023
e0b9730
merged main
adwk67 Aug 2, 2023
4321e77
fixed merged conflict
adwk67 Aug 2, 2023
4623cbd
fixed merged conflict
adwk67 Aug 2, 2023
0cd4824
added provisional comment re. worker pod usage
adwk67 Aug 2, 2023
3e3a304
updated changelog
adwk67 Aug 2, 2023
0fa8dec
pin serde/serde_derive at 1.0.171
adwk67 Aug 2, 2023
d6b448b
fixed typo
adwk67 Aug 2, 2023
cfa37a2
reverted executor change
adwk67 Aug 3, 2023
14bec71
added namespace for k8s executors
adwk67 Aug 3, 2023
f89eca9
add alternative env-var for older versions
adwk67 Aug 3, 2023
27814b9
simplified config map
adwk67 Aug 4, 2023
21d71ff
wip: replace role with enum
adwk67 Aug 7, 2023
965b255
flatten celery executor config
adwk67 Aug 8, 2023
6d01971
wip: adapting tests
adwk67 Aug 8, 2023
e2105ee
wip: added templating
adwk67 Aug 8, 2023
b6e2efb
working smoke/celery test
adwk67 Aug 8, 2023
41966b8
working smoke/kubernetes test
adwk67 Aug 8, 2023
6ed22bf
linting
adwk67 Aug 8, 2023
1f4800f
fix: added pre-2.5 name of env-var
adwk67 Aug 9, 2023
ecfd1cb
added env overrides for config map test
adwk67 Aug 9, 2023
c34c244
added gitysnc elements to pod template and adapted tests
adwk67 Aug 10, 2023
a892133
adapted ldap tests
adwk67 Aug 10, 2023
a8a0e33
adapted cluster-operation test
adwk67 Aug 10, 2023
fdf4114
adapted orphaned resources test
adwk67 Aug 10, 2023
67b9281
adapted resources test
adwk67 Aug 10, 2023
0f56477
adapted logging test
adwk67 Aug 10, 2023
2181d7d
make redis conditional on executor for smoke tests
adwk67 Aug 10, 2023
a890f58
added logging components to config map
adwk67 Aug 10, 2023
c0b95a4
moved template code into a function, added some comments
adwk67 Aug 11, 2023
b14f5f0
fix merge conflicts
adwk67 Aug 11, 2023
a6e255c
linting
adwk67 Aug 11, 2023
a5d87ba
updated docs and examples
adwk67 Aug 14, 2023
b628846
move constants
adwk67 Aug 14, 2023
cc7f695
specified the resources for workers
adwk67 Aug 14, 2023
7ed321a
make executor non-optional, remove enum-discriminant usage
adwk67 Aug 15, 2023
70cd55a
regenerate charts
adwk67 Aug 15, 2023
ca331ab
added note on image name and shutdown hook
adwk67 Aug 15, 2023
0d2613e
added shell capture for statefulsets
adwk67 Aug 17, 2023
e08a07c
wip:crd refactoring
adwk67 Aug 17, 2023
23983db
fixed tests, rebuilt charts
adwk67 Aug 18, 2023
d3ab0e5
rebuilt charts, adapted integration tests
adwk67 Aug 18, 2023
d612a59
further integration test fixes
adwk67 Aug 18, 2023
aa88c1d
updated docs
adwk67 Aug 18, 2023
b35966a
corrected docs
adwk67 Aug 18, 2023
eaec70b
Update rust/operator-binary/src/airflow_controller.rs
adwk67 Aug 23, 2023
86d9493
Update docs/modules/airflow/pages/index.adoc
adwk67 Aug 23, 2023
8412573
Update docs/modules/airflow/pages/getting_started/first_steps.adoc
adwk67 Aug 23, 2023
c1f43d9
input from review
adwk67 Aug 23, 2023
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@

### Added

- [BREAKING] Implement KubernetesExecutor ([#311]).
- Default stackableVersion to operator version ([#312]).

### Changed
Expand All @@ -18,6 +19,7 @@

[#303]: https://github.com/stackabletech/airflow-operator/pull/303
[#308]: https://github.com/stackabletech/airflow-operator/pull/308
[#311]: https://github.com/stackabletech/airflow-operator/pull/311
[#312]: https://github.com/stackabletech/airflow-operator/pull/312
[#316]: https://github.com/stackabletech/airflow-operator/pull/316

Expand Down
16,095 changes: 9,512 additions & 6,583 deletions deploy/helm/airflow-operator/crds/crds.yaml

Large diffs are not rendered by default.

12 changes: 12 additions & 0 deletions deploy/helm/airflow-operator/templates/roles.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -176,6 +176,18 @@ rules:
- serviceaccounts
verbs:
- get
- apiGroups:
- ""
resources:
- pods
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
- apiGroups:
- events.k8s.io
resources:
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,6 @@ spec:
image:
productVersion: 2.6.1
clusterConfig:
executor: CeleryExecutor
loadExamples: false
exposeConfig: false
credentialsSecret: simple-airflow-credentials
Expand All @@ -25,7 +24,7 @@ spec:
envOverrides:
AIRFLOW__CORE__DAGS_FOLDER: "/dags" # <8>
replicas: 1
workers:
celeryExecutors:
roleGroups:
default:
envOverrides:
Expand Down
1 change: 0 additions & 1 deletion docs/modules/airflow/examples/example-airflow-gitsync.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,6 @@ spec:
image:
productVersion: "2.6.1"
clusterConfig:
executor: CeleryExecutor
loadExamples: false
exposeConfig: false
credentialsSecret: test-airflow-credentials # <1>
Expand Down
3 changes: 1 addition & 2 deletions docs/modules/airflow/examples/example-airflow-incluster.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,6 @@ spec:
image:
productVersion: 2.6.1
clusterConfig:
executor: CeleryExecutor
loadExamples: false
exposeConfig: false
credentialsSecret: simple-airflow-credentials
Expand All @@ -17,7 +16,7 @@ spec:
envOverrides:
AIRFLOW_CONN_KUBERNETES_IN_CLUSTER: "kubernetes://?__extra__=%7B%22extra__kubernetes__in_cluster%22%3A+true%2C+%22extra__kubernetes__kube_config%22%3A+%22%22%2C+%22extra__kubernetes__kube_config_path%22%3A+%22%22%2C+%22extra__kubernetes__namespace%22%3A+%22%22%7D"
replicas: 1
workers:
celeryExecutors:
roleGroups:
default:
envOverrides:
Expand Down
10 changes: 8 additions & 2 deletions docs/modules/airflow/examples/getting_started/code/airflow.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -7,18 +7,24 @@ spec:
image:
productVersion: 2.6.1
clusterConfig:
executor: CeleryExecutor
loadExamples: true
exposeConfig: false
credentialsSecret: simple-airflow-credentials
webservers:
roleGroups:
default:
replicas: 1
workers:
celeryExecutors:
roleGroups:
default:
replicas: 2
config:
resources:
cpu:
min: 400m
max: 800m
memory:
limit: 2Gi
schedulers:
roleGroups:
default:
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -7,18 +7,24 @@ spec:
image:
productVersion: 2.6.1
clusterConfig:
executor: CeleryExecutor
loadExamples: true
exposeConfig: false
credentialsSecret: simple-airflow-credentials
webservers:
roleGroups:
default:
replicas: 1
workers:
celeryExecutors:
roleGroups:
default:
replicas: 2
config:
resources:
cpu:
min: 400m
max: 800m
memory:
limit: 2Gi
schedulers:
roleGroups:
default:
Expand Down
5 changes: 1 addition & 4 deletions docs/modules/airflow/pages/getting_started/first_steps.adoc
Original file line number Diff line number Diff line change
Expand Up @@ -56,14 +56,11 @@ Where:

- `metadata.name` contains the name of the Airflow cluster
- the label of the Docker image provided by Stackable must be set in `spec.version`
- `spec.executor`: this setting determines how the cluster will run (for more information see https://airflow.apache.org/docs/apache-airflow/stable/executor/index.html#executor-types): the `CeleryExecutor`
is the recommended setting although `SequentialExecutor` (all jobs run in one process in series) and `LocalExecutor`
(whereby all jobs are run on one node, using whatever parallelism is possible) are also supported
- `spec.executor`: this setting determines how the cluster will run (for more information see https://airflow.apache.org/docs/apache-airflow/stable/executor/index.html#executor-types). This field takes one of two values, `celery` or `kubernetes`: `celery` will deploy workers as defined by the role and specify `CeleryExecutor` internally. `kubernetes` will set `KubernetesExecutor` internally, and the scheduler will take responsibility for creating (and terminating) pods for the DAG individual tasks.
adwk67 marked this conversation as resolved.
Show resolved Hide resolved
- the `spec.loadExamples` key is optional and defaults to `false`. It is set to `true` here as the example DAGs will be used when verifying the installation.
- the `spec.exposeConfig` key is optional and defaults to `false`. It is set to `true` only as an aid to verify the configuration and should never be used as such in anything other than test or demo clusters.
- the previously created secret must be referenced in `spec.credentialsSecret`


NOTE: Please note that the version you need to specify for `spec.version` is not only the version of Apache Airflow which you want to roll out, but has to be amended with a Stackable version as shown. This Stackable version is the version of the underlying container image which is used to execute the processes. For a list of available versions please check our
https://repo.stackable.tech/#browse/browse:docker:v2%2Fstackable%airflow%2Ftags[image registry].
It should generally be safe to simply use the latest image version that is available.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@ On this page you will install the Stackable Airflow Operator, the software that

== Required external components: Postgresql and Redis

Postgresql and Redis are required by Airflow: Postgresql to store metadata about DAG runs, and Redis to schedule and/or queue DAG jobs. They are components that may well already be available for customers, in which case we treat them here as pre-requisites for an airflow cluster and hence as part of the installation process. These components will be installed using Helm. Note that specific versions are declared:
Postgresql is required by Airflow to store metadata about DAG runs, and Redis is required by the Celery executor to schedule and/or queue DAG jobs. They are components that may well already be available for customers, in which case we treat them here as pre-requisites for an airflow cluster and hence as part of the installation process. These components will be installed using Helm. Note that specific versions are declared:

[source,bash]
----
Expand Down
41 changes: 40 additions & 1 deletion docs/modules/airflow/pages/index.adoc
Original file line number Diff line number Diff line change
Expand Up @@ -15,10 +15,47 @@ The Operator manages three https://kubernetes.io/docs/concepts/extend-kubernetes

=== Custom resources

The AirflowCluster is the main resource for the configuration of the Airflow instance. The resource defines three xref:concepts:roles-and-role-groups.adoc[roles]: `webserver`, `worker` and `scheduler`. The various configuration options are explained in the xref:usage-guide/index.adoc[]. It helps you tune your cluster to your needs by configuring xref:usage-guide/storage-resources.adoc[resource usage], xref:usage-guide/security.adoc[security], xref:usage-guide/logging.adoc[logging] and more.
The AirflowCluster is the main resource for the configuration of the Airflow instance. The resource defines three xref:concepts:roles-and-role-groups.adoc[roles]: `webserver`, `worker` and `scheduler` (the `worker` role is embedded within `spec.celeryExecutors`: this is described in the next section). The various configuration options are explained in the xref:usage-guide/index.adoc[]. It helps you tune your cluster to your needs by configuring xref:usage-guide/storage-resources.adoc[resource usage], xref:usage-guide/security.adoc[security], xref:usage-guide/logging.adoc[logging] and more.

When an AirflowCluster is first deployed, an AirflowDB resource is created. The AirflowDB resource is a wrapper resource for the metadata SQL database that is used by Airflow to store information on users and permissions as well as workflows, task instances and their execution. The resource contains some configuration but also keeps track of whether the database has been initialized or not. It is not deleted automatically if a AirflowCluster is deleted, and so can be reused.

=== Executors

The `worker` role is deployed when `spec.celeryExecutors` is specified (the alternative is `spec.kubernetesExecutors`, whereby pods are created dynamically as needed without jobs being routed through a redis queue to the workers). This means that for `celeryExecutors` there exists an implicit single role which does not appear in resource definition. This is illustrated below:
adwk67 marked this conversation as resolved.
Show resolved Hide resolved

==== `celeryExecutors`

[source,yaml]
----
spec:
...
celeryExecutors:
roleGroups:
default:
envOverrides:
...
configOverrides:
...
replicas: 2
config:
logging:
...
----

==== `kubernetesExecutors`

[source,yaml]
----
spec:
...
kubernetesExecutors:
config:
logging:
...
resources:
...
----

=== Kubernetes resources

Based on the custom resources you define, the Operator creates ConfigMaps, StatefulSets and Services.
Expand All @@ -35,6 +72,8 @@ ConfigMaps are created, one per RoleGroup and also one for the AirflowDB. Both C

Airflow requires an SQL database in which to store its metadata as well as Redis for job execution. The xref:required-external-components.adoc[required external components] page lists all supported databases and Redis versions to use in production. You need to provide these components for production use, but the xref:getting_started/index.adoc[] guides you through installing an example database and Redis instance with an Airflow instance that you can use to get started.

NOTE: Redis is only needed if the executors have been set to `spec.celeryExecutors` as the jobs will be queued via Redis before being assigned to a `worker` pod. When using `spec.kubernetesExecutors` the scheduler will take direct responsibility for this.

== Using custom workflows/DAGs

https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/dags.html[Direct acyclic graphs (DAGs) of tasks] are the core entities you will use in Airflow. Have a look at the page on xref:usage-guide/mounting-dags.adoc[] to learn about the different ways of loading your custom DAGs into Airflow.
Expand Down
3 changes: 1 addition & 2 deletions examples/simple-airflow-cluster-dags-cmap.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -84,7 +84,6 @@ spec:
productVersion: 2.6.1
stackableVersion: 0.0.0-dev
clusterConfig:
executor: CeleryExecutor
loadExamples: false
exposeConfig: false
credentialsSecret: simple-airflow-credentials
Expand All @@ -102,7 +101,7 @@ spec:
envOverrides:
AIRFLOW__CORE__DAGS_FOLDER: "/dags"
replicas: 1
workers:
celeryExecutors:
roleGroups:
default:
envOverrides:
Expand Down
3 changes: 1 addition & 2 deletions examples/simple-airflow-cluster-ldap-insecure-tls.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -150,7 +150,6 @@ spec:
productVersion: 2.6.1
stackableVersion: 0.0.0-dev
clusterConfig:
executor: CeleryExecutor
loadExamples: true
exposeConfig: true
credentialsSecret: airflow-with-ldap-server-veri-tls-credentials
Expand All @@ -161,7 +160,7 @@ spec:
roleGroups:
default:
replicas: 1
workers:
celeryExecutors:
roleGroups:
default:
replicas: 1
Expand Down
3 changes: 1 addition & 2 deletions examples/simple-airflow-cluster-ldap.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -148,7 +148,6 @@ spec:
productVersion: 2.6.1
stackableVersion: 0.0.0-dev
clusterConfig:
executor: CeleryExecutor
loadExamples: true
exposeConfig: true
credentialsSecret: airflow-with-ldap-server-veri-tls-credentials
Expand All @@ -159,7 +158,7 @@ spec:
roleGroups:
default:
replicas: 1
workers:
celeryExecutors:
roleGroups:
default:
replicas: 1
Expand Down
3 changes: 1 addition & 2 deletions examples/simple-airflow-cluster.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -24,15 +24,14 @@ spec:
productVersion: 2.6.1
stackableVersion: 0.0.0-dev
clusterConfig:
executor: CeleryExecutor
loadExamples: true
exposeConfig: false
credentialsSecret: simple-airflow-credentials
webservers:
roleGroups:
default:
replicas: 1
workers:
celeryExecutors:
roleGroups:
default:
replicas: 2
Expand Down
25 changes: 14 additions & 11 deletions rust/crd/src/affinity.rs
Original file line number Diff line number Diff line change
Expand Up @@ -51,36 +51,38 @@ mod tests {
use crate::{AirflowCluster, AirflowRole};

#[rstest]
#[case(AirflowRole::Worker)]
// #[case(AirflowRole::Worker)]
#[case(AirflowRole::Scheduler)]
#[case(AirflowRole::Webserver)]
fn test_affinity_defaults(#[case] role: AirflowRole) {
let input = r#"
let cluster = "
apiVersion: airflow.stackable.tech/v1alpha1
kind: AirflowCluster
metadata:
name: airflow
spec:
image:
productVersion: 2.6.1
executor: CeleryExecutor
loadExamples: true
exposeConfig: false
credentialsSecret: simple-airflow-credentials
webservers:
roleGroups:
default:
replicas: 1
workers:
celeryExecutors:
roleGroups:
default:
replicas: 2
schedulers:
roleGroups:
default:
replicas: 1
"#;
let airflow: AirflowCluster = serde_yaml::from_str(input).expect("illegal test input");
";

let deserializer = serde_yaml::Deserializer::from_str(cluster);
let airflow: AirflowCluster =
serde_yaml::with::singleton_map_recursive::deserialize(deserializer).unwrap();

let rolegroup_ref = RoleGroupRef {
cluster: ObjectRef::from_obj(&airflow),
Expand Down Expand Up @@ -150,23 +152,22 @@ mod tests {

#[test]
fn test_affinity_legacy_node_selector() {
let input = r#"
let cluster = "
apiVersion: airflow.stackable.tech/v1alpha1
kind: AirflowCluster
metadata:
name: airflow
spec:
image:
productVersion: 2.6.1
executor: CeleryExecutor
loadExamples: true
exposeConfig: false
credentialsSecret: simple-airflow-credentials
webservers:
roleGroups:
default:
replicas: 1
workers:
celeryExecutors:
roleGroups:
default:
replicas: 2
Expand All @@ -183,9 +184,11 @@ mod tests {
values:
- antarctica-east1
- antarctica-west1
"#;
";

let airflow: AirflowCluster = serde_yaml::from_str(input).expect("illegal test input");
let deserializer = serde_yaml::Deserializer::from_str(cluster);
let airflow: AirflowCluster =
serde_yaml::with::singleton_map_recursive::deserialize(deserializer).unwrap();

let expected: StackableAffinity = StackableAffinity {
node_affinity: Some(NodeAffinity {
Expand Down
Loading