From c0b78b452b3883758f1d39e7d2c9862a1ae27db7 Mon Sep 17 00:00:00 2001
From: dervoeti
Date: Thu, 21 Sep 2023 10:01:53 +0200
Subject: [PATCH 1/2] Removed check for InitDb logs (#329)

---
 .../kuttl/logging/airflow-vector-aggregator-values.yaml.j2 | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/tests/templates/kuttl/logging/airflow-vector-aggregator-values.yaml.j2 b/tests/templates/kuttl/logging/airflow-vector-aggregator-values.yaml.j2
index d71617d7..7c9b693b 100644
--- a/tests/templates/kuttl/logging/airflow-vector-aggregator-values.yaml.j2
+++ b/tests/templates/kuttl/logging/airflow-vector-aggregator-values.yaml.j2
@@ -92,10 +92,6 @@ customConfig:
         condition: >-
           .pod == "airflow-scheduler-custom-log-config-0" &&
           .container == "vector"
-      filteredAutomaticLogConfigInitDb:
-        type: filter
-        inputs: [vector]
-        condition: .container == "airflow-init-db"
       filteredInvalidEvents:
         type: filter
         inputs: [vector]

From 7a3fb47c58b6da69e55de1792185ad26e2341bca Mon Sep 17 00:00:00 2001
From: Techassi
Date: Thu, 21 Sep 2023 11:11:43 +0200
Subject: [PATCH 2/2] docs: Update references (#327)

* Initial commit

* Update xrefs
---
 .../pages/getting_started/installation.adoc | 14 +++++----
 docs/modules/airflow/pages/index.adoc       | 29 ++++++++++++++-----
 2 files changed, 31 insertions(+), 12 deletions(-)

diff --git a/docs/modules/airflow/pages/getting_started/installation.adoc b/docs/modules/airflow/pages/getting_started/installation.adoc
index 2c5b2634..00eb91ba 100644
--- a/docs/modules/airflow/pages/getting_started/installation.adoc
+++ b/docs/modules/airflow/pages/getting_started/installation.adoc
@@ -25,14 +25,14 @@ WARNING: Do not use this setup in production! Supported databases and versions a
 
 There are two ways to run Stackable Operators:
 
-1. Using xref:stackablectl::index.adoc[]
+1. Using xref:management:stackablectl:index.adoc[]
 2. Using Helm
 
 === stackablectl
 
 stackablectl is the command line tool to interact with Stackable operators and our recommended way to install Operators.
-Follow the xref:stackablectl::installation.adoc[installation steps] for your platform.
+Follow the xref:management:stackablectl:installation.adoc[installation steps] for your platform.
 
 After you have installed stackablectl, run the following command to install all Operators necessary for Airflow:
 
 The tool will show
 
 ----
 [INFO ] Installing airflow operator
 ----
 
-TIP: Consult the xref:stackablectl::quickstart.adoc[] to learn more about how to use stackablectl. For example, you can use the `-k` flag to create a Kubernetes cluster with link:https://kind.sigs.k8s.io/[kind].
+TIP: Consult the xref:management:stackablectl:quickstart.adoc[] to learn more about how to use stackablectl. For
+example, you can use the `--cluster kind` flag to create a Kubernetes cluster with link:https://kind.sigs.k8s.io/[kind].
 
 === Helm
 
 You can also use Helm to install the Operators. Add the Stackable Helm repository:
 
 Then install the Stackable Operators:
 
 ----
 include::example$getting_started/code/getting_started.sh[tag=helm-install-operators]
 ----
 
-Helm will deploy the Operators in a Kubernetes Deployment and apply the CRDs for the Airflow cluster (as well as the CRDs for the required operators). You are now ready to deploy Apache Airflow in Kubernetes.
+Helm will deploy the Operators in a Kubernetes Deployment and apply the CRDs for the Airflow cluster (as well as the
+CRDs for the required operators). You are now ready to deploy Apache Airflow in Kubernetes.
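The `getting_started.sh` include tag above pulls in a script that is not part of this diff. As a rough sketch only,
what such a Helm-based install usually expands to looks like the following; the repository URL and the exact list of
required operators are assumptions, not taken from this patch:

[source,console]
----
# Sketch under assumptions: repository URL and operator list are not
# confirmed by this patch.
$ helm repo add stackable-stable https://repo.stackable.tech/repository/helm-stable/
$ helm install --wait commons-operator stackable-stable/commons-operator
$ helm install --wait secret-operator stackable-stable/secret-operator
$ helm install --wait airflow-operator stackable-stable/airflow-operator
----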
 == What's next
 
-xref:getting_started/first_steps.adoc[Set up an Airflow cluster] and its dependencies and xref:getting_started/first_steps.adoc#_verify_that_it_works[verify that it works] by inspecting and running an example DAG.
\ No newline at end of file
+xref:getting_started/first_steps.adoc[Set up an Airflow cluster] and its dependencies and
+xref:getting_started/first_steps.adoc#_verify_that_it_works[verify that it works] by inspecting and running an example
+DAG.
\ No newline at end of file

diff --git a/docs/modules/airflow/pages/index.adoc b/docs/modules/airflow/pages/index.adoc
index 24febc99..7f9f1e6b 100644
--- a/docs/modules/airflow/pages/index.adoc
+++ b/docs/modules/airflow/pages/index.adoc
@@ -2,16 +2,23 @@
 :description: The Stackable Operator for Apache Airflow is a Kubernetes operator that can manage Apache Airflow clusters. Learn about its features, resources, dependencies and demos, and see the list of supported Airflow versions.
 :keywords: Stackable Operator, Apache Airflow, Kubernetes, k8s, operator, engineer, big data, metadata, job pipeline, scheduler, workflow, ETL
 
+:k8s-crs: https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/
+
 The Stackable Operator for Apache Airflow manages https://airflow.apache.org/[Apache Airflow] instances on Kubernetes.
 
-Apache Airflow is an open-source application for creating, scheduling, and monitoring workflows. Workflows are defined as code, with tasks that can be run on a variety of platforms, including Hadoop, Spark, and Kubernetes itself. Airflow is a popular choice to orchestrate ETL workflows and data pipelines.
+Apache Airflow is an open-source application for creating, scheduling, and monitoring workflows. Workflows are defined
+as code, with tasks that can be run on a variety of platforms, including Hadoop, Spark, and Kubernetes itself. Airflow
+is a popular choice to orchestrate ETL workflows and data pipelines.
 
 == Getting started
 
-Get started using Airflow with the Stackable Operator by following the xref:getting_started/index.adoc[] guide. It guides you through installing the Operator alongside a PostgreSQL database and Redis instance, connecting to your Airflow instance and running your first workflow.
+Get started using Airflow with the Stackable Operator by following the xref:getting_started/index.adoc[] guide. It
+guides you through installing the Operator alongside a PostgreSQL database and Redis instance, connecting to your
+Airflow instance, and running your first workflow.
 
 == Resources
 
-The Operator manages three https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/[custom resources]: The _AirflowCluster_ and _AirflowDB_. It creates a number of different Kubernetes resources based on the custom resources.
+The Operator manages two {k8s-crs}[custom resources]: the _AirflowCluster_ and the _AirflowDB_. It creates a number of
+different Kubernetes resources based on the custom resources.
 
 === Custom resources

Based on the custom resources you define, the Operator creates ConfigMaps, State
 image::airflow_overview.drawio.svg[A diagram depicting the Kubernetes resources created by the operator]
 
-The diagram above depicts all the Kubernetes resources created by the operator, and how they relate to each other. The Job created for the AirflowDB is not shown.
+The diagram above depicts all the Kubernetes resources created by the operator, and how they relate to each other. The
+Job created for the AirflowDB is not shown.
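For orientation, a minimal AirflowCluster manifest might look like the sketch below. This is illustrative only: the
`apiVersion`, role names and field layout are assumptions based on the webserver, scheduler and executor roles this
page describes, not taken from this patch, and required settings (for example credentials) are omitted.

[source,console]
----
# Sketch under assumptions: field names follow the role/role group pattern
# described on this page; consult the getting started guide for a complete example.
$ kubectl apply -f - <<EOF
apiVersion: airflow.stackable.tech/v1alpha1
kind: AirflowCluster
metadata:
  name: airflow
spec:
  image:
    productVersion: "2.6.1"  # an Airflow version supported by the operator
  webservers:
    roleGroups:
      default:
        replicas: 1
  schedulers:
    roleGroups:
      default:
        replicas: 1
  celeryExecutors:
    roleGroups:
      default:
        replicas: 2
EOF
----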
-For every xref:concepts:roles-and-role-groups.adoc#_role_groups[role group] you define, the Operator creates a StatefulSet with the amount of replicas defined in the RoleGroup. Every Pod in the StatefulSet has two containers: the main container running Airflow and a sidecar container gathering metrics for xref:operators:monitoring.adoc[]. The Operator creates a Service per role group as well as a single service for the whole `webserver` role called `-webserver`.
+For every xref:concepts:roles-and-role-groups.adoc#_role_groups[role group] you define, the Operator creates a
+StatefulSet with the number of replicas defined in the RoleGroup. Every Pod in the StatefulSet has two containers: the
+main container running Airflow and a sidecar container gathering metrics for xref:operators:monitoring.adoc[]. The
+Operator creates a Service per role group as well as a single service for the whole `webserver` role called
+`-webserver`.
 
 ConfigMaps are created, one per RoleGroup and also one for the AirflowDB. Both ConfigMaps contain two files: `log_config.py` and `webserver_config.py` which contain logging and general Airflow configuration, respectively.

NOTE: Redis is only needed if the executors have been set to `spec.celeryExecuto
 
 == Using custom workflows/DAGs
 
-https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/dags.html[Direct acyclic graphs (DAGs) of tasks] are the core entities you will use in Airflow. Have a look at the page on xref:usage-guide/mounting-dags.adoc[] to learn about the different ways of loading your custom DAGs into Airflow.
+https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/dags.html[Directed acyclic graphs (DAGs) of tasks] are
+the core entities you will use in Airflow. Have a look at the page on xref:usage-guide/mounting-dags.adoc[] to learn
+about the different ways of loading your custom DAGs into Airflow.
 
 == Demo
 
-You can install the xref:stackablectl::demos/airflow-scheduled-job.adoc[] demo and explore an Airflow installation, as well as how it interacts with xref:spark-k8s:index.adoc[Apache Spark].
+You can install the xref:demos:airflow-scheduled-job.adoc[] demo and explore an Airflow installation, as
+well as how it interacts with xref:spark-k8s:index.adoc[Apache Spark].
 
 == Supported Versions