This repository has been archived by the owner on Jan 29, 2024. It is now read-only.

HowTo topics for Kafka tiered storage #2162

Merged
merged 22 commits into from
Nov 6, 2023
Changes from 3 commits
ea3b43d
HowTo topics for Kafka tiered storage
harshini-rangaswamy Sep 28, 2023
8da73cb
Fixed vale errors
harshini-rangaswamy Sep 28, 2023
ea53106
Fixed vale errors
harshini-rangaswamy Sep 28, 2023
c756267
Address review feedback from Roope
harshini-rangaswamy Oct 2, 2023
ae6bc71
Fixed broken link
harshini-rangaswamy Oct 2, 2023
5b03ddb
Added information to enable tiered storage per topic via CLI
harshini-rangaswamy Oct 4, 2023
275484c
Added a note
harshini-rangaswamy Oct 19, 2023
53e274b
Updated content based on UI changes
harshini-rangaswamy Oct 20, 2023
54e4de9
modify topic title
harshini-rangaswamy Oct 20, 2023
5c46b5c
Added warning about remote data when service is powered off
harshini-rangaswamy Oct 20, 2023
8539c2d
Update docs/products/kafka/howto/kafka-tiered-storage-get-started.rst
harshini-rangaswamy Nov 3, 2023
c04a982
updated intro
harshini-rangaswamy Nov 3, 2023
d13a248
Updated cross-link
harshini-rangaswamy Nov 3, 2023
ebd2e7a
Merge branch 'main' into harshini-kafka-tiered-storage-console
harshini-rangaswamy Nov 3, 2023
b6ef123
Update docs/products/kafka/howto/tiered-storage-overview.rst
harshini-rangaswamy Nov 6, 2023
34a6216
Update docs/products/kafka/howto/tiered-storage-overview.rst
harshini-rangaswamy Nov 6, 2023
4fba9fd
Update docs/products/kafka/howto/tiered-storage-overview.rst
harshini-rangaswamy Nov 6, 2023
2bf4d35
Update docs/products/kafka/howto/tiered-storage-overview.rst
harshini-rangaswamy Nov 6, 2023
72b956d
Updated feedback changes
harshini-rangaswamy Nov 6, 2023
c270686
Fixed broken link
harshini-rangaswamy Nov 6, 2023
83b0123
Fixed broken link
harshini-rangaswamy Nov 6, 2023
69f2489
Address feedback
harshini-rangaswamy Nov 6, 2023
2 changes: 2 additions & 0 deletions .github/vale/dicts/aiven.dic
@@ -79,6 +79,8 @@ failover
fileset
filesets
Flink
Forecast
Forecasted
FusionAuth
Gantt
geocoder
10 changes: 10 additions & 0 deletions _toc.yml
@@ -391,6 +391,16 @@ entries:
- file: docs/products/kafka/howto/get-topic-partition-details
- file: docs/products/kafka/howto/schema-registry
- file: docs/products/kafka/howto/change-retention-period
- file: docs/products/kafka/howto/kafka-tiered-storage-get-started
title: Tiered storage
entries:
- file: docs/products/kafka/howto/enable-kafka-tiered-storage
title: Enable tiered storage
- file: docs/products/kafka/howto/configure-topic-tiered-storage
title: Configure tiered storage for topic
- file: docs/products/kafka/howto/tiered-storage-overview
title: Tiered storage overview


- file: docs/products/kafka/reference
title: Reference
57 changes: 57 additions & 0 deletions docs/products/kafka/howto/configure-topic-tiered-storage.rst
@@ -0,0 +1,57 @@
Configuring tiered storage for topics
===========================================================================

Aiven for Apache Kafka® offers flexibility in configuring tiered storage and setting retention policies. This guide will walk you through the process of configuring tiered storage for individual topics, configuring local retention policies, and configuring retention policies at the service level.

.. important::

Aiven for Apache Kafka® tiered storage is an early availability feature, which means it has some restrictions on the functionality and service level agreement. It is intended for non-production environments, but you can test it with production-like workloads to assess the performance. To enable this feature, navigate to the :doc:`Feature preview </docs/platform/howto/feature-preview>` page within your user profile.

Prerequisite
------------
* Tiered storage enabled for the Aiven for Apache Kafka service.

Configure tiered storage for topics
------------------------------------

1. Access the `Aiven Console <https://console.aiven.io/>`_, select your project, and choose your Aiven for Apache Kafka service.
2. From the left sidebar, select **Topics**.
3. Either add a new topic with tiered storage configured, or modify an existing topic to use tiered storage.

For a new topic
~~~~~~~~~~~~~~~

1. From the **Topics** page, select **Add topic**.
2. Enable advanced configurations by setting the **Do you want to enable advanced configuration?** option to **Yes**.
3. In the **Topic advanced configuration** drop-down, choose ``remote_storage_enable``. This action will reveal the **Remote storage enabled** drop-down.
4. Select **True** to activate tiered storage for the topic.

.. note::
If you leave the value as **Default**, it implies that tiered storage is enabled for this topic.
Member

I thought the default was not to use it, interesting.

Contributor Author

I just tried setting Remote storage enable to Default, and it does nothing. But this is going to be an obvious question from the user. Why have Default in the drop-down when it does nothing? Is there a way to remove it?
For now, I will remove this note.

Member

Yeah i think so. Default in the topic configs means "not set", meaning it's as if it was never touched by user. (Maybe it should be "Not set" instead of "Default" 🤔) In this case, as you need to explicitly enable remote storage per topic, the default is that it's not enabled. Does that make sense?


5. Optionally, set the values for ``local_retention_ms`` and ``local_retention_bytes`` using the respective options from the drop-down list.

.. important::
If the values for ``local_retention_bytes`` and ``local_retention_ms`` are not set, they default to -2 or take the configuration from the service level.

When set to -2, the retention in local storage matches the total retention. In this scenario, the data segments sent to remote storage are also retained locally. Remote storage contains older data segments than local storage only when the total retention is greater than the local retention.

6. Select **Add topic** to save your changes and add the topic with tiered storage.
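
The effect of the ``-2`` value described in the note above can be sketched in a few lines of shell. This is only an illustration of the rule, not Aiven or Apache Kafka code: the function name ``effective_local_retention`` and the sample values are hypothetical.

```shell
# Resolve the effective local retention for a topic.
# A local retention value of -2 means "match the total retention".
effective_local_retention() {
  local_value="$1"   # local_retention_ms or local_retention_bytes
  total_value="$2"   # total retention for the same dimension
  if [ "$local_value" -eq -2 ]; then
    echo "$total_value"
  else
    echo "$local_value"
  fi
}

# Hypothetical topic: 7 days total retention, local retention not set (-2).
effective_local_retention -2 604800000

# Hypothetical topic: 7 days total retention, 1 hour kept on local disk.
effective_local_retention 3600000 604800000
```

In the first call the local retention equals the total retention, so remote storage holds no data older than what is on local disk. In the second call, segments older than one hour are available only from remote storage.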

For an existing topic
~~~~~~~~~~~~~~~~~~~~~

1. From the **Topics** page, select the topic for which you wish to enable tiered storage.
2. Use the ellipsis or open the topic and choose **Modify**.
3. In the **Modify** page, choose ``remote_storage_enable`` from the drop-down list, followed by selecting **True** from the **Remote storage enable** drop-down.
4. Optionally, set the values for ``local_retention_ms`` and ``local_retention_bytes`` using the respective options from the drop-down list.

.. important::
If the values for ``local_retention_bytes`` and ``local_retention_ms`` are not set, they default to -2 or take the configuration from the service level.

When set to -2, the retention in local storage matches the total retention. In this scenario, the data segments sent to remote storage are also retained locally. Remote storage contains older data segments than local storage only when the total retention is greater than the local retention.


5. Select **Update** to save your changes and activate tiered storage.
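
The same change may also be possible from the Aiven CLI. The sketch below only prints a candidate command for review. It assumes that the installed ``avn`` version provides ``--remote-storage-enable`` and ``--local-retention-ms`` on ``avn service topic-update``; confirm this with ``avn service topic-update --help`` before running anything. The project, service, and topic names are placeholders.

```shell
# Placeholder names; replace with your own project, service, and topic.
PROJECT="demo-kafka-project"
SERVICE="demo-kafka-service"
TOPIC="exampleTopic"

# Keep one hour of data on local disk (value in milliseconds).
LOCAL_RETENTION_MS=$((60 * 60 * 1000))

# Assumption: these topic-update flags exist in your avn version.
CMD="avn service topic-update --project $PROJECT $SERVICE $TOPIC --remote-storage-enable --local-retention-ms $LOCAL_RETENTION_MS"

# Dry run: print the command instead of executing it.
echo "$CMD"
```

Printing the command first makes it easy to check the flags against the CLI help output before changing a live topic.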


92 changes: 92 additions & 0 deletions docs/products/kafka/howto/enable-kafka-tiered-storage.rst
@@ -0,0 +1,92 @@
Enable tiered storage for Aiven for Apache Kafka®
=====================================================
Learn how to enable the tiered storage capability of Aiven for Apache Kafka®. This topic provides step-by-step instructions for enabling tiered storage using either the `Aiven console <https://console.aiven.io/>`_ or the :doc:`Aiven CLI </docs/tools/cli>`.

.. important::

Aiven for Apache Kafka® tiered storage is an early availability feature, which means it has some restrictions on the functionality and service level agreement. It is intended for non-production environments, but you can test it with production-like workloads to assess the performance. To enable this feature, navigate to the :doc:`Feature preview </docs/platform/howto/feature-preview>` page within your user profile.

Prerequisites
--------------
* Aiven account and a project set up in the Aiven Console
* Aiven for Apache Kafka® service with Apache Kafka version 3.6
* Aiven CLI


Enable tiered storage via Aiven Console
------------------------------------------
Follow these steps to enable tiered storage for your service using the Aiven Console.

1. Access the `Aiven console <https://console.aiven.io/>`_, and select your project.
2. Create a new :doc:`Aiven for Apache Kafka service </docs/platform/howto/create_new_service>` or choose an existing one.

- If you are creating a new service:

a. On the **Create Apache Kafka® service** page, scroll down to the **Tiered storage** section.
b. To enable tiered storage, select the **Enable tiered storage** toggle.
Member

Should we mention that the price can be seen in the overview section?


- If you are using an existing service:

a. Go to the service's **Overview** page and scroll down to the **Tiered storage** section.
b. To enable tiered storage, select the **Enable tiered storage** toggle.

3. Enter the value for **Local cache size (bytes)**. This value indicates the amount of memory reserved for storing frequently accessed data, enhancing the speed of reading data from remote storage.

4. Select **Activate tiered storage** to save your settings and enable tiered storage for the service.


Configuring default retention policies at service-level
`````````````````````````````````````````````````````````````````````````````

1. Access the `Aiven Console <https://console.aiven.io/>`_, select your project, and choose your Aiven for Apache Kafka service.
2. On the **Overview** page, navigate to **Advanced configuration** and select **Change**.
3. In the **Edit advanced configuration** view, choose **Add configuration option**.
4. To set the retention policy for Aiven for Apache Kafka tiered storage, select ``kafka.log_local_retention_ms`` for time-specific retention or ``kafka.log_local_retention_bytes`` for size-specific retention.
5. Select **Save advanced configuration** to apply your changes.
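
As a rough CLI counterpart to the console steps above, the sketch below builds an ``avn service update`` command that sets ``kafka.log_local_retention_ms`` through the ``-c`` option used elsewhere in this guide. The one-day retention value is a hypothetical choice, the project and service names are the placeholders from the CLI example in this guide, and the command is only printed, not executed.

```shell
# Placeholder names; replace with your own project and service.
PROJECT="demo-kafka-project"
SERVICE="demo-kafka-service"

# Hypothetical policy: keep one day of data on local disk.
RETENTION_DAYS=1
LOCAL_RETENTION_MS=$((RETENTION_DAYS * 24 * 60 * 60 * 1000))

# Dry run: print the command so it can be reviewed before use.
CMD="avn service update --project $PROJECT $SERVICE -c kafka.log_local_retention_ms=$LOCAL_RETENTION_MS"
echo "$CMD"
```

For size-based retention, the same pattern would apply with ``kafka.log_local_retention_bytes`` and a value in bytes.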


Enable tiered storage via Aiven CLI
-----------------------------------------
Follow these steps to enable tiered storage for your Aiven for Apache Kafka service using the :doc:`Aiven CLI </docs/tools/cli>`:

1. Retrieve the project information using the following command:

.. code-block:: bash

avn project details


If you need details for a specific project, use:

.. code-block:: bash

avn project details --project <your_project_name>

2. Get the name of the Aiven for Apache Kafka service for which you want to enable tiered storage by using the following command:

.. code-block:: bash

avn service list

Make a note of the ``SERVICE_NAME`` corresponding to your Kafka service.

3. Enable tiered storage using the command below:

.. code-block:: bash

avn service update --project demo-kafka-project demo-kafka-service \
-c tiered_storage.enabled=true -c tiered_storage.local_cache.size=5368709120


Where:

- ``--project demo-kafka-project``: Specifies the project name, in this example ``demo-kafka-project``.
- ``demo-kafka-service``: Refers to the Kafka service you're updating, in this example ``demo-kafka-service``.
- ``-c tiered_storage.enabled=true``: Enables tiered storage for the Kafka service.
- ``-c tiered_storage.local_cache.size=5368709120``: Sets the local cache size for tiered storage, in this example to 5 GB.
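
The cache size is given in bytes, which is easy to mistype. A small helper can derive the value from a size in GiB and print the resulting command for review. This is a minimal sketch: the helper name ``gib_to_bytes`` is hypothetical, and the project and service names are the placeholders used in the example above.

```shell
# Convert a size in GiB to bytes (1 GiB = 1024 * 1024 * 1024 bytes).
gib_to_bytes() {
  echo $(( $1 * 1024 * 1024 * 1024 ))
}

CACHE_BYTES=$(gib_to_bytes 5)   # 5368709120

# Dry run: print the command instead of executing it.
echo "avn service update --project demo-kafka-project demo-kafka-service -c tiered_storage.enabled=true -c tiered_storage.local_cache.size=$CACHE_BYTES"
```

The value ``5368709120`` used in the example above corresponds to 5 × 1024³ bytes.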






39 changes: 39 additions & 0 deletions docs/products/kafka/howto/kafka-tiered-storage-get-started.rst
@@ -0,0 +1,39 @@

Get started with Aiven for Apache Kafka® tiered storage
====================================================================

Aiven for Apache Kafka®'s tiered storage expands storage beyond local disks. It stores frequently accessed data on faster tiers and less active data on cost-effective, slower tiers, ensuring both performance and cost efficiency are optimized.

For an in-depth understanding of tiered storage, how it works, and its benefits, see `Tiered Storage in Aiven for Apache Kafka®`.
Member

@roope-kar roope-kar Oct 2, 2023

Tiered storage* (i think?)

Member

I guess it's the name of a section? I can't find it.

Contributor Author

This content is related to the concept topics in the PR - #2142. I'm unable to create direct links at the moment, as they would lead to broken links and prevent me from merging the content into the main branch.


.. important::

Aiven for Apache Kafka® tiered storage is an early availability feature, which means it has some restrictions on the functionality and service level agreement. It is intended for non-production environments, but you can test it with production-like workloads to assess the performance. To enable this feature, navigate to the :doc:`Feature preview </docs/platform/howto/feature-preview>` page within your user profile.

Enable tiered storage for service
----------------------------------
To use tiered storage, you need to enable it for your Aiven for Apache Kafka® service. This foundational step ensures that the necessary infrastructure is in place.

For step-by-step instructions, see :doc:`Enable tiered storage for Aiven for Apache Kafka® </docs/products/kafka/howto/enable-kafka-tiered-storage>`.

Check failure on line 17 in docs/products/kafka/howto/kafka-tiered-storage-get-started.rst

GitHub Actions / vale

[vale] docs/products/kafka/howto/kafka-tiered-storage-get-started.rst#L17

[Aiven.common_replacements] Use 'Kafka' instead of 'kafka'.
Raw output
{"message": "[Aiven.common_replacements] Use 'Kafka' instead of 'kafka'.", "location": {"path": "docs/products/kafka/howto/kafka-tiered-storage-get-started.rst", "range": {"start": {"line": 17, "column": 110}}}, "severity": "ERROR"}

Check failure on line 17 in docs/products/kafka/howto/kafka-tiered-storage-get-started.rst

GitHub Actions / vale

[vale] docs/products/kafka/howto/kafka-tiered-storage-get-started.rst#L17

[Aiven.common_replacements] Use 'Kafka' instead of 'kafka'.
Raw output
{"message": "[Aiven.common_replacements] Use 'Kafka' instead of 'kafka'.", "location": {"path": "docs/products/kafka/howto/kafka-tiered-storage-get-started.rst", "range": {"start": {"line": 17, "column": 129}}}, "severity": "ERROR"}

.. important::
Tiered storage is supported on Aiven for Apache Kafka services with Apache Kafka version 3.6.


Configure tiered storage per topic
----------------------------------
Once tiered storage is enabled at the service level, you can configure it for individual topics. On the **Topics** page of your Aiven for Apache Kafka service, topics using tiered storage display **Active** in the **Tiered storage** column.

For detailed instructions, see :doc:`Configuring tiered storage for topics </docs/products/kafka/howto/configure-topic-tiered-storage>`.

Check failure on line 27 in docs/products/kafka/howto/kafka-tiered-storage-get-started.rst

GitHub Actions / vale

[vale] docs/products/kafka/howto/kafka-tiered-storage-get-started.rst#L27

[Aiven.common_replacements] Use 'Kafka' instead of 'kafka'.
Raw output
{"message": "[Aiven.common_replacements] Use 'Kafka' instead of 'kafka'.", "location": {"path": "docs/products/kafka/howto/kafka-tiered-storage-get-started.rst", "range": {"start": {"line": 27, "column": 92}}}, "severity": "ERROR"}


Tiered storage usage overview
------------------------------
Gain insights into tiered storage usage from the **Tiered Storage Overview** page in your Aiven for Apache Kafka service. This includes details on billing, settings, and specific storage aspects.

For more information, see :doc:`Tiered Storage Overview in Aiven Console </docs/products/kafka/howto/tiered-storage-overview>`.

Check failure on line 34 in docs/products/kafka/howto/kafka-tiered-storage-get-started.rst

GitHub Actions / vale

[vale] docs/products/kafka/howto/kafka-tiered-storage-get-started.rst#L34

[Aiven.common_replacements] Use 'Kafka' instead of 'kafka'.
Raw output
{"message": "[Aiven.common_replacements] Use 'Kafka' instead of 'kafka'.", "location": {"path": "docs/products/kafka/howto/kafka-tiered-storage-get-started.rst", "range": {"start": {"line": 34, "column": 90}}}, "severity": "ERROR"}





66 changes: 66 additions & 0 deletions docs/products/kafka/howto/tiered-storage-overview.rst
@@ -0,0 +1,66 @@
Tiered storage overview in Aiven Console
========================================

Aiven for Apache Kafka® offers a comprehensive overview of tiered storage, allowing you to understand its usage and make informed decisions. This overview provides insights into various aspects of tiered storage, including billing, settings, and storage details.

.. important::

Aiven for Apache Kafka® tiered storage is an early availability feature, which means it has some restrictions on the functionality and service level agreement. It is intended for non-production environments, but you can test it with production-like workloads to assess the performance. To enable this feature, navigate to the :doc:`Feature preview </docs/platform/howto/feature-preview>` page within your user profile.


Access tiered storage overview
--------------------------------

1. In the Aiven Console, choose your project and select your Aiven for Apache Kafka service.
2. From the left sidebar, select **Tiered Storage**. This action will display an overview of tiered storage and its associated details.


Key insights of tiered storage
------------------------------

Get a quick snapshot of the essential metrics and details related to tiered storage:

- **Current billing expenses in USD**: Stay informed about your current tiered storage expenses.
- **Forecasted month cost in USD**: Estimate your upcoming monthly costs based on current usage.
- **Data tiered in bytes**: View the volume of data that has been tiered.
- **Storage overview**: Understand the specifics of the storage mediums in use, including details about the used object storage and SSD storage.


Current tiered storage configurations
---------------------------------------------

View the current configurations for tiered storage:

- **Local Cache**: Shows the current cache configuration.
- **Default Local Retention Time (ms)**: Shows the current local data retention set in milliseconds.
Member

Wondering if these should mention the actual setting name? 🤔

Contributor Author

I used the name the user would see on the "Tiered storage overview" page. Do you mean add this - kafka.log_local_retention_bytes and kafka.log_local_retention_ms?

Member

Yes, those are the actual settings names. I'm just thinking that if for example you use the CLI, you would use the real setting names so it could be reassuring that we are in fact talking about the exact same service level settings.

I think we should just use the real setting name in console as well to avoid confusion.

- **Default Local Retention Bytes**: Shows the configured volume of data, in bytes, for local retention.


To modify these settings:

1. In the **Tiered storage settings** section, select the ellipsis (three dots) and choose **Update tiered storage settings**.
2. On the **Update tiered storage settings** page, adjust the values for:

- Local Cache
- Default Local Retention Time (ms)
- Default Local Retention Bytes

3. Confirm by selecting **Save changes**.



Graphical view of tiered storage costs
------------------------------------------

Gain a visual understanding of your tiered storage expenses:

- **Hourly expense**: Visualize your hourly expenses through graphical representation.
- **Total Cost and forecast**: Get a clear picture of your overall costs and receive a forecast based on current trends.

Detailed storage overview
-------------------------

Explore the specifics of your storage usage and configurations:

- **Used object storage and SSD storage**: Dive deep into the storage mediums in use.
- **Filter by topic**: Narrow down your view to specific topics for focused insights.
