-
I would like to improve the Amoro documentation in 2025 based on Diátaxis [1], a systematic approach to technical documentation authoring.
-
2. Make the plan scheduler and optimizer pluggable, so that users can implement their own policies.
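To make the idea concrete, here is a minimal sketch of what a pluggable planning policy could look like. It is written in Python for brevity and is not Amoro's actual API: `TableStatus`, `register_policy`, and `plan_order` are all hypothetical names, and the two policies are only examples.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class TableStatus:
    # Hypothetical summary of a table's state; not an Amoro class.
    name: str
    small_file_count: int


# A policy receives the candidate tables and returns them in planning order.
PlanPolicy = Callable[[List[TableStatus]], List[TableStatus]]

# Hypothetical registry mapping a policy name to its implementation.
_POLICIES: Dict[str, PlanPolicy] = {}


def register_policy(name: str):
    """Decorator that registers a user-supplied policy under a name."""
    def decorator(policy: PlanPolicy) -> PlanPolicy:
        _POLICIES[name] = policy
        return policy
    return decorator


@register_policy("most-small-files-first")
def most_small_files_first(tables: List[TableStatus]) -> List[TableStatus]:
    # Plan the tables with the most small files first.
    return sorted(tables, key=lambda t: t.small_file_count, reverse=True)


@register_policy("alphabetical")
def alphabetical(tables: List[TableStatus]) -> List[TableStatus]:
    # A trivial alternative policy, to show that policies are interchangeable.
    return sorted(tables, key=lambda t: t.name)


def plan_order(policy_name: str, tables: List[TableStatus]) -> List[str]:
    """Look up the configured policy and return table names in planning order."""
    policy = _POLICIES[policy_name]
    return [t.name for t in policy(tables)]
```

With something like this, switching policies would be a configuration change (the policy name) rather than a code change in the scheduler.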
-
3. Support event-based planning. Once it is implemented, table planning will be triggered by events.
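A rough sketch of the event-triggered flow, again in Python and under assumptions: `TableEvent` and `EventDrivenPlanner` are hypothetical names, and "planning" is reduced to returning the list of tables that received events. Only tables with pending events are planned, and repeated events for the same table are collapsed into one planning pass.

```python
from collections import OrderedDict
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TableEvent:
    # Hypothetical event shape; "kind" could be e.g. "commit".
    table: str
    kind: str


class EventDrivenPlanner:
    def __init__(self) -> None:
        # table name -> event kinds received, in first-arrival order
        self._pending: "OrderedDict[str, List[str]]" = OrderedDict()

    def on_event(self, event: TableEvent) -> None:
        """Record an event; several events for one table share one entry."""
        self._pending.setdefault(event.table, []).append(event.kind)

    def run_once(self) -> List[str]:
        """Return each table with pending events exactly once, then reset."""
        tables = list(self._pending)
        self._pending.clear()
        return tables
```

For example, two commits on `db.orders` and one on `db.users` result in a single planning pass over each of the two tables, and no work at all when nothing has changed.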
-
For the mixed format, we aim to support the Variant data type to better handle semi-structured data. Apache Spark and Delta Lake have already introduced Variant, and it offers significant benefits for such data. Apache Iceberg is still implementing Variant support; once it becomes available, the mixed format will need to adopt and integrate it.
-
Currently, projects such as Trino and Bodo.ai calculate NDV (number of distinct values) immediately after data is written. They use the Theta sketch algorithm to incrementally compute the distinct values for each column after specific write operations, which lets query optimizers select more efficient execution plans. Apache Spark has also introduced procedures to compute NDV and leverages these statistics for cost-based optimization (CBO). I believe we need to develop a new optimizer that can calculate NDV statistics in a way that is compatible across query engines.
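To illustrate why sketches suit incremental NDV, here is a simplified K-Minimum-Values (KMV) sketch in Python. It belongs to the same family as Theta sketches but is not the Apache DataSketches implementation, and `KmvSketch` is a hypothetical name. The point it shows is mergeability: a sketch built per write can be merged with the existing one instead of rescanning the column.

```python
import hashlib
import heapq


class KmvSketch:
    """Keeps the k smallest hash values seen; estimates distinct count from them."""

    def __init__(self, k: int = 256) -> None:
        self.k = k
        self.hashes = set()  # at most k hash values in [0, 1]

    @staticmethod
    def _hash(value) -> float:
        # Map a value to a pseudo-uniform float using the first 8 digest bytes.
        digest = hashlib.sha256(repr(value).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") / 2**64

    def _trim(self) -> None:
        if len(self.hashes) > self.k:
            self.hashes = set(heapq.nsmallest(self.k, self.hashes))

    def update(self, values) -> "KmvSketch":
        # Simplification: hash the whole batch first, then keep the k smallest.
        for v in values:
            self.hashes.add(self._hash(v))
        self._trim()
        return self

    def merge(self, other: "KmvSketch") -> "KmvSketch":
        """Union of two sketches: the k smallest hashes across both inputs."""
        merged = KmvSketch(min(self.k, other.k))
        merged.hashes = self.hashes | other.hashes
        merged._trim()
        return merged

    def estimate(self) -> float:
        if len(self.hashes) < self.k:
            return float(len(self.hashes))  # fewer than k distinct hashes: exact
        return (self.k - 1) / max(self.hashes)
```

Merging the sketch of rows 0..5999 with the sketch of rows 4000..9999 yields exactly the same state as sketching rows 0..9999 in one pass, so overlapping values are not double counted. A production implementation would store the serialized sketch alongside the table statistics so that any engine can read it.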
-
Introduce APIP (Amoro Project Improvement Proposals): The purpose of APIP is to make it more likely that changes meet user needs, by keeping the user community informed and engaged in significant improvements during the development of the Amoro codebase. APIP is intended for major user-facing or cross-domain changes, rather than minor incremental updates.
-
Reflecting on the remarkable achievements of the past year, the Apache Amoro community has made significant progress in enhancing data lake maintenance and optimization. 🎉
As we enter 2025, we remain dedicated to building on this momentum, embracing new challenges, and unlocking opportunities to empower the data ecosystem. 🚀