[Enhancement](good-first-issue) Support some compress functions #45530

Open
2 of 3 tasks
zclllyybb opened this issue Dec 17, 2024 · 4 comments

@zclllyybb
Contributor

Search before asking

  • I had searched in the issues and found no similar issues.

Description

We need to implement the functions compress and uncompress, behaving the same way as MySQL's COMPRESS() and UNCOMPRESS().
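For anyone picking this up, here is a rough Python sketch of the value layout the MySQL manual documents for COMPRESS(): an empty string stays empty, and a nonempty string becomes a 4-byte little-endian uncompressed length followed by the zlib stream, with a '.' appended if the result would end in a space. The names mysql_compress and mysql_uncompress are hypothetical and this is not Doris code. Returning None for malformed input stands in for MySQL's NULL result, and treating a length mismatch as invalid is an assumption of this sketch.

```python
import struct
import zlib


def mysql_compress(data: bytes) -> bytes:
    """Sketch of MySQL's documented COMPRESS() layout (hypothetical helper)."""
    if not data:
        # An empty string is stored as an empty string.
        return b""
    # 4-byte uncompressed length, low byte first, then the zlib stream.
    out = struct.pack("<I", len(data)) + zlib.compress(data)
    if out.endswith(b" "):
        # MySQL appends '.' so a trailing space cannot be trimmed away.
        out += b"."
    return out


def mysql_uncompress(blob: bytes):
    """Inverse of mysql_compress; None stands in for SQL NULL."""
    if not blob:
        return b""
    if len(blob) <= 4:
        # Too short to hold the length prefix plus any compressed data.
        return None
    (expected_len,) = struct.unpack("<I", blob[:4])
    decompressor = zlib.decompressobj()
    try:
        data = decompressor.decompress(blob[4:])
    except zlib.error:
        return None
    # Assumption of this sketch: reject truncated streams and length mismatches.
    if not decompressor.eof or len(data) != expected_len:
        return None
    return data


if __name__ == "__main__":
    blob = mysql_compress(b"hello doris")
    print(blob[:4].hex(), mysql_uncompress(blob))
```

Whether the Doris implementation should reproduce every detail (for example the trailing '.' rule) is something to confirm against MySQL directly rather than take from this sketch.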

Solution

To support a function, we need:

  1. Function implementation and registration in BE. The utility be/src/util/block_compression.cpp#ZlibBlockCompression may be very useful. We only need to support the String->String signature, so you can convert the ColumnString data to a Slice and use the utility to compress it, and vice versa for uncompress.
  2. Function signature and visitor for the Nereids planner in FE.
  3. The constant-folding implementation in FE, which must behave exactly the same as the BE implementation, like functions/executable/NumericArithmetic.java in https://github.com/apache/doris/pull/40744/files
  4. A separate PR for the function docs in https://github.com/apache/doris-website

For reference, see a PR like https://github.com/apache/doris/pull/33005/files

ATTN: we must have enough test cases, like what https://github.com/apache/doris/pull/40462/files did.
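As a starting point for the test-case list, the sketch below enumerates the kinds of inputs that tend to expose bugs in string compression functions and checks a round trip on each. It uses plain zlib from the Python standard library as a stand-in for the function under test. The case names and values are illustrative assumptions, not taken from the referenced PR, and the real tests would be Doris regression cases.

```python
import zlib

# Hypothetical edge-case inputs; names and values are illustrative only.
CASES = {
    "empty": b"",
    "single_byte": b"a",
    "repetitive": b"ab" * 10_000,
    "utf8_multibyte": "héllo wörld".encode("utf-8"),
    "embedded_nul": b"a\x00b\x00c",
    "all_byte_values": bytes(range(256)),
}


def round_trips(data: bytes) -> bool:
    """True if decompressing the compressed bytes gives back the input."""
    return zlib.decompress(zlib.compress(data)) == data


# Collect the names of any cases that fail the round-trip property.
failures = [name for name, data in CASES.items() if not round_trips(data)]

if __name__ == "__main__":
    print("failed cases:", failures)
```

Beyond round trips, it is probably worth covering NULL inputs, inputs that are not valid compressed data, and a check that FE constant folding and BE execution return identical bytes.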

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@HashHaran

Hi @zclllyybb, I can pick this issue and work on it. You can allot it to me. Thanks :)

@zclllyybb
Contributor Author

> Hi @zclllyybb, I can pick this issue and work on it. You can allot it to me. Thanks :)

Welcome! Just ask me if you have any questions.

@HashHaran

Hi @zclllyybb, there are many compression algorithms implemented in the block_compression.cpp file. Do we want to support multiple compression algorithms for this function with function signature like COMPRESS(String, compression_algorithm)?
UPDATE: I have gone through your function_uuid PR. I also went through some more function implementations in the code base. I will start my implementation soon.

@zclllyybb
Contributor Author

> Hi @zclllyybb, there are many compression algorithms implemented in the block_compression.cpp file. Do we want to support multiple compression algorithms for this function with function signature like COMPRESS(String, compression_algorithm)?
> UPDATE: I have gone through your function_uuid PR. I also went through some more function implementations in the code base. I will start my implementation soon.

We don't need to support the second argument, for two reasons:

  1. We should stay compatible with MySQL, so that it is easier for users to migrate.
  2. These functions exist simply to compress data. Since the compression algorithms don't differ very much, users may not be clear about what difference the choice of algorithm makes, so supporting a single algorithm is better.


3 participants