[Bug] Delete data is not timely #4767

Open
1 of 2 tasks
askwang opened this issue Dec 24, 2024 · 0 comments
Labels
bug Something isn't working

Comments

@askwang
Contributor

askwang commented Dec 24, 2024

Search before asking

  • I searched in the issues and found nothing similar.

Paimon version

0.8

Compute Engine

spark

Minimal reproduce step

In our company's business, users first delete data older than 30 days from a non-partitioned table and then insert new data whose sequence.field value is smaller than that of the deleted rows. However, the new data is not written, because its sequence.field value is smaller than the one carried by the delete record, which still takes part in the merge. To work around this we must run a full compaction after the delete operation, and a full compaction is expensive for large tables.

Is there any way to make the delete operation take effect immediately, without a full compaction?

create table test_tb (
  `req_id` STRING,
  `ad_id` STRING,
  `info` STRING,
  `dt_seconds_asc` BIGINT
) USING paimon
TBLPROPERTIES (
  'bucket' = '1',
  'file.compression' = 'ZSTD',
  'file.format' = 'PARQUET',
  'primary-key' = 'req_id,ad_id',
  'sequence.field' = 'dt_seconds_asc');

insert into test_tb values('a', 'b', 'info-1', 100);

delete from test_tb where dt_seconds_asc < 200;

insert into test_tb values('a', 'b', 'info-1', 50);

-- the audit log still shows only the -D row; the insert with dt_seconds_asc = 50 has no effect
select * from `test_tb$audit_log`;
OK
rowkind req_id  ad_id info  dt_seconds_asc
-D  a b info-1  100

-- the result is empty
select * from test_tb;
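
The behavior in the reproduction above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Paimon's actual merge code: it assumes that for each primary key the record with the largest sequence.field value wins, that a delete is stored as a `-D` record carrying the deleted row's sequence value (as the audit log output suggests), and that a full compaction physically drops keys whose winning record is a delete. The names `Record`, `merge`, `visible`, and `full_compact` are hypothetical and exist only for this illustration.

```python
# Toy model of a sequence.field-based merge. NOT Paimon's implementation;
# all names here are hypothetical and only illustrate the reported behavior.
from typing import List, NamedTuple, Optional


class Record(NamedTuple):
    kind: str    # "+I" for an insert, "-D" for a delete
    key: tuple   # primary key (req_id, ad_id)
    info: str
    seq: int     # value of the sequence.field column (dt_seconds_asc)


def merge(records: List[Record]) -> Optional[Record]:
    """Pick the winner for one key: the record with the largest seq.
    On a tie, the later record wins (assumed arrival order)."""
    winner = None
    for rec in records:
        if winner is None or rec.seq >= winner.seq:
            winner = rec
    return winner


def visible(winner: Optional[Record]) -> Optional[Record]:
    """A key is readable only if its winning record is not a delete."""
    if winner is None or winner.kind == "-D":
        return None
    return winner


def full_compact(records: List[Record]) -> List[Record]:
    """Assumed effect of a full compaction: keep only the visible winner,
    so a winning delete record disappears entirely."""
    w = visible(merge(records))
    return [w] if w is not None else []


key = ("a", "b")
log = [
    Record("+I", key, "info-1", 100),  # insert ... values('a','b','info-1',100)
    Record("-D", key, "info-1", 100),  # delete ... where dt_seconds_asc < 200
    Record("+I", key, "info-1", 50),   # insert ... values('a','b','info-1',50)
]

# Without compaction the -D record (seq 100) beats the new insert (seq 50).
print(visible(merge(log)))  # prints None

# With a full compaction between the delete and the second insert,
# the delete record is gone and the new insert becomes visible.
compacted = full_compact(log[:2])
compacted.append(log[2])
print(visible(merge(compacted)))
```

In this model the second insert is lost only because the delete record survives with sequence value 100; once the delete record is dropped, the row with sequence value 50 is the sole candidate for its key.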

What doesn't meet your expectations?

Deleted data should not be included in the sequence.field comparison, so that a later insert with a smaller sequence.field value is still written.

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!
askwang added the bug label Dec 24, 2024