I searched in the issues and found nothing similar.
Paimon version
0.8
Compute Engine
spark
Minimal reproduce step
In our company's business, users first delete data older than 30 days from a non-partitioned table, and then insert new rows whose sequence.field value is smaller than that of the deleted rows. The new rows are not written, because the deleted record's sequence.field value is larger and still wins the comparison. We must run a full compaction after the delete operation, and full compaction is expensive on large tables.
Is there a way to make the delete take effect immediately, without a full compaction?
CREATE TABLE test_tb (
  `req_id` STRING,
  `ad_id` STRING,
  `info` STRING,
  `dt_seconds_asc` BIGINT
) USING paimon
TBLPROPERTIES (
  'bucket' = '1',
  'file.compression' = 'ZSTD',
  'file.format' = 'PARQUET',
  'primary-key' = 'req_id,ad_id',
  'sequence.field' = 'dt_seconds_asc');
INSERT INTO test_tb VALUES ('a', 'b', 'info-1', 100);
DELETE FROM test_tb WHERE dt_seconds_asc < 200;
INSERT INTO test_tb VALUES ('a', 'b', 'info-1', 50);
-- The audit log still shows the -D rowkind; the inserted row with sequence 50 is lost.
SELECT * FROM `test_tb$audit_log`;
OK
rowkind  req_id  ad_id  info    dt_seconds_asc
-D       a       b      info-1  100

-- The result is empty:
SELECT * FROM test_tb;
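The workaround described above (forcing a full compaction right after the delete, so the -D record is physically dropped before the new insert arrives) can be sketched as follows. This assumes the Paimon Spark `sys.compact` procedure is available in your deployment; the exact parameters may differ between Paimon versions, so treat it as a sketch rather than a verified fix.

```sql
-- Sketch of the current workaround (assumes Paimon's Spark sys.compact procedure):
DELETE FROM test_tb WHERE dt_seconds_asc < 200;

-- Full compaction rewrites the bucket and discards the -D record,
-- so its sequence value no longer shadows later inserts.
CALL sys.compact(table => 'test_tb');

-- After compaction, an insert with a smaller sequence value becomes visible again.
INSERT INTO test_tb VALUES ('a', 'b', 'info-1', 50);
```

This is exactly the expensive step the issue asks to avoid: on a large non-partitioned table, the compaction rewrites far more data than the delete touched.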
What doesn't meet your expectations?
Deleted data should not participate in the sequence.field comparison; a later insert with a smaller sequence value should still be written.
Anything else?
No response
Are you willing to submit a PR?
I'm willing to submit a PR!