How to migrate from invalid chunker parameters? #7586
-
Hi, when I use the latest version of Borg packaged in Guix (updating from 1.1.11 to 1.2.3), I get an error about invalid chunker params. For now I've rolled back to the old version that allowed those parameters, but I'm wondering what parameters I should use so that deduplication continues to work against the chunks made with those invalid values. (I'd like to avoid having to recreate the whole repository, since I've already stored several terabytes / many millions of files, so re-chunking everything would be really inconvenient; otherwise I'd just do that.) Thank you!
-
The values are CHUNK_MIN_EXP, CHUNK_MAX_EXP, HASH_MASK_BITS, HASH_WINDOW_SIZE. The idea is that the target chunk size is about 2^HASH_MASK_BITS bytes, with 2^CHUNK_MIN_EXP and 2^CHUNK_MAX_EXP as the hard lower and upper limits. So, in your case, the max value is lower (14) than expected (>= 16). Looking at https://github.com/borgbackup/borg/blob/1.2.4/src/borg/_chunker.c#L122, it seems the max chunk size is not enforced directly, but rather indirectly, by using a buffer of that size. I just tried how the code behaves. Here are some results for 10,16,14 (cutting 10 MB of random data into chunks) - this is how it should look:
The chunks were always between the min and max allowed sizes (as expected) and were only rarely cut at the max size: 97% of chunks were cut as determined by the buzhash value and only 3% were cut due to exceeding the max size. Here are some results for 10,14,16 (cutting 10 MB of random data into chunks) - your chunker params:
So, as you see here, the chunks were very frequently cut at the maximum allowed size (due to the max chunk size limiting them), and only a few chunks (about 20%) were cut at a size determined by the buzhash value.
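To illustrate why this happens, here is a small simulation (a sketch, not borg's actual buzhash code): on random input, the mask test fires with probability 2^-HASH_MASK_BITS per byte, so we can model the cut decision as a coin flip with that probability, forcing a cut at the max size. The function name and seed are mine, not borg's.

```python
import random

def simulate_chunks(min_exp, max_exp, mask_bits, total=10 * 1024 * 1024, seed=0):
    """Toy model of buzhash cutting (an assumption, not borg's real code):
    once min_size is reached, each further byte triggers a cut with
    probability 2**-mask_bits; a cut is forced when max_size is reached."""
    rng = random.Random(seed)
    min_size, max_size = 2 ** min_exp, 2 ** max_exp
    p = 2.0 ** -mask_bits
    sizes, pos = [], 0
    while pos < total:
        size = min_size
        while size < max_size and rng.random() >= p:
            size += 1
        size = min(size, total - pos)  # last chunk may be short
        sizes.append(size)
        pos += size
    return sizes

for params in ((10, 16, 14), (10, 14, 16)):
    sizes = simulate_chunks(*params)
    at_max = sum(s == 2 ** params[1] for s in sizes)
    print(params, len(sizes), "chunks,",
          f"{100 * at_max / len(sizes):.0f}% forced cuts at max size")
```

Running this should show only a few percent of forced max-size cuts for 10,16,14, but roughly 80% for 10,14,16, matching the observations above: with mask_bits=16 the buzhash cut would on average arrive only after ~64 KiB, so the 16 KiB max size almost always fires first.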
-
It's a bit hard to give good advice here, because there is likely no quick and good way to resolve this. I maybe could convert the validation failure you see in borg 1.2.3 (and also 1.2.4) into a warning, so you can continue with the bad chunker params for now without having to use an old borg version. You could `borg recreate` all your archives with valid chunker params - this is likely going to take a long time, depending on your archive count, but it would have the advantage that the buzhash chunking would then actually work as expected; what you have now is a buzhash chunker that has ~80% degenerated into a fixed chunker. You could speed it up by first reducing the archive count to t… You could also just switch to valid chunker params and do backups and pruning for some months (this will need more space in the repo due to bad dedup) and then only recreate the fewer leftover archives that still have the old chunker params.
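For the recreate route, a minimal sketch (not a tested recipe; the repository path is a placeholder, and `buzhash,19,23,21,4095` is borg's default chunker params):

```shell
# Re-chunk all archives with the default (valid) buzhash params.
# Replace /path/to/repo with your repository location.
borg recreate --progress --chunker-params buzhash,19,23,21,4095 /path/to/repo

# borg 1.2+: free the space of the replaced chunks afterwards.
borg compact /path/to/repo
```

Note that until recreate has processed the last old archive, the repo holds both old-style and new-style chunks, so expect a temporary space increase.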
-
Ticket: #7590, "fix" will be in borg 1.2.5.