Eliminate whole file chunks #323

Merged (2 commits) on Apr 5, 2022

e3krisztian
Contributor

resolves #303

@qkaiser
Contributor

qkaiser commented Mar 31, 2022

I tested it with unblob-handlers in different configurations and it works as expected. I'd leave the review to either @vlaci or @martonilles 😎

@e3krisztian e3krisztian force-pushed the eliminate_whole_file_chunks branch 2 times, most recently from b459170 to 9d2e90d March 31, 2022 16:16
@martonilles
Contributor

We discussed this: the current branch also removes intermediate extracted files, which should be kept. We only want to skip the carving, not the extraction.

It would probably make sense to separate the carve and extract phases more explicitly.
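
To illustrate the distinction being discussed, here is a minimal sketch of keeping the carve and extract decisions separate. It is not unblob's actual code: `Chunk`, `covers_whole_file`, and `plan_processing` are hypothetical names made up for this example, and the only assumption taken from the thread is that a chunk covering the whole file should skip carving but still be extracted.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Chunk:
    """Hypothetical stand-in for a detected chunk: a byte range in a file."""

    start_offset: int
    end_offset: int  # exclusive

    def covers_whole_file(self, file_size: int) -> bool:
        # True when the chunk spans the file from the first to the last byte.
        return self.start_offset == 0 and self.end_offset == file_size


def plan_processing(chunks: List[Chunk], file_size: int) -> List[Tuple[str, Chunk]]:
    """Return the (action, chunk) steps to run, in order.

    Assumption for this sketch: carving and extraction are independent steps,
    so skipping one does not imply skipping the other.
    """
    plan = []
    for chunk in chunks:
        if not chunk.covers_whole_file(file_size):
            # Only carve when the chunk is a strict part of the file;
            # otherwise the carved copy would duplicate the input.
            plan.append(("carve", chunk))
        # Extraction always happens, whether or not a carve preceded it.
        plan.append(("extract", chunk))
    return plan
```

With a single chunk spanning a 100-byte file, the plan contains only an extract step; with two chunks splitting that file, each one is carved and then extracted.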

@e3krisztian e3krisztian force-pushed the eliminate_whole_file_chunks branch 2 times, most recently from 97e70d4 to 92f7b55 April 1, 2022 09:59
@e3krisztian e3krisztian marked this pull request as ready for review April 1, 2022 10:09
@e3krisztian e3krisztian force-pushed the eliminate_whole_file_chunks branch 2 times, most recently from dae1a1d to 4ccda47 April 4, 2022 11:57
    This is done with the script below, followed by reviewing the resulting git change.

    $ cat recreate_outputs.sh
    #!/bin/bash

    set -e

    cd tests/integration

    find -name __output__ | xargs rm -rf

    find -name __input__ |
        while read input; do
            (
            cd "$input"
            mkdir ../__output__
            cd ../__output__
            find ../__input__ -type f |
                while read fw; do
                    unblob --keep-extracted-chunks "$fw"
                done
            )
        done

    # keep empty (but extracted) directories in git via a .gitkeep placeholder
    find -type d | rg __output__ |
        while read dir; do
            rmdir --ignore-fail-on-non-empty "$dir"
            if [ ! -d "$dir" ]; then
                mkdir "$dir"
                touch "$dir/.gitkeep"
            fi
        done

    git add .
@e3krisztian e3krisztian merged commit 29a38f1 into main Apr 5, 2022
@e3krisztian e3krisztian deleted the eliminate_whole_file_chunks branch April 5, 2022 12:41
Successfully merging this pull request may close these issues.

Do not carve out chunks when the chunk covers the whole file
4 participants