feat: Add recursive subdirectories and files extraction. #585

Merged
merged 5 commits into from
May 23, 2024
18 changes: 17 additions & 1 deletion docs/api.rst
@@ -186,7 +186,7 @@ SevenZipFile Object
    py7zr looks for files and directories exactly as specified in the elements
    of 'targets'.

-   When the method get a ``str`` object or another object other than collection
+   When the method gets a ``str`` object or another object other than collection
    such as LIST or SET, it will raise :exc:`TypeError`.

    Once extract() is called, the ``SevenZipFile`` object becomes exhausted,
@@ -199,6 +199,20 @@ SevenZipFile Object
    'somedir/somefile' then pass a list: ['somedirectory', 'somedir/somefile']
    as a target argument.

+
+.. py:method:: SevenZipFile.extract(path=None, targets=None, recursive=False)
+
+   'recursive' is a BOOLEAN. When set to True, it simplifies extraction of
+   a directory's contents.
+
+   Instead of listing every file and directory under a parent directory in
+   'targets', specify only the parent directory and set 'recursive' to True;
+   all of its subdirectories and their contents are then extracted
+   recursively.
+
+   If 'recursive' is not set, it defaults to False, and extraction proceeds as
+   if the parameter did not exist.
+
    Please see 'tests/test_basic.py: test_py7zr_extract_and_getnames()' for
    example code.

@@ -210,6 +224,8 @@ SevenZipFile Object
           targets = [f for f in allfiles if filter_pattern.match(f)]
       with SevenZipFile('archive.7z', 'r') as zip:
           zip.extract(targets=targets)
+      with SevenZipFile('archive.7z', 'r') as zip:
+          zip.extract(targets=targets, recursive=True)


 .. py:method:: SevenZipFile.readall()
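
To make the documented behaviour concrete, the following sketch mirrors the selection rule from this PR in plain Python, without calling py7zr. `select_members` and the sample archive layout are hypothetical, introduced only for illustration; the condition is modelled on the `_extract` change in `py7zr/py7zr.py` in this same PR, and it uses plain truthiness where the diff uses `is True` / `is False`.

```python
from typing import Collection, List, Optional


def select_members(
    names: List[str],
    targets: Optional[Collection[str]],
    recursive: bool = False,
) -> List[str]:
    """Return the archive member names that would be extracted.

    Hypothetical helper: it mirrors the filter added to ``_extract`` in this
    PR, but it is not part of py7zr.
    """
    if targets is None:
        # No targets given: every member is extracted.
        return list(names)
    if not recursive:
        # Exact-match mode: only names listed in ``targets`` are kept.
        return [n for n in names if n in targets]
    # Recursive mode: keep exact matches plus any name starting with a target.
    return [n for n in names if n in targets or any(n.startswith(t) for t in targets)]


# Assumed archive layout, for illustration only.
names = ["docs", "docs/index.rst", "docs/img/logo.png", "README.md"]

print(select_members(names, ["docs"]))                  # exact match only
print(select_members(names, ["docs"], recursive=True))  # the directory and its contents
```

With `recursive` left at its default, passing only the parent directory selects just that one entry, which is why callers previously had to enumerate every path beneath it.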
1 change: 1 addition & 0 deletions docs/authors.rst
@@ -23,6 +23,7 @@ Contributors, listed alphabetically, are:
 * Kyle Altendorf -- Fix multithreading problem (#82)
 * Martin Larralde -- Fix writef method (#397)
 * Megan Leet -- Fix infinite loop during extraction (#354)
+* Nikolaos Episkopos -- Add recursive subdirectories extraction (#585)
 * @padremayi -- Fix crash on wrong creationtime in archive (#275)
 * @royopa -- Fix typo (#108)
 * Sergei -- Update report_update() (#558)
18 changes: 13 additions & 5 deletions py7zr/py7zr.py
@@ -531,6 +531,7 @@ def _extract(
         targets: Optional[Collection[str]] = None,
         return_dict: bool = False,
         callback: Optional[ExtractCallback] = None,
+        recursive: Optional[bool] = False,
     ) -> Optional[Dict[str, IO[Any]]]:
         if callback is None:
             pass
@@ -560,9 +561,14 @@ def _extract(
         fnames: Dict[str, int] = {}  # check duplicated filename in one archive?
         self.q.put(("pre", None, None))
         for f in self.files:
-            if targets is not None and f.filename not in targets:
-                self.worker.register_filelike(f.id, None)
-                continue
+            if targets is not None and recursive is False:
+                if f.filename not in targets:
+                    self.worker.register_filelike(f.id, None)
+                    continue
+            elif targets is not None and recursive is True:
+                if f.filename not in targets and not any([f.filename.startswith(target) for target in targets]):
+                    self.worker.register_filelike(f.id, None)
+                    continue

             # When archive has a multiple files which have same name
             # To guarantee order of archive, multi-thread decompression becomes off.
@@ -997,10 +1003,12 @@ def read(self, targets: Optional[Collection[str]] = None) -> Optional[Dict[str,
         self._dict = {}
         return self._extract(path=None, targets=targets, return_dict=True)

-    def extract(self, path: Optional[Any] = None, targets: Optional[Collection[str]] = None) -> None:
+    def extract(
+        self, path: Optional[Any] = None, targets: Optional[Collection[str]] = None, recursive: Optional[bool] = False
+    ) -> None:
         if not self._is_none_or_collection(targets):
             raise TypeError("Wrong argument type given.")
-        self._extract(path, targets, return_dict=False)
+        self._extract(path, targets, return_dict=False, recursive=recursive)

     def reporter(self, callback: ExtractCallback):
         while True:
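
One property of the `startswith` check in the recursive branch is worth spelling out: it compares raw string prefixes, so a target such as `src` also matches a sibling entry like `src2/main.py`. The sketch below contrasts that rule with a stricter, path-component-aware check. Both function names are hypothetical and neither is py7zr API; the stricter variant is an alternative shown for comparison, not what this PR merged, and it assumes archive member names use `/` as the separator.

```python
from pathlib import PurePosixPath


def matches_prefix(name: str, target: str) -> bool:
    """Rule as written in the diff: exact name or raw string prefix."""
    return name == target or name.startswith(target)


def matches_path(name: str, target: str) -> bool:
    """Stricter alternative (an assumption, not the merged code):
    ``name`` must equal ``target`` or lie beneath it as a path."""
    target_path = PurePosixPath(target)
    name_path = PurePosixPath(name)
    return name_path == target_path or target_path in name_path.parents


# Compare both rules for a real child and a look-alike sibling of "src".
for name in ["src/main.py", "src2/main.py"]:
    print(name, matches_prefix(name, "src"), matches_path(name, "src"))
```

Separately, because the two branches in the diff test `recursive is False` and `recursive is True`, a call that passes `recursive=None` together with `targets` would match neither branch, so no member would be skipped by this filter.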