GitHub - cryham/py-dirt: Python command line tool to find and delete duplicated files. Also to rename with added rating prefix. Has few options e.g. across subdirs, using hash, stats etc.

About

Python command line tool for finding and deleting duplicated files.
Also renaming with added rating symbols, from count of duplicates.
Name comes from: Python Directory Tree.

How it works and Features

Options (-l -d etc.) are listed below for features.
This tool:

Lists files (use -l to show) in current directory (or specified after -d).
By default recursively. Use -n to not check subdirs.
Gets for each file: its name, extension, size, and reads part of file to get hash value.
By default size is 4kB from end (-4096).
Use e.g. -s 1024 for 1k from start. Or -s 1 to read full files (slow).
Files are unique when at least one of their properties are:
cleaned file name, extension, file size or hash value.
In d.py after def unique_attrs
It is required for same files to have (number) in name, name before (number) must be the same.
Also whole file name is cleaned from any characters after (number) and any rating prefix rating chars (e.g. added in previous run).
For example (just for names) all here will be treated as same: a, a(1), a(2), `a(2)^, !a(2)abc
By default it checks for unique files across subdirs (use -a not to, i.e. only in each dir).
After that it writes Stats, showing Dirs and Files counts (left and all) and Size reduction.
E.g. Files: 33% 1 / 3 would mean that of all 3 files 1 is left and that's 33%.
Same goes for Size:. It's in Bytes and sepearated with g m k (giga mega kilo, 1000 multipliers).
In d.py after find (n) in name
If used debug or test -t it will end now.
If not, it starts Executing, i.e. deleting duplicated files and renaming those left with rating.
If no execute -x is used it will show same output but won't execute.
Rating can be added as prefix (default, use -p to disable) at begin of filename,
increasing with duplicated files count ` _ ^ - , + ) ( ! !!
In d.py after ratings =
Or as a numeric suffix at end of filename, from _1 to _9 and above (use with -u).
Rating can be offset by a value with -i value.
If new filename already exists, a suffix letter (a to z) will be added to filename.

List format (header):
file size | # or - (unique or deleted) | file name
List format after Executing:
# or - (unique or deleted) | duplicates count | new file name

Testing and Examples

For testing included examples (in test-dirs/)
set debug = True in d.py (and comment out #debug = False) to test how the tool works.
Uncomment (no #) just one of lines with start_dir += for a test.

Those test examples have README.md files inside subdirs, explaining what happens with tool output:
1simple, 2same-name, 3two-dir, 4clean-name

Usage

Start ./d.py -h for help on options.

Use ./d.py -t first to test dir and show Stats (nothing executed). With -l it will also list files.

Using -o will only show final options (Yes/No).

All options also have longer names e.g. --list.
To change any option's default value change last 0 or 1 to opposite at end of line e.g. with opt_bool('-t',

Name		Name	Last commit message	Last commit date
Latest commit History 34 Commits
.vscode		.vscode
test-dirs		test-dirs
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
d.py		d.py
utils.py		utils.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

About

How it works and Features

Testing and Examples

Usage

About

Releases

Packages

Languages

License

cryham/py-dirt

Folders and files

Latest commit

History

Repository files navigation

About

How it works and Features

Testing and Examples

Usage

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages