Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a way to ignore the same files as .gitignore #344

Open
leblancfg opened this issue Dec 20, 2023 · 9 comments
Open

Add a way to ignore the same files as .gitignore #344

leblancfg opened this issue Dec 20, 2023 · 9 comments

Comments

@leblancfg
Copy link

leblancfg commented Dec 20, 2023

When using the tool, I need to manually generate an exclusion list parameter that's essentially recreating what my .gitignore is doing.

Otherwise I end up running the tool on my virtualenv, egg-info, tox, etc. and the initial user experience is that the tool is too slow even on tiny codebases. It would be much simpler to use a flag that tells vulture to just exclude the same files as in the .gitignore – a common enough resource in most repos.

One might even go one step further and make that the default – like more modern shell tools like ripgrep and fd, but that's likely a separate discussion.

@jendrikseipp
Copy link
Owner

Yes, that sounds reasonable. I'd be happy to review a pull request that uses .gitignore to exclude files if --exclude is missing from both pyproject.toml and the command line. I suggest to basically copy the feature implementation from black: psf/black#878 with the later patch psf/black#2170

whosayn added a commit to whosayn/vulture that referenced this issue Dec 22, 2023
patterns

This change updated vulture to use the exclude patterns from the
.gitignore file at the root of the input project in the absence of
user-provided inputs as a solution to jendrikseipp#344. Vulture now requires the
pathspec library to run.
whosayn added a commit to whosayn/vulture that referenced this issue Dec 22, 2023
patterns (jendrikseipp#344)

Use .gitignore to exclude files if --exclude is missing from both
pyproject.toml and the command line. Vulture now requires the
pathspec library to run.
jendrikseipp added a commit that referenced this issue Dec 23, 2023
…terns (#344) (#345)

* Read exclude patterns from .gitignore in absence of user-provided
patterns (#344)

Use .gitignore to exclude files if --exclude is missing from both
pyproject.toml and the command line. Vulture now requires the
pathspec library to run.

* Move dependencies to requirements.txt.

---------

Co-authored-by: Jendrik Seipp <jendrikseipp@gmail.com>
@jendrikseipp
Copy link
Owner

Fixed in #345 .

@leblancfg
Copy link
Author

leblancfg commented Dec 24, 2023

Wow, that was quick. Thanks for doing this, @whosayn and @jendrikseipp

jendrikseipp added a commit that referenced this issue Jan 6, 2024
…ided patterns (#344) (#345)"

This reverts commit aa6fcd2.

After further thought, the --exclude flag and .gitignore file
have different responsibilities and mixing them will surprise
some users.
@jendrikseipp
Copy link
Owner

After further thought, I decided to remove this feature again. The --exclude flag and .gitignore file have different responsibilities and mixing them will surprise some users and will make documentation and maintenance difficult. Also, I'd like Vulture to have no external dependencies, so adding one just for parsing .gitignore files is not ideal. Finally, the feature raised some questions in my mind: why do we only parse the top-level .gitignore file? Why do we parse the .gitignore file in the current directory and not in the directory passed to Vulture?

I should have put the brakes on this feature earlier. Sorry about that!

@leblancfg
Copy link
Author

leblancfg commented Jan 9, 2024

@jendrikseipp if you don't mind me pushing a bit and seeing if there's possible middle ground, I think I can see a future where these concerns are handled:

The --exclude flag and .gitignore file have different responsibilities and mixing them will surprise some users and will make documentation and maintenance difficult.

By your previous approval here, I was actually expecting this feature to only be merged for the next major version change, clearly indicating to users that there was a breaking change.

It could always be available behind a CLI flag though, and only be opt-in. If users enter both a --exclude and e.g. --use-gitignore flags, we emit a warning on STDERR, something like

Both --exclude and --gitignore flags set, only using --exclude patterns.

I'd like Vulture to have no external dependencies

I think vulture could easily vendor the pathspec library for this purpose. The actual package code is tiny, only a handful of files. IIUC given the pathspec license (Mozilla Public License), that only means adding attribution to the vulture docs if we don't modify pathspec code. Also, it is a mature package so there is less risk of bugs, etc.

Why do we parse the .gitignore file in the current directory and not in the directory passed to Vulture?

IMO this is the default with the other tools out there, and sanely handled in the way you describe. I had a local PR I was intending to push up before #345 merged that handled that, using the same logic as black's find_project_root function.


Would you mind if I revived my local branch and gave another stab at this feature, given the above? If so, would you agree with the following?

  1. We vendor the pathspec dependency
  2. With a CLI flag called --gitignore, users can opt-in to using it as a source of exclusion patterns
  3. If --exclude and --gitignore are given, only use exclude patterns.
  4. Copy black's find_project_root logic to handle the common parent .gitignore of the relative paths.

@jendrikseipp
Copy link
Owner

Thanks for the detailed proposal! Before we go further here, can you (or others) provide real-world examples of projects where the new feature would be beneficial? I.e., what are example .gitignore files in the wild where all contained gitignore patterns should also be ignored by Vulture?

@jendrikseipp jendrikseipp reopened this Jan 11, 2024
@leblancfg
Copy link
Author

leblancfg commented Jan 11, 2024

I see three main patterns that are encoded in .gitignore:

  1. "standard" locations for virtual environments, often .env/ and/or .venv/
  2. build/, *egg/ and dist/ folders
  3. testing artifacts (coverage, hypothesis, etc.)

I've found that these 5 popular repos have at least 2/3 these folders encoded in the .gitignores:

@jendrikseipp
Copy link
Owner

Thanks! But doesn't this suggest that it'd be better to ignore these directories by default, i.e., fill --exclude with these dirs by default?

However, ignoring these directories assumes that users call Vulture like vulture . in the project dir. Wouldn't it be much easier to have Vulture analyze the relevant files by passing them on the command line as in vulture src tests?

@leblancfg
Copy link
Author

But doesn't this suggest that it'd be better to ignore these directories by default, i.e., fill --exclude with these dirs by default?

I agree with this statement. It's also the default behaviour for many codebase-focused tools, like black, ripgrep, etc. but comes with "change management" maintenance work, like assigning it to the next major version, clear documentation, etc. Two thoughts:

  1. I prefer your previous suggestion to keep the --exclude flag as an override to the .gitignore patterns, just like black does. Users need a way to override default behaviour. Also limits the breaking changes for users that already have the flag set.
  2. Selfishly as a user, I'd rather have the option to use gitignore as a source of exclusion patterns behind an opt-in flag than not at all.

Ultimately your call here @jendrikseipp, and one we can hopefully discuss in an upcoming PR.


Wouldn't it be much easier to have Vulture analyze the relevant files by passing them on the command line as in vulture src tests?

I guess this line of discussion boils down to aesthetics. Gitignores also handle more complex patterns like "exclude setup/ except for setup/main.py" etc. You could suggest users use something like:

vulture $(git ls-files --exclude-standard --others)

Personally, that's not very elegant or user-friendly – I don't think it's fair to assume users know git this well. I find it annoying working with tools (e.g. pylint) that force you to be this verbose to recreate allow/exclude patterns that are already encoded in a file that serves exactly that purpose.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants