Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

False positive when enum is used in iteration #304

Open
eltoder opened this issue Feb 23, 2023 · 7 comments · May be fixed by #312
Open

False positive when enum is used in iteration #304

eltoder opened this issue Feb 23, 2023 · 7 comments · May be fixed by #312

Comments

@eltoder
Copy link
Contributor

eltoder commented Feb 23, 2023

Enum classes allow iteration over the class, which returns all enum members. This is often convenient particularly when writing tests. Unfortunately, vulture does not understand this and reports enum members as unused. For example:

from enum import Enum

class E(Enum):
    A = 1
    B = 2

list(E)

produces these errors:

$ python -m vulture test_enum.py
test_enum.py:4: unused variable 'A' (60% confidence)
test_enum.py:5: unused variable 'B' (60% confidence)

This can be whitelisted, but it's a bit tedious when there are many enum members.

Some ideas on how this can be improved:

  1. It may be possible to detect some cases of iteration like list(E), tuple(E), a, b, c = E, for e in E and mark all members as used.
  2. For a work-around, adding a class-level comment or annotation that all members are used can make whitelisting much easier. This can help in other cases as well, for example this issue about dataclasses: Dataclass variables are tagged as unused variables #264.
  3. Maybe if the enum class is listed in __all__, all members can be considered used?
@eltoder eltoder changed the title False positive when enum is used by iteration False positive when enum is used in iteration Feb 23, 2023
@jendrikseipp
Copy link
Owner

Thanks for the report! All proposed solutions have in common that they need to find out which variables are defined in the scope of the class. Since Vulture currently has no notion of scope, they are tricky to implement. If you see a solution, I'm all ears :-)

@eltoder
Copy link
Contributor Author

eltoder commented Feb 23, 2023

@jendrikseipp That sounds like an even bigger problem :-) Adding scope information should improve the accuracy quite a bit. Python scoping rules are not very complicated.

@jendrikseipp
Copy link
Owner

There are some libraries that enrich the abstract syntax tree (on which Vulture operates) with scoping information, but I haven't looked deeper into them. I think that could be a good starting point.

@anudaweerasinghe
Copy link

Can I work on this?

@jendrikseipp
Copy link
Owner

Sure, happy to see progress here! It might be a good idea to discuss the high level approach though, before you spend time on implementing the details. If you have a high-level plan, I'm happy to discuss it.

@anudaweerasinghe
Copy link

@jendrikseipp looked into this a little more, and wanted to run the current implementation idea by you...

Progress thus far:

  • Identified that Vulture generates an AST of the provided source code in the scan function in core.py
  • Each node in the AST is then visited by the visit function
  • The visit function decides what helper to use based on the type of node
  • For this issue, we specifically care about the visit_Call and visit_ClassDef functions because these are invoked when the list() function is called, and when a class is defined with the Enum subclass.

Based on these findings, here is a high level approach of the solution:

  • Extend the ast with scoping information using the asttokens package.
  • In visit_Call, check if the node.func.id=='list' - are we calling the list function?
  • If so, get the argument(node.args[0].id) and use the scoping information provided by asttokens to check if the argument is a class
  • If so, use the scope info about the class to check if Enum is a superclass.
  • If so, then add all the fields defined in the class to defined_variables

@jendrikseipp
Copy link
Owner

Yes, your explanation makes sense and the approach is correct, I'd say. (In the last step, you probably mean "used_variables".)

The only thing I'm unsure about is the AST library. Have you compared asttokens with https://github.com/kavigupta/ast_scope and https://github.com/pylint-dev/astroid ? We need to pick one that does the job, is well-maintained and doesn't have too many dependencies.

@anudaweerasinghe anudaweerasinghe linked a pull request May 6, 2023 that will close this issue
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants