-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Tidy First] Use isort
to order/format/de-dupe python imports
#10085
Conversation
The tool `isort` reorders imports to be in alphabetical order. I've added it because our imports in most files are in random order. The lack of order meant that: - sometimes the same module would be imported from twice - figuring out if a a module was already being imported from took longer In the next commit I'll actually run isort to order everything. The best part is that when developing, we don't have to put them in correct order. Though you can if you want. However, `isort` will take care of re-ordering things at commit time :)
Specifically we set two config values: `extend_skip_glob` and `known_first_party`. The `extend_skip_glob` extends the default skipped paths. The defaults can be seen here https://pycqa.github.io/isort/docs/configuration/options.html#skip. We are skipping third party stubs because these are more so provided (I believe). We are skipping `.github` and `scripts` as they feel out of scope and things we can be less strict with. The `known_first_party` setting makes it so that these imports get grouped separately from all other imports, which is useful visually of "this comes from us" vs "this comes from someone/somewhere else".
I was seeing some odd behavior where running pre-commit, adding the modified files, and then running pre-commit again would result making more modifications to some of the same files. This felt odd. You shouldn't have to run pre-commit more multiple times for it to eventually come to a final "solution". I believe the problem was because we are using the tool `black` to format things, but weren't registering the black profile with `isort` this lead to some conflicting formatting rules, and the two tools had to negotiate a few times before being both satisfied. Registering the profile `black` with `isort` resolved this problem.
This was done by running `pre-commit run --all`. I ran it separately from the commit process itself because I wanted to run it against all files instead of only changed files. Of note, this not only reordered and formatted our imports. But we also had 60 distinct duplicate module import paths across 50 files, which this took care of. When I say "distinct duplicate module import paths" I mean when `from x.y.z import` was imported more than once in a single file.
I've added the |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #10085 +/- ##
==========================================
- Coverage 88.20% 88.20% -0.01%
==========================================
Files 181 181
Lines 22722 22748 +26
==========================================
+ Hits 20042 20064 +22
- Misses 2680 2684 +4
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
I think it's debatable if a changie doc is needed for this. We've added a dependency, but it's only a dev dependency. We've made changes in the source code that we distribute, but it's just the imports that are touched. Seems almost less than an |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will pre-commit
acknowledge these settings? I thought they would need to be in pre-commit-config.yaml
as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is what I originally thought as well. I put the settings in the pre-commit-config.yaml
and they were ignored. The isort configuration doc page lists many ways to set configs (notably not pre-commit). I did some testing with the .isort.cfg
and it immediately worked respecting the options.
resolves N/A
Problem
Our imports haven't been ordered ever, probably because doing so manually can be annoying (especially when things are already a mess). However having them out of order actually slows us (at least me) down because when we want to import something its harder to visually figure out if the necessary module is already being imported from or not.
This disorganization of imports caused instances where we were importing from the same module multiple times in the same file by accident. I wrote a short script to figure out how many distinct duplicate module import paths we had (this is when an
from x.y.z import
occurs more than once in a single file). The script found 60 instances across 50 files 🙃 One example of this for a visual representation is thatfrom dbt.artifacts.resources import
appeared twice in unparsed.py (Exhibit A, Exhibit B). In another file there was a module path that was imported from 3 separate times...Solution
We've been using isort in other repositories such as metricflow and dbt-semantic-intefaces for this exact problem. So I've gone and added
isort
here as well, and it'll be invoked in pre-commit hooks.Checklist