-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
get-gtdb refactor #169
get-gtdb refactor #169
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @mikerobeson looks good, thanks! I pulled down and tested and it works like a charm 😉 . I just have some super minor questions/comments on the code inline.
'207': {'Archaea': 'ar53', 'Bacteria': 'bac120'}, | ||
'202': {'Archaea': 'ar122', 'Bacteria': 'bac120'}} | ||
|
||
VERSION_MAP_DICT = {'214.1': {'Archaea': 'ar53', 'Bacteria': 'bac120'}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
theoretical question: should we keep the exact same nomenclature as used by GTDB? Or introduce the decimals for all versions as you have done here? I don't have an opinion on this but just pausing for a moment's thought. Would any users be confused that we support a version 207.0 but this does not exist in GTDB (strictly speaking)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was pondering this myself. The issue is that the http paths all contain a folder with a decimal:
https://data.gtdb.ecogenomic.org/releases/releases/release202/202.0/
https://data.gtdb.ecogenomic.org/releases/releases/release207/207.0/
https://data.gtdb.ecogenomic.org/releases/releases/release214/214.0/
https://data.gtdb.ecogenomic.org/releases/releases/release214/214.1/
At the time I figured I'd just use that folder label and then parse out the base version, e.g. 214 from 214.1 rather than split off the decimal and rebuild it for the http string.
All I'd need to do is check for a decimal within the version
string e.g. 214.1 and then use that for the folder. Then append a .0
to those versions that do not have a a decimal, e.g. 207, etc...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah makes sense... let's use the decimals then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR should be good to go then. :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🙌
Fixes #167 and #168.
This refactor:
To do: