pybabel extract command from CLI only respects the first argument passed into keywords #1067

Open
ankitd33 opened this issue Mar 14, 2024 · 3 comments · May be fixed by #1157

@ankitd33

Overview Description

When running pybabel extract

pybabel extract -F CONFIG_FILEPATH -o POT_FILEPATH REPO_T_CHECK --keywords=translate:1 --keywords=translate:1,2 -c TRANSLATORS --no-wrap --no-default-keywords

it extracts only the first argument of each translate call; calls with two arguments are not treated as singular/plural pairs.

Both of the commands below work perfectly on their own. Ideally, when I run the combined command above, I want a superset of the two, with the second keyword's result overriding the first if the same msgid shows up in both.

pybabel extract -F CONFIG_FILEPATH -o POT_FILEPATH REPO_T_CHECK --keywords=translate:1,2 -c TRANSLATORS --no-wrap --no-default-keywords

pybabel extract -F CONFIG_FILEPATH -o POT_FILEPATH REPO_T_CHECK --keywords=translate:1 -c TRANSLATORS --no-wrap --no-default-keywords

Steps to Reproduce

Run pybabel extract with two keywords for the same function: one to extract normal strings and one to extract strings with plurals.

Actual Results

Essentially

pybabel extract -F CONFIG_FILEPATH -o POT_FILEPATH REPO_T_CHECK --keywords=translate:1 --keywords=translate:1,2 -c TRANSLATORS --no-wrap --no-default-keywords

does the same as running

pybabel extract -F CONFIG_FILEPATH -o POT_FILEPATH REPO_T_CHECK --keywords=translate:1 -c TRANSLATORS --no-wrap --no-default-keywords

Expected Results

Reproducibility

always

Additional Information

@EmilyBStudent

I've been looking into this issue. It appears that it only occurs when multiple keywords have the same function name and the functions aren't differentiated by using a 't' argument. For instance, say that your input data is:

msg1 = translate("bunny", "bunnies", len(bunnies))
msg2 = translate('follow')

You will get the desired results if you run pybabel extract with
--keywords=translate:1,1t --keywords=translate:1,2,3t
instead of
--keywords=translate:1 --keywords=translate:1,2

The keywords data structure isn't currently set up to allow multiple keywords with the same function name unless they are differentiated with a 't' argument. It could probably be extended to allow for this. Or would it be better to detect duplicate keywords like this and give an error/warning prompting the user to add 't' arguments?
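
To illustrate the warning option, here is a minimal sketch of how duplicate keywords could be detected. It is a simplified model, not Babel's actual code: `parse_spec` and `find_ambiguous` are hypothetical helpers, and context markers such as `1c` are not handled. It assumes keywords are stored per function name and 't' argument count, which is why two specs without a 't' argument compete for the same slot.

```python
from collections import defaultdict


def parse_spec(keyword):
    """Split 'name:1,2,3t' into (name, spec, arity). Hypothetical helper."""
    name, _, args = keyword.partition(":")
    spec, arity = [], None
    for part in filter(None, args.split(",")):
        if part.endswith("t"):
            # A trailing 't' gives the total number of call arguments.
            arity = int(part[:-1])
        else:
            spec.append(int(part))
    return name, tuple(spec), arity


def find_ambiguous(keywords):
    """Return the (name, arity) slots that more than one spec maps to."""
    seen = defaultdict(list)
    for keyword in keywords:
        name, spec, arity = parse_spec(keyword)
        seen[(name, arity)].append(spec)
    return {key: specs for key, specs in seen.items() if len(specs) > 1}


# Both specs share the slot ('translate', None), so they are ambiguous.
print(find_ambiguous(["translate:1", "translate:1,2"]))
# With 't' arguments each spec gets its own slot and nothing is reported.
print(find_ambiguous(["translate:1,1t", "translate:1,2,3t"]))
```

A check like this could back either a warning or an error; which of the colliding specs should take effect is a separate decision.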

@tomasr8
Member

tomasr8 commented Nov 17, 2024

FWIW xgettext allows it without raising any warnings:

xgettext -o - --keyword=translate:1 --keyword=translate:1,2 test.py

Based on that, I think we should support it as well

EmilyBStudent added a commit to EmilyBStudent/babel that referenced this issue Nov 28, 2024
Extend keywords dict to support multiple keywords with same name/arity

Fixes python-babel#1067
@EmilyBStudent

EmilyBStudent commented Dec 3, 2024

I've been working on this issue and have it working, while maintaining backwards compatibility with the previous keywords dictionary format. To allow for keywords with multiple specs that aren't distinguished with a 't' argument, the keyword dictionary needs to be extended to allow for a collection of specs as well as just a single spec per number of arguments, e.g.

keywords = {
    '_': ((1,), (1, 2))
}

For backwards compatibility, I have the code only generate a collection containing multiple specs if there are multiple specs it needs to store. Otherwise it generates a keyword dict in the same format as previously, with the spec stored directly as the dictionary value (and all the existing unit tests pass without changes on my machine so this appears to be working).

Currently I'm using tuples to contain the collection of relevant specs, but since specs are also represented as tuples, it is awkward to distinguish a single spec tuple from a tuple that contains (or may contain) multiple specs. Would it be preferable to use a list instead?
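
As a rough sketch of the trade-off, assuming a list is used as the container: a single type check is then enough to tell the two formats apart. The `normalize_specs` helper is hypothetical and is not part of Babel or of the pull request.

```python
def normalize_specs(value):
    """Return a list of spec tuples from either storage format.

    Assumes a list always means "several specs" and anything else
    is one spec stored directly, as in the existing format.
    """
    if isinstance(value, list):
        return list(value)
    return [value]


# With a tuple container, these two values have the same outer shape
# (a tuple whose first element is a tuple), so telling them apart
# means inspecting the elements:
single_spec_with_context = ((1, "c"), 2)  # one spec, e.g. pgettext
two_specs_in_a_tuple = ((1,), (1, 2))     # two specs

# With a list container the distinction is explicit:
keywords = {
    "pgettext": ((1, "c"), 2),
    "_": [(1,), (1, 2)],
}
for name, value in keywords.items():
    print(name, normalize_specs(value))
```

The cost is that callers who build keyword dicts by hand would need to know that a list and a tuple mean different things here.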

Sorry to ask this after opening the pull request!
