#653 Detection of multilanguage choices only works if first choice list is multilanguage #666

lindsay-stevens · 2023-11-14T03:30:21Z

Closes #653

Why is this the best possible solution? Were any other approaches considered?

There are two main problem areas addressed: the translation warning, and the detection of translations for triggering multi-language (itext) output behaviour.

This changes the missing translations check in xls2json to use column headers instead of the first row. To avoid re-iterating the sheet data, the existing dealias function is modified to track and return the headers as well as the sheet data.
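As a rough illustration of the header-based check (not the actual pyxform code), the sketch below assumes sheet rows are dicts that omit empty cells, and that translated columns follow a `column::language` header convention. The names `dealias_with_headers`, `languages_in`, and `TRANSLATED_HEADER` are hypothetical.

```python
import re

# Assumed header convention: "<column>::<language>", e.g. "label::French (fr)".
TRANSLATED_HEADER = re.compile(r"^(label|hint)::(.+)$")


def dealias_with_headers(rows, aliases):
    """Rename aliased keys and collect every header seen, in a single pass."""
    headers = {}  # dict used as an insertion-ordered set
    data = []
    for row in rows:
        clean = {}
        for key, value in row.items():
            key = aliases.get(key, key)
            headers.setdefault(key, None)
            clean[key] = value
        data.append(clean)
    return tuple(headers), data


def languages_in(headers):
    """Map each translatable column to the set of languages it declares."""
    found = {}
    for header in headers:
        match = TRANSLATED_HEADER.match(header)
        if match:
            column, language = match.groups()
            found.setdefault(column, set()).add(language)
    return found


rows = [
    {"type": "text", "name": "q1", "caption": "Question 1"},
    {"type": "text", "name": "q2", "label::French (fr)": "Question 2"},
]
headers, data = dealias_with_headers(rows, aliases={"caption": "label"})
first_row_only = languages_in(data[0])  # misses the French column
all_headers = languages_in(headers)     # finds it
```

Checking only the first row finds no translations here, because the translated column is first populated in the second row; checking the collected headers finds it.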

This changes multi-language detection by doing so only in the Survey object. For the Question sub-types, Survey sets the relevant info with new fields, so that the Questions don't need to re-check. For choices, the detection is per itemset (it could have looked across all itemsets but then untranslated itemsets would've had to be mangled into translations).
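The per-itemset decision could be sketched roughly as follows. This is an assumption-laden illustration rather than the code in this PR: it assumes a translated label is represented as a dict keyed by language and an untranslated label as a plain string, and `translated_itemsets` is a hypothetical helper name.

```python
def translated_itemsets(choices):
    """Return, for each itemset name, whether any of its labels is translated."""
    return {
        list_name: any(isinstance(item.get("label"), dict) for item in items)
        for list_name, items in choices.items()
    }


choices = {
    "yes_no": [
        {"name": "yes", "label": {"English (en)": "Yes", "French (fr)": "Oui"}},
        {"name": "no", "label": {"English (en)": "No", "French (fr)": "Non"}},
    ],
    "colours": [
        {"name": "red", "label": "Red"},
        {"name": "blue", "label": "Blue"},
    ],
}
per_itemset = translated_itemsets(choices)
```

With this shape, only `yes_no` would get itext labels, and `colours` keeps its plain labels without being converted into a translation.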

What are the regression risks?

Since there are some new fields on the Survey and Question classes, external library users that pass in JSON without these may experience errors or inconsistent behaviour. There is a file json_form_schema.json but it does not seem to be consistently updated, and I don't know if it's accurate - there doesn't seem to be a pyxform way to generate it or use it to validate.

Does this change require updates to documentation? If so, please file an issue here and include the link below.

Probably not, it's all internal representation.

Before submitting this PR, please make sure you have:

included test cases for core behavior and edge cases in tests
run nosetests and verified all tests pass
run black pyxform tests to format code
verified that any code or assets from external sources are properly credited in comments

- overall goal is to avoid using only the first few lines of survey or choices sheet data to decide whether to use multi-language behaviour (itext) or not.
- xls2json.py: re-use dealias func for finding headers as well as data
- translation_checks.py: use headers data to find translations
- survey.py:
  - declare _translations key and store translations info in it
  - tidy setup_translations to data prep vs. _translations update steps
  - add translation type key to help differentiate question vs choice
  - annotate survey elements with info about translations, dynamic choices, and media, so that this doesn't need to be re-checked later by individual elements.
  - decide whether to emit itext labels for choices on a per-itemset basis, rather than switching into itext mode for all choices. This avoids needing to mangle untranslated itemsets into translated when other itemsets in the survey are translated (otherwise they would get no itext, not even a hyphen "-").
  - add lru_cache upper limit so that it doesn't grow forever
  - update add_to_nested_dict to allow updating existing dict
- question.py:
  - declare translations, dynamic choices, and media keys set by survey and use them to determine relevant behaviour instead of re-checking
- utils.py: remove multi-language parameter from has_dynamic_label because it's clearer to do the same logic external to the function.
- add tests for multi-language detection + with media
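For the lru_cache point, a minimal sketch of what an upper limit does, using only the standard library. The cached function `normalise` and the limit of 128 are illustrative assumptions, not the function or value used in pyxform.

```python
from functools import lru_cache


# maxsize=None would let the cache grow without bound;
# a finite maxsize evicts the least recently used entries instead.
@lru_cache(maxsize=128)
def normalise(text):
    """Collapse runs of whitespace (a stand-in for any cached helper)."""
    return " ".join(text.split())


# Call with more distinct inputs than the cache can hold.
for i in range(200):
    normalise(f"value   {i}")

info = normalise.cache_info()  # currsize never exceeds maxsize
```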


lognaturel

it could have looked across all itemsets but then untranslated itemsets would've had to be mangled into translations

This is very interesting! We warn about this case and it's probably not terribly useful so either way would be fine.

Here's an interesting form I cooked up: https://docs.google.com/spreadsheets/d/1NW2yfyLCavbJ8reUo-6i1yjq48eIaEJ170ggDEzYB8Y/edit#gid=1033515766

I'm happy with the output and you may find it interesting to look at.

pyxform/survey.py

pyxform/utils.py

pyxform/survey_element.py

lognaturel · 2023-12-01T21:29:30Z

There's a small recommendation and a question about risk inline but I'm going to merge now so we can get it onto staging asap.

lindsay-stevens force-pushed the pyxform-653 branch 3 times, most recently from 391f161 to 3a008d1 on November 20, 2023 14:43

lindsay-stevens added 4 commits November 21, 2023 01:58

dev: fix/elaborate on type annotations

f353d54

chg: use more performant data type for finding element lineage

26026a6

- For insert/pop on the left side: O(1) for deque vs. O(n) for list.
- Avoids creating lots of new lists with [] + [].
- Did not investigate whether the "flat" filter is still necessary.
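A minimal sketch of the idea, assuming elements hold a reference to their parent. The `Node` class and `lineage` function are hypothetical stand-ins, not the pyxform implementation.

```python
from collections import deque


class Node:
    """Hypothetical stand-in for a survey element with a parent link."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def lineage(node):
    """Return element names from the root down to `node`."""
    result = deque()
    current = node
    while current is not None:
        # appendleft is O(1); list.insert(0, ...) or [x] + lst is O(n).
        result.appendleft(current.name)
        current = current.parent
    return list(result)


root = Node("survey")
group = Node("group", parent=root)
question = Node("q1", parent=group)
path = lineage(question)
```

Walking up from the element and prepending each ancestor yields the path in root-first order without building a new list at every step.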

fix: return type change for dealias_and_group_headers for entities sheet

05ad654

- needed after rebase on latest master

lindsay-stevens force-pushed the pyxform-653 branch from 3a008d1 to 05ad654 on November 20, 2023 15:01

lindsay-stevens requested a review from lognaturel November 20, 2023 15:14

lindsay-stevens marked this pull request as ready for review November 24, 2023 07:13

lognaturel approved these changes Dec 1, 2023


lognaturel merged commit ee6bd2d into XLSForm:master Dec 1, 2023
10 checks passed

lindsay-stevens deleted the pyxform-653 branch December 4, 2023 11:46

lindsay-stevens mentioned this pull request Dec 4, 2023

653: tidy up is_label_dynamic to return condition directly #675 (Merged)

lindsay-stevens mentioned this pull request Jan 10, 2024

Translation checks only run on columns used by first data row #637 (Closed)
