SegmenterDefault.txt: more remapping, less renaming #970

Merged: 6 commits, Nov 25, 2024

@eggrobin eggrobin commented Nov 19, 2024

Follow-up on #949, using remap rules wherever the UAXes do, and dropping now-useless names such as ZWJ_O, CM1, etc.

This technically removes all of the examples cited in UTC-155-A89 (as worded in SD2 « Document extra classes used for testing characters in the segmentation test HTML files for 11.0. [E.g. ZWJ_FE, CM1_CM, etc.] (Retargeted for 13.0, 14.0, 15.0.) », not as recorded in the minutes), but that action item should remain open until all nontrivial variable definitions are shown in the generated HTML files.

The MeowBreakTest files are hard to diff, but I have tried them in ICU: they are still correct.

@eggrobin eggrobin merged commit d1bdcb8 into unicode-org:main Nov 25, 2024
15 of 16 checks passed
eggrobin added a commit that referenced this pull request Nov 28, 2024
This fulfils the following action item:
UTC-155-A89 Robin Leroy, PAG Document extra classes used for testing characters in the segmentation test HTML files for 11.0. [E.g. ZWJ_FE, CM1_CM, etc.] (Retargeted for 13.0, 14.0, 15.0.)

It also fixes #354.

It also changes the pair table in LineBreakTest.html to show the three-way direct/indirect/prohibited break distinction (across spaces), like the old pair table in UAX14 (see https://www.unicode.org/notes/tn54/alba-2.html?v=9.0.0).
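
For readers unfamiliar with the three-way distinction, here is a minimal Python sketch of how a pair-table cell could be classified. It is not the code used to generate LineBreakTest.html; the names `PairBreak` and `classify_pair` are hypothetical, and the two boolean inputs are assumed to come from running the line breaking rules on the pair once with the characters adjacent and once with spaces between them.

```python
from enum import Enum


class PairBreak(Enum):
    """Hypothetical labels for one cell of a line-break pair table."""
    DIRECT = "direct"          # break allowed even when the characters are adjacent
    INDIRECT = "indirect"      # break allowed only when spaces intervene
    PROHIBITED = "prohibited"  # no break, even across spaces


def classify_pair(breaks_when_adjacent: bool, breaks_across_spaces: bool) -> PairBreak:
    """Classify a (before, after) pair from two observations.

    breaks_when_adjacent: a break is allowed between the characters
        when nothing separates them.
    breaks_across_spaces: a break is allowed after the run of spaces
        when spaces are placed between the characters.
    """
    if breaks_when_adjacent:
        return PairBreak.DIRECT
    if breaks_across_spaces:
        return PairBreak.INDIRECT
    return PairBreak.PROHIBITED
```

For example, `classify_pair(False, True)` yields `PairBreak.INDIRECT`: the pair does not break when adjacent, but does once spaces separate it.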

As in #970, the test files are not diffable, but I tested them with ICU.

I tried improving the stability of the sample generation a little bit, but it is not as fancy as what Mark suggests in https://www.unicode.org/notes/tn54/alba-2.html?v=9.0.0#p478. I might do that in another PR.