Skip to content

Commit

Permalink
12 IPA diacritics, 11 above and 1 below (#742)
Browse files Browse the repository at this point in the history
* UnicodeData.txt lines from proposal

* LineBreak.txt as described in the proposal

* Inherited

* diacritics are diacritics.

* Regenerate UCD

* en-GB-oxendict

* Regenerate UCD

* Regenerate UCD
  • Loading branch information
eggrobin authored Nov 13, 2024
1 parent 8f51864 commit 2fb2e8b
Show file tree
Hide file tree
Showing 19 changed files with 121 additions and 47 deletions.
5 changes: 3 additions & 2 deletions unicodetools/data/ucd/dev/DerivedAge.txt
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
# DerivedAge-17.0.0.txt
# Date: 2024-11-13, 21:07:25 GMT
# Date: 2024-11-13, 21:22:30 GMT
# © 2024 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
Expand Down Expand Up @@ -2071,7 +2071,8 @@ A7DA..A7DC ; 16.0 # [3] LATIN CAPITAL LETTER LAMBDA..LATIN CAPITAL LETTER L
0C5C ; 17.0 # TELUGU ARCHAIC SHRII
0CDC ; 17.0 # KANNADA ARCHAIC SHRII
1ACF..1ADD ; 17.0 # [15] COMBINING DOUBLE CARON..COMBINING DOT-AND-RING BELOW
1AE0..1AEB ; 17.0 # [12] COMBINING LEFT TACK ABOVE..COMBINING DOUBLE RIGHTWARDS ARROW ABOVE

# Total code points: 21
# Total code points: 33

# EOF
17 changes: 11 additions & 6 deletions unicodetools/data/ucd/dev/DerivedCoreProperties.txt
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
# DerivedCoreProperties-17.0.0.txt
# Date: 2024-11-13, 21:07:43 GMT
# Date: 2024-11-13, 21:22:48 GMT
# © 2024 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
Expand Down Expand Up @@ -3196,6 +3196,7 @@ FF41..FF5A ; Cased # L& [26] FULLWIDTH LATIN SMALL LETTER A..FULLWIDTH LATIN
1AB0..1ABD ; Case_Ignorable # Mn [14] COMBINING DOUBLED CIRCUMFLEX ACCENT..COMBINING PARENTHESES BELOW
1ABE ; Case_Ignorable # Me COMBINING PARENTHESES OVERLAY
1ABF..1ADD ; Case_Ignorable # Mn [31] COMBINING LATIN SMALL LETTER W BELOW..COMBINING DOT-AND-RING BELOW
1AE0..1AEB ; Case_Ignorable # Mn [12] COMBINING LEFT TACK ABOVE..COMBINING DOUBLE RIGHTWARDS ARROW ABOVE
1B00..1B03 ; Case_Ignorable # Mn [4] BALINESE SIGN ULU RICEM..BALINESE SIGN SURANG
1B34 ; Case_Ignorable # Mn BALINESE SIGN REREKAN
1B36..1B3A ; Case_Ignorable # Mn [5] BALINESE VOWEL SIGN ULU..BALINESE VOWEL SIGN RA REPA
Expand Down Expand Up @@ -3506,7 +3507,7 @@ E0001 ; Case_Ignorable # Cf LANGUAGE TAG
E0020..E007F ; Case_Ignorable # Cf [96] TAG SPACE..CANCEL TAG
E0100..E01EF ; Case_Ignorable # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256

# Total code points: 2766
# Total code points: 2778

# ================================================

Expand Down Expand Up @@ -7461,6 +7462,7 @@ FFDA..FFDC ; ID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
1AA7 ; ID_Continue # Lm TAI THAM SIGN MAI YAMOK
1AB0..1ABD ; ID_Continue # Mn [14] COMBINING DOUBLED CIRCUMFLEX ACCENT..COMBINING PARENTHESES BELOW
1ABF..1ADD ; ID_Continue # Mn [31] COMBINING LATIN SMALL LETTER W BELOW..COMBINING DOT-AND-RING BELOW
1AE0..1AEB ; ID_Continue # Mn [12] COMBINING LEFT TACK ABOVE..COMBINING DOUBLE RIGHTWARDS ARROW ABOVE
1B00..1B03 ; ID_Continue # Mn [4] BALINESE SIGN ULU RICEM..BALINESE SIGN SURANG
1B04 ; ID_Continue # Mc BALINESE SIGN BISAH
1B05..1B33 ; ID_Continue # Lo [47] BALINESE LETTER AKARA..BALINESE LETTER HA
Expand Down Expand Up @@ -8373,7 +8375,7 @@ FFDA..FFDC ; ID_Continue # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HAN
31350..323AF ; ID_Continue # Lo [4192] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-323AF
E0100..E01EF ; ID_Continue # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256

# Total code points: 144562
# Total code points: 144574

# ================================================

Expand Down Expand Up @@ -9645,6 +9647,7 @@ FFDA..FFDC ; XID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGU
1AA7 ; XID_Continue # Lm TAI THAM SIGN MAI YAMOK
1AB0..1ABD ; XID_Continue # Mn [14] COMBINING DOUBLED CIRCUMFLEX ACCENT..COMBINING PARENTHESES BELOW
1ABF..1ADD ; XID_Continue # Mn [31] COMBINING LATIN SMALL LETTER W BELOW..COMBINING DOT-AND-RING BELOW
1AE0..1AEB ; XID_Continue # Mn [12] COMBINING LEFT TACK ABOVE..COMBINING DOUBLE RIGHTWARDS ARROW ABOVE
1B00..1B03 ; XID_Continue # Mn [4] BALINESE SIGN ULU RICEM..BALINESE SIGN SURANG
1B04 ; XID_Continue # Mc BALINESE SIGN BISAH
1B05..1B33 ; XID_Continue # Lo [47] BALINESE LETTER AKARA..BALINESE LETTER HA
Expand Down Expand Up @@ -10562,7 +10565,7 @@ FFDA..FFDC ; XID_Continue # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HA
31350..323AF ; XID_Continue # Lo [4192] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-323AF
E0100..E01EF ; XID_Continue # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256

# Total code points: 144543
# Total code points: 144555

# ================================================

Expand Down Expand Up @@ -10784,6 +10787,7 @@ E01F0..E0FFF ; Default_Ignorable_Code_Point # Cn [3600] <reserved-E01F0>..<rese
1AB0..1ABD ; Grapheme_Extend # Mn [14] COMBINING DOUBLED CIRCUMFLEX ACCENT..COMBINING PARENTHESES BELOW
1ABE ; Grapheme_Extend # Me COMBINING PARENTHESES OVERLAY
1ABF..1ADD ; Grapheme_Extend # Mn [31] COMBINING LATIN SMALL LETTER W BELOW..COMBINING DOT-AND-RING BELOW
1AE0..1AEB ; Grapheme_Extend # Mn [12] COMBINING LEFT TACK ABOVE..COMBINING DOUBLE RIGHTWARDS ARROW ABOVE
1B00..1B03 ; Grapheme_Extend # Mn [4] BALINESE SIGN ULU RICEM..BALINESE SIGN SURANG
1B34 ; Grapheme_Extend # Mn BALINESE SIGN REREKAN
1B35 ; Grapheme_Extend # Mc BALINESE VOWEL SIGN TEDUNG
Expand Down Expand Up @@ -11034,7 +11038,7 @@ FF9E..FF9F ; Grapheme_Extend # Lm [2] HALFWIDTH KATAKANA VOICED SOUND MARK.
E0020..E007F ; Grapheme_Extend # Cf [96] TAG SPACE..CANCEL TAG
E0100..E01EF ; Grapheme_Extend # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256

# Total code points: 2210
# Total code points: 2222

# ================================================

Expand Down Expand Up @@ -13113,6 +13117,7 @@ ABED ; Grapheme_Link # Mn MEETEI MAYEK APUN IYEK
1AB0..1ABD ; InCB; Extend # Mn [14] COMBINING DOUBLED CIRCUMFLEX ACCENT..COMBINING PARENTHESES BELOW
1ABE ; InCB; Extend # Me COMBINING PARENTHESES OVERLAY
1ABF..1ADD ; InCB; Extend # Mn [31] COMBINING LATIN SMALL LETTER W BELOW..COMBINING DOT-AND-RING BELOW
1AE0..1AEB ; InCB; Extend # Mn [12] COMBINING LEFT TACK ABOVE..COMBINING DOUBLE RIGHTWARDS ARROW ABOVE
1B00..1B03 ; InCB; Extend # Mn [4] BALINESE SIGN ULU RICEM..BALINESE SIGN SURANG
1B34 ; InCB; Extend # Mn BALINESE SIGN REREKAN
1B35 ; InCB; Extend # Mc BALINESE VOWEL SIGN TEDUNG
Expand Down Expand Up @@ -13364,6 +13369,6 @@ FF9E..FF9F ; InCB; Extend # Lm [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HA
E0020..E007F ; InCB; Extend # Cf [96] TAG SPACE..CANCEL TAG
E0100..E01EF ; InCB; Extend # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256

# Total code points: 2209
# Total code points: 2221

# EOF
3 changes: 2 additions & 1 deletion unicodetools/data/ucd/dev/EastAsianWidth.txt
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
# EastAsianWidth-17.0.0.txt
# Date: 2024-11-13, 21:07:48 GMT
# Date: 2024-11-13, 21:22:52 GMT
# © 2024 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
Expand Down Expand Up @@ -808,6 +808,7 @@
1AB0..1ABD ; N # Mn [14] COMBINING DOUBLED CIRCUMFLEX ACCENT..COMBINING PARENTHESES BELOW
1ABE ; N # Me COMBINING PARENTHESES OVERLAY
1ABF..1ADD ; N # Mn [31] COMBINING LATIN SMALL LETTER W BELOW..COMBINING DOT-AND-RING BELOW
1AE0..1AEB ; N # Mn [12] COMBINING LEFT TACK ABOVE..COMBINING DOUBLE RIGHTWARDS ARROW ABOVE
1B00..1B03 ; N # Mn [4] BALINESE SIGN ULU RICEM..BALINESE SIGN SURANG
1B04 ; N # Mc BALINESE SIGN BISAH
1B05..1B33 ; N # Lo [47] BALINESE LETTER AKARA..BALINESE LETTER HA
Expand Down
4 changes: 3 additions & 1 deletion unicodetools/data/ucd/dev/LineBreak.txt
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
# LineBreak-17.0.0.txt
# Date: 2024-11-13, 21:07:49 GMT
# Date: 2024-11-13, 21:22:53 GMT
# © 2024 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
Expand Down Expand Up @@ -778,6 +778,8 @@
1AB0..1ABD ; CM # Mn [14] COMBINING DOUBLED CIRCUMFLEX ACCENT..COMBINING PARENTHESES BELOW
1ABE ; CM # Me COMBINING PARENTHESES OVERLAY
1ABF..1ADD ; CM # Mn [31] COMBINING LATIN SMALL LETTER W BELOW..COMBINING DOT-AND-RING BELOW
1AE0..1AEA ; CM # Mn [11] COMBINING LEFT TACK ABOVE..COMBINING UPWARDS ARROW ABOVE
1AEB ; GL # Mn COMBINING DOUBLE RIGHTWARDS ARROW ABOVE
1B00..1B03 ; CM # Mn [4] BALINESE SIGN ULU RICEM..BALINESE SIGN SURANG
1B04 ; CM # Mc BALINESE SIGN BISAH
1B05..1B33 ; AK # Lo [47] BALINESE LETTER AKARA..BALINESE LETTER HA
Expand Down
Loading

0 comments on commit 2fb2e8b

Please sign in to comment.