Extension for secure UTF-8 identifiers, for everybody who sees identifiable names in UTF-8 encodings, such as a filename in a terminal or UI widget. Identifiers need to be identifiable, i.e. the extension implements mixed-script detection and related checks for the Unicode TR39 Moderately Restrictive restriction level. Identifiers are also validated and normalized by default, so that they can be compared and found.
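The validation step mentioned in the commit message can be sketched as follows. This is only an illustrative sketch, not code from the commit: the function name `sketch_utf8_is_valid` is hypothetical, and it covers well-formedness only (no overlong encodings, no surrogates, nothing above U+10FFFF). Normalization and the identifier-character check need Unicode data tables and are not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the commit: returns true if the
   NUL-terminated string s is well-formed UTF-8. */
static bool sketch_utf8_is_valid(const char *s)
{
    /* smallest code point that each sequence length may encode */
    static const uint32_t min_cp[] = { 0, 0x80, 0x800, 0x10000 };
    const unsigned char *p = (const unsigned char *)s;

    assert(s != NULL);
    while (*p) {
        uint32_t cp;
        int extra; /* number of continuation bytes expected */

        if (*p < 0x80) {
            cp = *p;
            extra = 0;
        } else if ((*p & 0xE0) == 0xC0) {
            cp = *p & 0x1F;
            extra = 1;
        } else if ((*p & 0xF0) == 0xE0) {
            cp = *p & 0x0F;
            extra = 2;
        } else if ((*p & 0xF8) == 0xF0) {
            cp = *p & 0x07;
            extra = 3;
        } else {
            return false; /* stray continuation byte or invalid lead byte */
        }
        p++;
        for (int i = 0; i < extra; i++, p++) {
            if ((*p & 0xC0) != 0x80)
                return false; /* also catches a premature NUL */
            cp = (cp << 6) | (uint32_t)(*p & 0x3F);
        }
        if (cp < min_cp[extra])
            return false; /* overlong encoding */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return false; /* UTF-16 surrogate */
        if (cp > 0x10FFFF)
            return false; /* beyond the Unicode range */
    }
    return true;
}
```

A caller would reject an identifier outright when this returns false, before any script or confusable checks are attempted.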
Showing 7 changed files with 68 additions and 3 deletions.
@@ -0,0 +1,46 @@
/* POSIX std extension for people who use UTF-8 identifiers but
   need security. See http://unicode.org/reports/tr39/
   For example, a kernel filesystem or a user database that presents
   identifiers such as names, paths or files in a UTF-8 terminal wants them
   to be identifiable, i.e. normalized and with identifiable characters only.
   Most don't display names as Punycode.
   Implement the Moderately Restrictive restriction level as default:
   * All characters in the string are in the ASCII range, or
   * The string is single-script, according to the definition in TR39 Section 5.1, or
   * The string is covered by any of the following sets of scripts, according to
     the definition in TR39 Section 5.1:
        Latin + Han + Hiragana + Katakana; or equivalently: Latn + Jpan
        Latin + Han + Bopomofo; or equivalently: Latn + Hanb
        Latin + Han + Hangul; or equivalently: Latn + Kore, or
   * The string is covered by Latin and any one other Recommended script,
     except Cyrillic and Greek.
   * The string must be validated UTF-8 and normalized, and only consist of valid identifier
     characters.
   Reject violations, optionally warn about confusables.
   SPDX-License-Identifier: MIT */

#ifndef __CTL_U8IDENT_H__
#define __CTL_U8IDENT_H__

#ifdef T
#error "Template type T defined for <ctl/u8ident.h>"
#endif

#define HOLD
#define u8id_char8_t u8id
#define vec u8id
#define A u8id
#include <ctl/u8string.h>

// TODO Take my code from cperl, which has had stable Unicode security for some years.
// I'm also just adding this to my safeclib.
// The only other existing example of proper Unicode security is Java.

#undef A
#undef I
#undef T
#undef POD
#undef HOLD

#endif
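The script rules listed in the header comment can be sketched as a check over a set of scripts. This is a minimal sketch under stated assumptions, not the commit's implementation (the header is still a TODO): the `SC_*` flags, the `script_set` type and `sketch_moderately_restrictive` are hypothetical names, the caller is assumed to have already collected the scripts of all code points into a bitmask with Common and Inherited left out, and every flag is assumed to stand for a Recommended script. The final bullet (valid, normalized UTF-8 and identifier characters only) is outside the scope of this function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical script flags. A real implementation would derive them per
   code point from the Unicode script data; only a few scripts are listed. */
typedef unsigned int script_set;
enum {
    SC_LATIN    = 1 << 0,
    SC_HAN      = 1 << 1,
    SC_HIRAGANA = 1 << 2,
    SC_KATAKANA = 1 << 3,
    SC_BOPOMOFO = 1 << 4,
    SC_HANGUL   = 1 << 5,
    SC_CYRILLIC = 1 << 6,
    SC_GREEK    = 1 << 7,
    SC_HEBREW   = 1 << 8,
    SC_THAI     = 1 << 9
};

/* True if every script in `used` is also in `allowed`. */
static bool covered_by(script_set used, script_set allowed)
{
    return (used & ~allowed) == 0;
}

/* True if exactly one bit is set. */
static bool has_single_bit(script_set s)
{
    return s != 0 && (s & (s - 1)) == 0;
}

/* Assumed input: the union of the scripts of all code points in the
   identifier, with Common and Inherited already left out. */
static bool sketch_moderately_restrictive(script_set used)
{
    assert((used >> 10) == 0); /* only the flags declared above */

    /* ASCII-only strings yield an empty set or Latin alone;
       single-script strings yield exactly one flag. */
    if (used == 0 || has_single_bit(used))
        return true;

    /* The three CJK combinations: Latn + Jpan, Latn + Hanb, Latn + Kore. */
    if (covered_by(used, SC_LATIN | SC_HAN | SC_HIRAGANA | SC_KATAKANA))
        return true;
    if (covered_by(used, SC_LATIN | SC_HAN | SC_BOPOMOFO))
        return true;
    if (covered_by(used, SC_LATIN | SC_HAN | SC_HANGUL))
        return true;

    /* Latin plus exactly one other script, except Cyrillic and Greek. */
    if (used & SC_LATIN) {
        script_set other = used & ~(script_set)SC_LATIN;
        if (has_single_bit(other) && !(other & (SC_CYRILLIC | SC_GREEK)))
            return true;
    }
    return false;
}
```

With these assumptions, `SC_LATIN | SC_HEBREW` is accepted while `SC_LATIN | SC_CYRILLIC` is rejected, which is the case the level is designed to catch, since Latin and Cyrillic share many look-alike letters.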