Merge pull request #630 from hed-standard/develop
Exclude C1 control characters from nonascii
VisLab authored Oct 9, 2024
2 parents faa99e7 + 0ff5d35 commit 6b7b64e
Showing 9 changed files with 86 additions and 60 deletions.
86 changes: 43 additions & 43 deletions docs/source/02_Terminology.md
@@ -142,46 +142,46 @@ is fixed or noted.
Starting with HED standard schema versions 8.3.0 and above, HED will allow UTF-8 characters in various settings.
The types of characters referred to in this specification are:

| Name | Description |
|-----------------| ----------- |
| `alphanumeric` | `letters` and/or `digits` |
| `ampersand` | ASCII code 38 |
| `ascii` | utf-8 codes 0 to 127 (single byte) |
| `asterisk` | ASCII code 42 |
| `at-sign` | ASCII code 64 |
| `backslash` | ASCII code 92 |
| `blank` | ASCII code 32 |
| `caret` | ASCII code 94 |
| `colon` | ASCII code 58 |
| `comma` | ASCII code 44 |
| `dollar` | ASCII code 36 |
| `digits` | 0-9 |
| `double-quote` | ASCII code 34 |
| `equals` | ASCII code 61 |
| `exclamation` | ASCII code 33 |
| `forward-slash` | ASCII code 47 |
| `greater-than` | ASCII code 62 |
| `hyphen` | ASCII code 45 |
| `left-paren` | ASCII code 40 |
| `less-than` | ASCII code 60 |
| `letters` | `lowercase` and/or `uppercase` |
| `lowercase` | ASCII characters a-z |
| `name` | `alphanumeric`, `hyphen`, `period`, `underscore`, `nonascii` |
| `newline` | ASCII code 10 (linefeed) |
| `nonascii` | utf-8 codes greater than 128 (multi-byte) |
| `number-sign` | ASCII code 35 |
| `numeric` | digits, period, hyphen, plus, caret, E, e |
| `percent-sign` | ASCII code 37 |
| `period` | ASCII code 46 |
| `plus` | ASCII code 43 |
| `printable` | ASCII 32 <= code < 127 |
| `question-mark` | ASCII code 63 |
| `right-paren` | ASCII code 41 |
| `semicolon` | ASCII code 59 |
| `single-quote` | ASCII code 39 |
| `tab` | ASCII code 09 |
| `text` | `printable` and/or `nonascii` excluding comma and curly braces.|
| `tilde` | ASCII code 126 |
| `underscore` | ASCII code 95 |
| `uppercase` | ASCII characters A-Z |
| `vertical-bar` | ASCII code 124 |
| Name | Description |
|-----------------|-----------------------------------------------------------------|
| `alphanumeric` | `letters` and/or `digits` |
| `ampersand` | ASCII code 38 |
| `ascii` | utf-8 codes 0 to 127 (single byte) |
| `asterisk` | ASCII code 42 |
| `at-sign` | ASCII code 64 |
| `backslash` | ASCII code 92 |
| `blank` | ASCII code 32 |
| `caret` | ASCII code 94 |
| `colon` | ASCII code 58 |
| `comma` | ASCII code 44 |
| `dollar` | ASCII code 36 |
| `digits` | 0-9 |
| `double-quote` | ASCII code 34 |
| `equals` | ASCII code 61 |
| `exclamation` | ASCII code 33 |
| `forward-slash` | ASCII code 47 |
| `greater-than` | ASCII code 62 |
| `hyphen` | ASCII code 45 |
| `left-paren` | ASCII code 40 |
| `less-than` | ASCII code 60 |
| `letters` | `lowercase` and/or `uppercase` |
| `lowercase` | ASCII characters a-z |
| `name` | `alphanumeric`, `hyphen`, `period`, `underscore`, `nonascii` |
| `newline` | ASCII code 10 (linefeed) |
| `nonascii` | utf-8 codes >= 160 (multi-byte) |
| `number-sign` | ASCII code 35 |
| `numeric` | digits, period, hyphen, plus, caret, E, e |
| `percent-sign` | ASCII code 37 |
| `period` | ASCII code 46 |
| `plus` | ASCII code 43 |
| `printable` | ASCII 32 <= code < 127 |
| `question-mark` | ASCII code 63 |
| `right-paren` | ASCII code 41 |
| `semicolon` | ASCII code 59 |
| `single-quote` | ASCII code 39 |
| `tab` | ASCII code 09 |
| `text` | `printable` and/or `nonascii` excluding comma and curly braces. |
| `tilde` | ASCII code 126 |
| `underscore` | ASCII code 95 |
| `uppercase` | ASCII characters A-Z |
| `vertical-bar` | ASCII code 124 |
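
The revised character classes can be read as simple predicates on single characters. The following is a minimal sketch, assuming that "codes" in the table means Unicode code points; the function names are hypothetical and are not taken from any HED tool.

```python
# Hypothetical helpers that mirror the table rows; not part of the HED tools.

def is_printable(ch: str) -> bool:
    # `printable`: ASCII 32 <= code < 127
    return 32 <= ord(ch) < 127


def is_nonascii(ch: str) -> bool:
    # `nonascii` after this change: codes >= 160, which skips the
    # C1 control range 128-159 (assumption: codes are code points)
    return ord(ch) >= 160


def is_name_char(ch: str) -> bool:
    # `name`: alphanumeric, hyphen, period, underscore, nonascii
    ascii_alnum = ch.isascii() and ch.isalnum()
    return ascii_alnum or ch in "-._" or is_nonascii(ch)


def is_text_char(ch: str) -> bool:
    # `text`: printable and/or nonascii, excluding comma and curly braces
    return (is_printable(ch) or is_nonascii(ch)) and ch not in ",{}"
```

Under this reading, a character such as `\u009e` falls in none of the classes, while `é` (code 233) counts as `nonascii`.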
3 changes: 3 additions & 0 deletions docs/source/Appendix_A.md
@@ -199,6 +199,9 @@ behavior of certain value classes (for example the `numericClass` value class).
- Valid Internationalized Resource Identifier as standardized by [rfc3987](https://datatracker.ietf.org/doc/html/rfc3987).
``````
See [**2.2 Character sets and restrictions**](./02_Terminology.md#22-character-sets_and_restrictions) for
definitions of the various character classes.
````{admonition} Notes on rules for allowed characters in the HED schema.
:class: tip
4 changes: 2 additions & 2 deletions docs/source/Appendix_B.md
@@ -25,15 +25,15 @@ of errors keyed to the HED specification.

A HED string contains an invalid character.

**a.** A non-printable character (ASCII code < 32 or == 127) appears in a HED string.
**a.** An invalid character (character code < 32 or 127 <= character code < 160) appears in a HED string.
**b.** Curly braces appear in a HED string not in a sidecar.


**Notes:**
- Starting with HED 8.3.0, HED supports UTF-8 encoding.
- Different parts of a HED string have different rules for acceptable characters.

See
See also:
[**3.2.4 Tags that take values**](03_HED_formats.md#324-tags-that-take-values) and
[**3.2.5: Tag extensions**](03_HED_formats.md#325-tag-extensions) for
an explanation of the rules for tag values and extensions.
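
The revised rule **a.** can be sketched as a range check on each character. This is an illustrative sketch only: `find_invalid_chars` is a hypothetical name, "character code" is assumed to mean the Unicode code point, and rule **b.** (curly braces) is not covered. The first two sample strings are the failing cases from the test files in this commit.

```python
def find_invalid_chars(hed_string):
    """Return (index, code) pairs for characters rejected by rule a."""
    return [
        (i, ord(ch))
        for i, ch in enumerate(hed_string)
        # invalid: code < 32, or 127 <= code < 160 (DEL and C1 controls)
        if ord(ch) < 32 or 127 <= ord(ch) < 160
    ]


print(find_invalid_chars("Item/Bl\b"))                   # [(7, 8)]
print(find_invalid_chars("Item/ABC\u009e"))              # [(8, 158)]
print(find_invalid_chars("Red, Blue, Description/Red"))  # []
```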
2 changes: 1 addition & 1 deletion docs/source/conf.py
@@ -95,4 +95,4 @@
html_static_path = ['_static']
html_css_files = [
'custom.css',
]
]
15 changes: 14 additions & 1 deletion tests/javascript_tests.json
@@ -17,7 +17,8 @@
"tests": {
"string_tests": {
"fails": [
"Item/Bl\b"
"Item/Bl\b",
"Item/ABC\u009e"
],
"passes": [
"Red, Blue, Description/Red",
@@ -57,6 +58,18 @@
0,
"Item/Bl\b"
]
],
[
[
"onset",
"duration",
"HED"
],
[
4.5,
0,
"Item/{abc}"
]
]
],
"passes": [
7 changes: 6 additions & 1 deletion tests/json_tests/CHARACTER_INVALID.json
@@ -10,7 +10,8 @@
"tests": {
"string_tests": {
"fails": [
"Item/Bl\b"
"Item/Bl\b",
"Item/ABC\u009E"
],
"passes": [
"Red, Blue, Description/Red",
@@ -42,6 +43,10 @@
[
["onset", "duration", "HED"],
[ 4.5, 0, "Item/Bl\b"]
],
[
["onset", "duration", "HED"],
[ 4.5, 0, "Item/{abc}"]
]
],
"passes": [
15 changes: 14 additions & 1 deletion tests/python_tests.json
@@ -17,7 +17,8 @@
"tests": {
"string_tests": {
"fails": [
"Item/Bl\b"
"Item/Bl\b",
"Item/ABC\u009e"
],
"passes": [
"Red, Blue, Description/Red",
@@ -57,6 +58,18 @@
0,
"Item/Bl\b"
]
],
[
[
"onset",
"duration",
"HED"
],
[
4.5,
0,
"Item/{abc}"
]
]
],
"passes": [
6 changes: 3 additions & 3 deletions tests/run_consolidate_tests.py
@@ -20,8 +20,8 @@ def combine_tests(test_names, test_dir, output_path):
def main(exclude_names=[], out_name='temp.json'):
relative_dir = "json_tests" # relative directory to read

script_dir = os.path.dirname(os.path.abspath(__file__)) # directory of this script
target_dir = os.path.join(script_dir, relative_dir) # full path of the
script_dir = os.path.dirname(os.path.abspath(__file__)) # directory of this script
target_dir = os.path.join(script_dir, relative_dir) # full path of the

# Write the indicated files
file_names = [f for f in os.listdir(target_dir) if os.path.isfile(os.path.join(target_dir, f))]
@@ -30,7 +30,7 @@ def main(exclude_names=[], out_name='temp.json'):


if __name__ == '__main__':
exclude_names =['SCHEMA', 'TAG_NAMESPACE', 'VERSION_DEPRECATED']
exclude_names = ['SCHEMA', 'TAG_NAMESPACE', 'VERSION_DEPRECATED']

javascript_name = "javascript_tests.json"
main(exclude_names, javascript_name)
8 changes: 0 additions & 8 deletions tests/test_summarize_testdata.py
@@ -11,7 +11,6 @@ def setUpClass(cls):
cls.test_files = [os.path.join(test_dir, f) for f in os.listdir(test_dir)
if os.path.isfile(os.path.join(test_dir, f))]


@staticmethod
def get_test_info(test_file, details=True):
indent = " "
@@ -55,13 +54,6 @@ def test_summary(self):
print(out_str)
self.assertEqual(True, True) # add assertion here

# def test_summary_full(self):
# for test_file in self.test_files:
# print(test_file)
# out_str = self.get_test_info(test_file, details=True)
# print(out_str + '\n')
#
# self.assertEqual(True, True) # add assertion here


if __name__ == '__main__':