FIX: Support UTF-8 encoding for JSON files #1357

Merged 7 commits on Jan 1, 2025
Changes from 5 commits
2 changes: 2 additions & 0 deletions doc/whats_new.rst
@@ -23,6 +23,7 @@ The following authors had contributed before. Thank you for sticking around!

 * `Stefan Appelhoff`_
 * `Daniel McCloy`_
+* `Scott Huberty`_

 Detailed list of changes
 ~~~~~~~~~~~~~~~~~~~~~~~~
@@ -47,6 +48,7 @@ Detailed list of changes
 ^^^^^^^^^^^^

 - :func:`mne_bids.read_raw_bids` can optionally return an ``event_id`` dictionary suitable for use with :func:`mne.events_from_annotations`, and if a ``values`` column is present in ``events.tsv`` it will be used as the source of the integer event ID codes, by `Daniel McCloy`_ (:gh:`1349`)
+- :func:`mne_bids.make_dataset_description` now correctly encodes the dataset description as UTF-8 on disk, by `Scott Huberty`_ (:gh:`1357`)

 ⚕️ Code health
 ^^^^^^^^^^^^^^
12 changes: 10 additions & 2 deletions mne_bids/tests/test_write.py
@@ -376,7 +376,7 @@ def test_make_dataset_description(tmp_path, monkeypatch):
     make_dataset_description(
         path=tmp_path,
         name="tst2",
-        authors="MNE B., MNE P.",
+        authors="MNE B., MNE P., MNE Ł.",
         funding="GSOC2019, GSOC2021",
         references_and_links="https://doi.org/10.21105/joss.01896",
         dataset_type="derivative",
@@ -386,7 +386,15 @@ def test_make_dataset_description(tmp_path, monkeypatch):

     with open(op.join(tmp_path, "dataset_description.json"), encoding="utf-8") as fid:
         dataset_description_json = json.load(fid)
-        assert dataset_description_json["Authors"] == ["MNE B.", "MNE P."]
+        assert dataset_description_json["Authors"] == ["MNE B.", "MNE P.", "MNE Ł."]
+
+    # If the text on disk is unicode, json.load will convert it. So let's test that the
+    # text was encoded correctly on disk.
+    with open(op.join(tmp_path, "dataset_description.json"), encoding="utf-8") as fid:
+        # don't use json.load here, as it will convert unicode to str
+        dataset_description_string = fid.read()
+    # Check that U+0141 was correctly encoded as Ł on disk
+    assert "MNE Ł." in dataset_description_string

     # Check we raise warnings and errors where appropriate
     with pytest.raises(
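Note on why the test reads the file a second time as plain text: the JSON parser decodes `\uXXXX` escapes, so the parsed `Authors` list compares equal whether the file holds the literal character or its ASCII escape. Only the raw text shows which form was written. A minimal sketch using only the standard library; the variable names are illustrative and not taken from the PR:

```python
import json

data = {"Authors": ["MNE B.", "MNE P.", "MNE Ł."]}

# Default behaviour: non-ASCII characters are emitted as \uXXXX escapes.
escaped = json.dumps(data, indent=4)
# With ensure_ascii=False the character itself is kept in the output.
literal = json.dumps(data, indent=4, ensure_ascii=False)

# Parsing yields the same Python object either way ...
parsed_escaped = json.loads(escaped)
parsed_literal = json.loads(literal)

# ... so only a check on the raw text can tell the two forms apart.
has_literal_in_escaped = "MNE Ł." in escaped
has_literal_in_literal = "MNE Ł." in literal
```

This is why an assertion on the parsed JSON alone would have passed before the fix as well.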
2 changes: 1 addition & 1 deletion mne_bids/utils.py
@@ -233,7 +233,7 @@ def _write_json(fname, dictionary, overwrite=False):
             f'"{fname}" already exists. Please set overwrite to True.'
         )

-    json_output = json.dumps(dictionary, indent=4)
+    json_output = json.dumps(dictionary, indent=4, ensure_ascii=False)
     with open(fname, "w", encoding="utf-8") as fid:
         fid.write(json_output)
         fid.write("\n")
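Note on the effect of the one-line change: with `ensure_ascii=False`, `json.dumps` keeps `Ł` (U+0141) as a character, and `open(..., encoding="utf-8")` then writes it as the two bytes `0xC5 0x81`. The sketch below illustrates this with the standard library only. `write_json_utf8` is a hypothetical stand-in for the patched helper, not the real `_write_json` (it omits the overwrite check):

```python
import json
import tempfile
from pathlib import Path


def write_json_utf8(fname, dictionary):
    """Hypothetical stand-in for the patched helper (no overwrite handling)."""
    # Keep non-ASCII characters instead of escaping them as \uXXXX.
    json_output = json.dumps(dictionary, indent=4, ensure_ascii=False)
    # An explicit encoding avoids depending on the platform's locale default.
    with open(fname, "w", encoding="utf-8") as fid:
        fid.write(json_output)
        fid.write("\n")


with tempfile.TemporaryDirectory() as tmp:
    fname = Path(tmp) / "dataset_description.json"
    write_json_utf8(fname, {"Authors": ["MNE Ł."]})
    raw_bytes = fname.read_bytes()  # what is actually on disk
    text = fname.read_text(encoding="utf-8")  # decoded view of the same file
```

Inspecting `raw_bytes` rather than the parsed JSON is what distinguishes the new output from the old escaped form.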