project.py: Specify the manifest file to use UTF-8 encoding #710

Closed
wants to merge 2 commits into from

Conversation

ylz0923

@ylz0923 ylz0923 commented May 20, 2024

Specify the manifest file to use UTF-8 encoding

@marc-hb
Collaborator

marc-hb commented May 20, 2024

Could you please file a bug with some reproduction steps?

Encoding issues are never as simple as they seem...

@@ -321,7 +321,7 @@ def bootstrap(self, args) -> Path:
         # Parse the manifest to get "self: path:", if it declares one.
         # Otherwise, use the URL. Ignore imports -- all we really
         # want to know is if there's a "self: path:" or not.
-        manifest = Manifest.from_data(temp_manifest.read_text(),
+        manifest = Manifest.from_data(temp_manifest.read_text('utf-8'),
Collaborator

locale.getencoding() is used by default. If it does not already return utf8 on a system then why should west ignore the system's default and hardcode it?

https://docs.python.org/3/library/functions.html#open

Please try this interactively on your system and share the output:

python
import locale
locale.getencoding()

Collaborator

Just noticed 3.8 was different :-( https://docs.python.org/3.8/library/functions.html#open

Please also run this:

python
import locale
locale.getpreferredencoding()
locale.getdefaultlocale()
locale.getencoding()
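
The calls above are not all available on every interpreter: `locale.getencoding()` was only added in Python 3.11. A minimal sketch of a version-tolerant probe follows; the function name `describe_default_encoding` is made up for illustration and is not part of west.

```python
import locale
import sys


def describe_default_encoding():
    """Collect the settings that decide what open() uses when encoding is omitted."""
    info = {
        "python": "%d.%d" % sys.version_info[:2],
        # 1 when UTF-8 mode is on (PYTHONUTF8=1 or -X utf8), else 0.
        "utf8_mode": sys.flags.utf8_mode,
        # What open() and Path.read_text() fall back to without encoding=.
        "preferred": locale.getpreferredencoding(False),
    }
    # locale.getencoding() only exists on Python 3.11 and later.
    if hasattr(locale, "getencoding"):
        info["getencoding"] = locale.getencoding()
    return info


if __name__ == "__main__":
    for key, value in describe_default_encoding().items():
        print(f"{key}: {value}")
```

Pasting the printed lines into a bug report gives the same information as the interactive session, without tripping over a missing function on older Python versions.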

Author

@ylz0923 ylz0923 May 21, 2024

>>> import locale
>>> locale.getencoding()
'cp936'
>>>

>>> import locale
>>> locale.getpreferredencoding()
'cp936'
>>> locale.getdefaultlocale()
<stdin>:1: DeprecationWarning: Use setlocale(), getencoding() and getlocale() instead
('zh_CN', 'cp936')
>>> locale.getencoding()
'cp936'
>>>

How to reproduce:
west init -m URL --mr main --mf west.yml myworkspace
The west.yml is UTF-8 encoded and contains Chinese comments.
Error output:
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Python311\Scripts\west.exe\__main__.py", line 7, in <module>
  File "C:\Python311\Lib\site-packages\west\app\main.py", line 1085, in main
    app.run(argv or sys.argv[1:])
  File "C:\Python311\Lib\site-packages\west\app\main.py", line 244, in run
    self.run_command(argv, early_args)
  File "C:\Python311\Lib\site-packages\west\app\main.py", line 503, in run_command
    self.run_builtin(args, unknown)
  File "C:\Python311\Lib\site-packages\west\app\main.py", line 611, in run_builtin
    self.cmd.run(args, unknown, self.topdir,
  File "C:\Python311\Lib\site-packages\west\commands.py", line 194, in run
    self.do_run(args, unknown)
  File "C:\Python311\Lib\site-packages\west\app\project.py", line 224, in do_run
    topdir = self.bootstrap(args)
             ^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\west\app\project.py", line 313, in bootstrap
    manifest = Manifest.from_data(temp_manifest.read_text(),
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\pathlib.py", line 1059, in read_text
    return f.read()
           ^^^^^^^^
UnicodeDecodeError: 'gbk' codec can't decode byte 0xaf in position 151: illegal multibyte sequence
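
The same failure can be reproduced without west. This is a minimal sketch, assuming a manifest whose only non-ASCII content is a single Chinese character in a comment. The file content and the helper name `read_with` are invented for illustration; 'gbk' is the codec named in the traceback above.

```python
import tempfile
from pathlib import Path

# Hypothetical manifest: UTF-8 on disk, with a Chinese comment.
MANIFEST_TEXT = "manifest:\n  # 中\n  projects: []\n"


def read_with(path, encoding):
    """Return the decoded text, or None if the bytes are not valid in `encoding`."""
    try:
        return path.read_text(encoding=encoding)
    except UnicodeDecodeError:
        return None


with tempfile.TemporaryDirectory() as tmp:
    manifest = Path(tmp) / "west.yml"
    manifest.write_bytes(MANIFEST_TEXT.encode("utf-8"))

    as_utf8 = read_with(manifest, "utf-8")  # decodes cleanly
    as_gbk = read_with(manifest, "gbk")     # what a cp936 locale default amounts to

print("utf-8 ok:", as_utf8 is not None)
print("gbk ok:", as_gbk is not None)
```

The character is three bytes in UTF-8. The gbk codec consumes the first two as one pair, then finds a lead byte followed by a newline, which is not a valid sequence.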

Collaborator

Can you try $Env:PYTHONUTF8 = 1 (in powershell) and then try west again?
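
One way to confirm that the variable takes effect is to ask a child interpreter directly. This is a sketch, assuming Python 3.7 or later (where UTF-8 mode and `sys.flags.utf8_mode` exist); the helper name `utf8_mode_in_child` is made up.

```python
import os
import subprocess
import sys


def utf8_mode_in_child(env_value):
    """Start a fresh interpreter with PYTHONUTF8 set and report its sys.flags.utf8_mode."""
    env = dict(os.environ, PYTHONUTF8=env_value)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.flags.utf8_mode)"],
        env=env, capture_output=True, text=True, check=True,
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    print("PYTHONUTF8=1 ->", utf8_mode_in_child("1"))
    print("PYTHONUTF8=0 ->", utf8_mode_in_child("0"))
```

When the mode is on, `open()` and `Path.read_text()` default to UTF-8 even if the Windows locale is cp936.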

Author

It ran successfully when I set $Env:PYTHONUTF8 = 1 and then ran west init -m URL --mr main --mf west.yml myworkspace

Collaborator

@marc-hb marc-hb May 24, 2024

Thanks for testing! So you don't need this PR, correct?

If you prefer to switch your entire Windows system to UTF8, you can do it like this:
https://stackoverflow.com/questions/57131654/using-utf-8-encoding-chcp-65001-in-command-prompt-windows-powershell-window
Then you won't have to choose $Env:PYTHONUTF8 = 1 (or not) for every application and west project.

There is a lot of other useful information on that page; it's not just about PowerShell.

Also note the Python default will change in Python 3.15: https://peps.python.org/pep-0686/

Author

Other places (manifest.py) that call read_text specify the encoding, and I think it's necessary here as well. 😊

Collaborator

> When other places(manifest.py) call the read_text interface, they specify the encoding method,

Good point. Can you please fetch and test #711 instead?

marc-hb added a commit to marc-hb/west that referenced this pull request May 28, 2024
Fixes issue reported in PR zephyrproject-rtos#710 where most places are hardcoded to
'utf-8' while this one is (Windows) locale-dependent. In the future, we
may want to make this more flexible but the most urgent fix is
consistency: with this commit, manifest decoding should be hardcoded to
'utf-8' everywhere.

Signed-off-by: Marc Herbert <marc.herbert@intel.com>
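
The consistency argument in this commit message can be illustrated with a small sketch. This is not the actual change in #711; `MANIFEST_ENCODING` and `read_manifest_text` are hypothetical names. The point is only that every read goes through one explicit encoding instead of the locale default.

```python
from pathlib import Path

# Hypothetical single source of truth for how manifest files are decoded.
MANIFEST_ENCODING = "utf-8"


def read_manifest_text(path):
    """Read a manifest file as UTF-8, regardless of the system locale."""
    return Path(path).read_text(encoding=MANIFEST_ENCODING)
```

Keeping the encoding in one constant also leaves a single place to change if manifest decoding is made more flexible later.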
Collaborator

@marc-hb marc-hb left a comment

Superseded by larger #711

marc-hb added a commit to marc-hb/west that referenced this pull request Jun 14, 2024
marc-hb added a commit that referenced this pull request Jun 26, 2024
Fixes issue reported in PR #710 where most places are hardcoded to
'utf-8' while this one is (Windows) locale-dependent. In the future, we
may want to make this more flexible but the most urgent fix is
consistency: with this commit, manifest decoding should be hardcoded to
'utf-8' everywhere.

Signed-off-by: Marc Herbert <marc.herbert@intel.com>
@marc-hb
Collaborator

marc-hb commented Jun 26, 2024

I just merged larger #711.

@ylz0923 please close this one if confirmed that #711 works for you.

Thanks for your help and patience!

@marc-hb marc-hb closed this Jul 1, 2024