Can't decode ill-formed UTF-8 octet sequence when using --set argument #809
Comments
Hrm. That's coming from the spool method, which spools output from the database CLI. Which RDBMS are you using? I wonder if it's not properly emitting UTF-8 for some reason. |
I'm using PostgreSQL. |
Is your database using UTF-8 encoding? Check client_encoding. |
Sorry, actually Postgres works fine; the problem is with the Oracle database. My Oracle database is configured with the WE8MSWIN1252 charset:
SQL> select *
2 from nls_database_parameters
3 where parameter in ('NLS_RDBMS_VERSION', 'NLS_DATE_LANGUAGE', 'NLS_NCHAR_CHARACTERSET', 'NLS_CHARACTERSET', 'NLS_ISO_CURRENCY', 'NLS_TERRITORY', 'NLS_LANGUAGE')
4 order by 1;
PARAMETER VALUE
-------------------------------------------------------------------------------- ----------------------------------------------------------------
NLS_CHARACTERSET WE8MSWIN1252
NLS_DATE_LANGUAGE AMERICAN
NLS_ISO_CURRENCY AMERICA
NLS_LANGUAGE AMERICAN
NLS_NCHAR_CHARACTERSET AL16UTF16
NLS_RDBMS_VERSION 19.0.0.0.0
NLS_TERRITORY AMERICA
7 rows selected
But when I define the SQL*Plus variable with the same value it works fine:
SQL> define test = "ãçê"
SQL> define
DEFINE _SQLPLUS_RELEASE = "000000000" (CHAR)
DEFINE _EDITOR = "PLSQLDev" (CHAR)
DEFINE _DATE = "17/01/2024" (CHAR)
DEFINE _PRIVILEGE = "" (CHAR)
DEFINE _O_VERSION = "Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 " (CHAR)
DEFINE _O_RELEASE = "000000000" (CHAR)
DEFINE test = "ãçê" (CHAR)
SQL> select '&test' from dual;
'ÃÇÊ'
-----
ãçê |
Sqitch requires everything to be UTF-8, and it sets up the Oracle connection for that: sqitch/lib/App/Sqitch/Engine/oracle.pm Lines 20 to 27 in 7edc536
So be sure that the contents of the variable are UTF-8 and not CP1252. |
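To make the UTF-8 versus CP1252 distinction concrete, here is a minimal sketch in Python. Python is used purely for illustration, since Sqitch itself is Perl, and the helper name `is_valid_utf8` is hypothetical. It shows that the same three characters are different byte sequences in the two encodings, and that the CP1252 bytes, which start with the `E3` seen in the error message, are not valid UTF-8.

```python
# Illustration only: compares the bytes of "ãçê" in UTF-8 and CP1252.
TEXT = "ãçê"

utf8_bytes = TEXT.encode("utf-8")      # two bytes per character here
cp1252_bytes = TEXT.encode("cp1252")   # one byte per character


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as strict UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(utf8_bytes.hex())                # c3a3c3a7c3aa
print(cp1252_bytes.hex())              # e3e7ea
print(is_valid_utf8(utf8_bytes))       # True
print(is_valid_utf8(cp1252_bytes))     # False
```

In UTF-8, `E3` is the lead byte of a three-byte sequence, so a decoder that meets it without valid continuation bytes reports an ill-formed sequence.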
I tried a few things to guarantee the variable is in UTF-8:
root@18d3f3eb022a:/repo# sqitch deploy flipr_test --set var="$(echo "ãçê")"
Deploying changes to flipr_test
+ appschema ................. not ok
Can't decode ill-formed UTF-8 octet sequence <E3> at end of file at /bin/../lib/perl5/App/Sqitch.pm line 480.
Deploy failed
root@18d3f3eb022a:/repo# sqitch deploy flipr_test --set var="$(echo "ãçê" | iconv -t UTF-8)"
Deploying changes to flipr_test
+ appschema ................. not ok
Can't decode ill-formed UTF-8 octet sequence <E3> at end of file at /bin/../lib/perl5/App/Sqitch.pm line 480.
Deploy failed
root@18d3f3eb022a:/repo# echo "ãçê" | file -
/dev/stdin: UTF-8 Unicode text
root@18d3f3eb022a:/repo# echo -n "ãçê" | file -
/dev/stdin: UTF-8 Unicode text, with no line terminators
I also tested deploying to another Oracle database that is configured with NLS_CHARACTERSET AL32UTF8, but I got the same error message. |
I think the problem is not the text read from the command-line, but text returned from SQL*Plus. If it doesn't emit UTF-8 then Sqitch will have trouble reading it. Could SQL*Plus be ignoring Try this patch:
--- a/lib/App/Sqitch.pm
+++ b/lib/App/Sqitch.pm
@@ -472,6 +472,7 @@ sub spool {
}
local $SIG{PIPE} = sub { die 'spooler pipe broke' };
+ binmode $fh, ':raw';
if (ref $fh eq 'ARRAY') {
for my $h (@{ $fh }) {
print $pipe $_ while <$h>;
It should treat the data read from SQL*Plus as raw bytes, rather than try to decode them as UTF-8. Will be interesting to see what it emits. |
The patch works! |
Yes, the patch works because it no longer decodes the output from SQL*Plus as UTF-8, which means SQL*Plus is not emitting UTF-8. Clearly there's some additional configuration we need to make to get it to do so. |
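The difference between a decoding read and a raw read can be sketched as follows, again in Python for illustration only. This is an analogy to the `binmode $fh, ':raw'` change under stated assumptions, not Sqitch's actual code: `fake_output` is a made-up stand-in for the bytes being read, and both function names are hypothetical.

```python
import io

# Hypothetical stand-in for output that ends with a CP1252-encoded "ã" (0xE3).
fake_output = b"+ appschema \xe3"


def read_decoded(data: bytes) -> str:
    """Read through a strict UTF-8 decoding layer; raises on ill-formed input."""
    return io.TextIOWrapper(io.BytesIO(data), encoding="utf-8").read()


def read_raw(data: bytes) -> bytes:
    """Read the bytes untouched, similar in spirit to a ':raw' handle."""
    return io.BytesIO(data).read()


print(read_raw(fake_output))           # bytes pass through unchanged

try:
    read_decoded(fake_output)
except UnicodeDecodeError as err:
    print("decode failed:", err.reason)
```

The raw read succeeds because nothing is interpreted. That hides the symptom but leaves the underlying encoding mismatch in place, which is why the thread keeps looking for a way to make SQL*Plus emit UTF-8.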
Sure seems like setting |
I tried setting |
Yeah, Sqitch sets sqitch/lib/App/Sqitch/Engine/oracle.pm Lines 20 to 27 in 7edc536
|
If I set to |
For some reason SQL*Plus is returning non-UTF-8 bytes. I do not understand why and don't have the resources to investigate. Maybe it's worth asking on Stack Overflow or an Oracle forum? The key thing is to get SQL*Plus to emit only UTF-8 bytes. Though honestly I have no idea what it thinks it's emitting. Your change name is ASCII, so should be fine. When you tried the patch a few weeks ago, you said it worked; what, exactly, did it emit? Very curious to know what it might be choking on. |
I inserted the content of |
Is the deploy script you wrote UTF-8 encoded? |
Yes, all the .sql scripts inside my project are UTF-8 (without BOM) encoded and Windows (CR LF). |
Sqitch is failing whenever I set a variable with a value that contains an accent:
The same happens with the revert and rebase commands.
I tried setting the LC_ALL and LANG environment variables to pt_BR.UTF-8, but the error continues.