Fix support for non-ASCII characters in Oracle CLOBs #1184
No real changes, just make it possible to use these helpers in the unit test to be added. This commit is best viewed using Git's --color-moved option.
This allows using it to create functions as well as procedures (and also procedures with a name other than "soci_test", if this is ever needed).
We need to read the entire contents of the CLOB in the Oracle backend, and not just the number of bytes corresponding to its length in characters as returned by OCILobGetLength(), because that number may (and will) be strictly less than the full size in bytes for any encoding using multiple bytes per character, such as the de facto standard UTF-8. Also make reading CLOBs more efficient by doing what the Oracle documentation suggests and using the LOB chunk size for reading. Finally, add a unit test checking that using non-ASCII strings in UTF-8 (which had to be enabled for the CI) with CLOBs does work. This commit is best viewed ignoring whitespace-only changes.
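To illustrate the mismatch described above, here is a minimal standalone sketch (not code from this PR) that compares the number of characters in a UTF-8 string with its size in bytes. The helper name `utf8_char_count` and the sample string are assumptions made for the example; no OCI calls are involved.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: count Unicode code points in a UTF-8 string by
// counting every byte that is not a continuation byte (10xxxxxx).
std::size_t utf8_char_count(std::string const& s)
{
    std::size_t count = 0;
    for (unsigned char c : s)
    {
        if ((c & 0xC0) != 0x80)
            ++count;
    }
    return count;
}

// Simulate the bug: treat the length in characters as a length in bytes
// and keep only that many bytes of the value.
std::string read_using_char_length(std::string const& s)
{
    return s.substr(0, utf8_char_count(s));
}
```

For the string "héllo", where "é" takes two bytes in UTF-8, the character count is 5 but the byte size is 6, so a read limited to 5 bytes loses the final "o". With more non-ASCII characters the loss grows accordingly.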
There might be an issue with the Fixed-Width Client-Side Character Set. In this case, Oracle states that the output value of amtp is in characters. However, I haven't been able to create a test case for it.
In "streaming mode", the offset is ignored except for the first call, so don't bother updating it. Also restructure the code in a slightly simpler way.
Read directly into the provided string instead of reading into a temporary buffer and then copying into the string.
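The idea of reading straight into the destination string can be sketched as follows. This is only an illustration under stated assumptions, not the PR's implementation: `chunk_reader` is a hypothetical stand-in for a streaming LOB read, `make_memory_reader` is an in-memory source used to exercise the loop without a database, and the "short read means end of data" rule is a simplification of how a real streaming API signals completion.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <memory>
#include <string>

// Hypothetical streaming source: writes up to `capacity` bytes into `dst`
// and returns the number of bytes written (less than `capacity` at the end).
using chunk_reader = std::function<std::size_t(char* dst, std::size_t capacity)>;

// Grow the string by one chunk at a time and let the reader write directly
// into the string's own storage, so no temporary buffer or copy is needed.
void read_all(chunk_reader const& read, std::size_t chunk_size, std::string& out)
{
    assert(chunk_size > 0);
    out.clear();
    for (;;)
    {
        std::size_t const old_size = out.size();
        out.resize(old_size + chunk_size);
        std::size_t const got = read(&out[old_size], chunk_size);
        out.resize(old_size + got); // drop the unused tail of this chunk
        if (got < chunk_size)
            break; // short read: the source is exhausted
    }
}

// In-memory reader used only for demonstration purposes.
chunk_reader make_memory_reader(std::string const& data)
{
    auto pos = std::make_shared<std::size_t>(0);
    return [data, pos](char* dst, std::size_t capacity) -> std::size_t {
        std::size_t const n = std::min(capacity, data.size() - *pos);
        std::memcpy(dst, data.data() + *pos, n);
        *pos += n;
        return n;
    };
}
```

Because the loop works in bytes, a multi-byte character split across two chunks is reassembled correctly once both chunks have been appended.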
Oracle documentation says
so I don't think it's ever in characters (except on input; using different units for an in/out parameter deserves some kind of a prize for the worst API design ever).
Hmm, yes. I wonder if they really include UTF-16 and UTF-32 in "fixed width encodings"; I have a feeling that they might be speaking about pre-Unicode fixed-width encodings only (e.g. CP1252 etc.). In any case, I can't test this either: setting
Currently, only AL16UTF16 is relevant. But SOCI is not ready to use it.
I get an error (without any details) from
The username and password must be in Unicode in this case, and SQL texts must also be in Unicode. I worked on it today but didn't achieve any results.
Replaces #1183.