UTF-16 support for ODBC (for MSSQL) #1041
Thanks for working on this, but I don't know how I feel about introducing a separate build variant for Unicode support. IMO it would be much better to just use UTF-8 in all builds, instead of requiring a special build mode to handle Unicode.
And it would be really nice to have some description of this option in the docs, if only to explain what enabling it changes.
Finally, even if this is relatively trivial, it looks like using SQLTCHAR could avoid some preprocessor checks in the code.
Co-authored-by: VZ <vz-github@zeitlins.org>
…d/soci into odbc_unicode_support # Conflicts: # docs/installation.md
Maybe connection_parameters could be (ab)used to decide at runtime which char / string type is stored in the database. I think it might also be possible to call SQLDescribeCol and determine at runtime whether to convert back and forth, but the performance penalty would be unacceptable. Alternatively, an additional type could be introduced:
The user-facing interface would then still only support UTF-8 (note: no std::wstring).
I see, thanks.
We do need to add Ideally, this should be automatic, i.e. when exchanging data with The main point is that I'd really, really love to avoid different incompatible builds. The conditional compilation directives in the tests are a good example of what we do not want code using SOCI to look like. There are other problems too, e.g. we'd need to add wide char builds to the CI if we did it like this, and I'd rather avoid that.
Okay, I think we're on the same page. Should there be an implicit conversion to std::string, or should std::wstring be supported for SQLWCHAR-based columns?
Also note that wstring_convert has been deprecated and scheduled for removal from the standard (without replacement). It's a shame, but either way everyone here should be aware of this 👀
That could temporarily (until there's a replacement) be solved with ICU, libiconv or the like. But I wasn't sure whether that's acceptable at this point.
I think it would make sense to support
I'd rather not pull in ICU just for this and libiconv is Unix-only. If necessary, I can contribute my own code, written many years ago, converting between UTF-8 and
That sounds good! Okay, I'll add std::wstring support in a separate PR first. Afterwards we can add the conversion to std::string.
We might want to move the discussion here:
This pull request adds support for UTF-16 encoding in the ODBC module. The module previously only supported UTF-8 encoding, which made it difficult to exchange data with non-UTF-8 MSSQL databases properly.
With this new feature, the ODBC module can now properly exchange data with MSSQL databases that use UTF-16 encoding. This is especially important for users who work with databases that use different encodings or require internationalization support.
The implementation uses std::wstring_convert to convert between UTF-8 and UTF-16 encodings. The toUtf16() and toUtf8() functions have been added to handle the conversion between the two encodings.
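Since std::wstring_convert is deprecated, it may help to see what a dependency-free version of the two helpers could look like. The following is only a minimal sketch, not the code in this PR: it reuses the names toUtf16() and toUtf8() from the description above, but the signatures (std::string in, std::u16string out, and vice versa) and the choice to throw std::invalid_argument on malformed input are assumptions made for illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for the PR's toUtf16()/toUtf8(); signatures are assumed.
std::u16string toUtf16(const std::string& utf8)
{
    // Smallest code point that legitimately needs a sequence of the given length.
    static const char32_t min_cp[] = {0, 0, 0x80, 0x800, 0x10000};
    std::u16string out;
    std::size_t i = 0;
    while (i < utf8.size())
    {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        char32_t cp;
        std::size_t len;
        if (lead < 0x80)                { cp = lead;        len = 1; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; len = 2; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; len = 3; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; len = 4; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (i + len > utf8.size())
            throw std::invalid_argument("truncated UTF-8 sequence");
        for (std::size_t k = 1; k < len; ++k)
        {
            const unsigned char c = static_cast<unsigned char>(utf8[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        i += len;

        // Reject overlong encodings, encoded surrogates and out-of-range values.
        if (cp < min_cp[len] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("invalid code point in UTF-8 input");

        if (cp < 0x10000)
        {
            out.push_back(static_cast<char16_t>(cp));
        }
        else
        {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}

std::string toUtf8(const std::u16string& utf16)
{
    std::string out;
    for (std::size_t i = 0; i < utf16.size(); ++i)
    {
        char32_t cp = utf16[i];
        if (cp >= 0xD800 && cp <= 0xDBFF)
        {
            // A high surrogate must be followed by a low surrogate.
            if (i + 1 >= utf16.size() || utf16[i + 1] < 0xDC00 || utf16[i + 1] > 0xDFFF)
                throw std::invalid_argument("unpaired high surrogate");
            cp = 0x10000 + ((cp - 0xD800) << 10) + (utf16[i + 1] - 0xDC00);
            ++i;
        }
        else if (cp >= 0xDC00 && cp <= 0xDFFF)
        {
            throw std::invalid_argument("unpaired low surrogate");
        }

        if (cp < 0x80)
        {
            out.push_back(static_cast<char>(cp));
        }
        else if (cp < 0x800)
        {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
        else if (cp < 0x10000)
        {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
        else
        {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

With helpers like these, a UTF-8 std::string could be converted with toUtf16() before being bound to a wide-character column and converted back with toUtf8() after fetching. Whether the ODBC backend would do that automatically or only on request is exactly the open question in the discussion above.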
There seem to have been issues in the past:
#164
#179
#1111