"freerdp_uniconv_out" error dealing with multi-byte characters under Windows #11

cuzz · 2011-04-24T06:37:27Z

freerdp Version: v0.8.2-515-gc1f0fe0
OS: Windows XP SP3 (Chinese)
Symptoms: When connecting to an RDP server from Windows (Chinese version), the wfreerdp process crashes.
Analysis: When the parameter str contains multi-byte characters (for example Chinese, Korean or Japanese), the function freerdp_uniconv_out (libfreerdputils, unicode.c) converts it incorrectly.

The easiest way to solve the problem is to use the Windows API in the Windows environment:

#if ( defined(WINDOWS) || defined(_WIN32)   ||defined(_WINDOWS) )
    #include  <windows.h>
#endif

/* Convert str from DEFAULT_CODEPAGE to WINDOWS_CODEPAGE and return buffer like xstrdup.
 * Buffer is 0-terminated but that is not included in the returned length. */

char* freerdp_uniconv_out(UNICONV *uniconv, char *str, size_t *pout_len)
{
    size_t ibl = strlen(str), obl ; /* FIXME: worst case */
    char *pin = str,  *pout0 ;

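#if ( defined(WINDOWS) || defined(_WIN32)   ||defined(_WINDOWS) )

    /* First call only queries the required size in wchar_t units, terminator included. */
    obl = MultiByteToWideChar (CP_ACP, 0, pin, -1, NULL, 0);
    if (obl == 0)
        return NULL;    /* conversion failed, see GetLastError() */
    pout0 = xmalloc(obl*2 );
    MultiByteToWideChar (CP_ACP, 0, pin, -1, (wchar_t*)pout0, (int) obl);
    pout0[obl*2-1]=0;
    pout0[obl*2-2]=0;
    *pout_len=obl*2-2;  /* length in bytes, terminator not included */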
#else

    char *pout;

    obl = 2 * ibl;
    pout0 = xmalloc(obl + 2);
    pout = pout0;

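#ifdef HAVE_ICONV
    if (iconv(uniconv->out_iconv_h, (ICONV_CONST char **) &pin, &ibl, &pout, &obl) == (size_t) - 1)
    {
        printf("freerdp_uniconv_out: iconv() error\n");
        xfree(pout0);   /* do not leak the buffer; assumes xfree() is the counterpart of xmalloc() */
        return NULL;
    }
#else
    /* Fallback without iconv: only 7-bit ASCII can be widened to UTF-16LE. */
    while ((ibl > 0) && (obl > 0))
    {
        if ((signed char)(*pin) < 0)
        {
            xfree(pout0);   /* byte >= 0x80: part of a multi-byte character */
            return NULL;
        }
        *pout++ = *pin++;
        *pout++ = 0;
        ibl--;
        obl -= 2;
    }
#endif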

    if (ibl > 0)
    {
        printf("freerdp_uniconv_out: string not fully converted - %d chars left\n", (int) ibl);
    }

    *pout_len = pout - pout0;
    *pout++ = 0;    /* Add extra double zero termination */
    *pout = 0;

#endif

    return pout0;
}
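
For context on why the crash might happen: if the Windows build is compiled without HAVE_ICONV (this is an assumption, not something confirmed here), only the ASCII fallback loop runs, and it returns NULL for any byte with the high bit set. A caller that does not check for NULL would then crash. The sketch below is a stand-alone copy of that fallback under a hypothetical name (ascii_to_utf16le), using plain malloc/free instead of the FreeRDP helpers:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-alone version of the non-iconv fallback loop.
 * Widens 7-bit ASCII to UTF-16LE. Returns NULL as soon as it sees a
 * byte >= 0x80, which is what every multi-byte character contains. */
char *ascii_to_utf16le(const char *str, size_t *pout_len)
{
    size_t ibl, i;
    char *out;

    assert(str != NULL && pout_len != NULL);

    ibl = strlen(str);
    out = malloc(2 * ibl + 2);      /* 2 bytes per char plus double zero */
    if (out == NULL)
        return NULL;

    for (i = 0; i < ibl; i++)
    {
        if ((signed char) str[i] < 0)
        {
            free(out);              /* not ASCII: cannot be converted here */
            return NULL;
        }
        out[2 * i] = str[i];        /* low byte */
        out[2 * i + 1] = 0;         /* high byte */
    }

    out[2 * ibl] = 0;               /* double zero termination */
    out[2 * ibl + 1] = 0;
    *pout_len = 2 * ibl;            /* terminator not included */
    return out;
}
```

With this sketch, "ab" converts to 4 bytes, while a GBK-encoded Chinese string such as "\xD6\xD0" yields NULL.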

otavio · 2011-05-13T22:28:28Z

Have you been able to reproduce this issue using current master branch?

cuzz · 2011-05-14T02:27:08Z

Yes, I have tested the current master branch, and the issue still exists.
Moreover, the same issue exists in the function "freerdp_uniconv_in".

Chinese characters do not necessarily occupy exactly 2 bytes each.
In fact, many characters occupy four bytes.
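
To illustrate the variable width, here is a minimal sketch that determines how many bytes one character occupies in GB 18030, based on the byte ranges the encoding uses (single byte 0x00-0x7F; two bytes with a second byte in 0x40-0xFE except 0x7F; four bytes with a second byte in 0x30-0x39). The function name gb18030_char_len is hypothetical and not part of FreeRDP:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: byte length of the GB 18030 character that starts
 * at s, with n bytes available. Returns 1, 2 or 4, or 0 if the sequence
 * is invalid or truncated. */
size_t gb18030_char_len(const unsigned char *s, size_t n)
{
    assert(s != NULL);

    if (n == 0)
        return 0;
    if (s[0] <= 0x7F)
        return 1;                               /* ASCII */
    if (s[0] < 0x81 || s[0] > 0xFE || n < 2)
        return 0;                               /* bad lead byte or truncated */
    if (s[1] >= 0x40 && s[1] <= 0xFE && s[1] != 0x7F)
        return 2;                               /* two-byte form */
    if (s[1] >= 0x30 && s[1] <= 0x39)
    {
        /* four-byte form: lead, digit, lead-range byte, digit */
        if (n >= 4 && s[2] >= 0x81 && s[2] <= 0xFE &&
            s[3] >= 0x30 && s[3] <= 0x39)
            return 4;
    }
    return 0;
}
```

A converter that assumes a fixed 2 bytes per character would split the four-byte form in the middle.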

GB 18030-2000 is the new “compulsory” Chinese national standard.

http://en.wikipedia.org/wiki/GB_18030

"The mandatory part of GB 18030-2005 consists of 1 byte and 2 byte encoding, together with 4 byte encoding for CJK Unified Ideographs Extension A. The corresponding Unicode code points of this subset lie entirely in the BMP.

In a move of historic significance for software supporting Unicode, the PRC decided to mandate support of certain code points outside the BMP. This means that software can no longer get away with treating characters as 16 bit fixed width entities (UCS-2). Therefore they must either process the data in a variable width format (such as UTF-8 or UTF-16), which are the most common choices, or move to a larger fixed width format (such as UCS-4 or UTF-32). Microsoft made the change from UCS-2 to UTF-16 with Windows 2000."
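
The practical consequence for the conversion code is that one character may need two 16-bit units on the wire. As a rough sketch of the standard UTF-16 encoding rule (the helper name utf16_encode is hypothetical and not taken from FreeRDP):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one Unicode code point as UTF-16.
 * Writes 1 or 2 code units to out and returns how many were written,
 * or 0 if cp is a surrogate or lies beyond U+10FFFF. */
size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    assert(out != NULL);

    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;                       /* surrogates are not characters */
    if (cp > 0x10FFFF)
        return 0;                       /* outside the Unicode range */

    if (cp < 0x10000)
    {
        out[0] = (uint16_t) cp;         /* BMP: one unit, same as UCS-2 */
        return 1;
    }

    cp -= 0x10000;                      /* 20 bits remain */
    out[0] = (uint16_t) (0xD800 + (cp >> 10));      /* high surrogate */
    out[1] = (uint16_t) (0xDC00 + (cp & 0x3FF));    /* low surrogate */
    return 2;
}
```

For example, U+4E2D fits in a single unit, while U+20000 (in CJK Unified Ideographs Extension B) becomes the pair D840 DC00, so code that counts one character as exactly 2 bytes miscalculates the length.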
