1.0.5

@Joungkyun Joungkyun released this 11 May 15:37
· 36 commits to master since this release

Changes:

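  • #8 Fixed failure to detect UTF-16/32.

    • This was a binary-safety problem: UTF-16/32 data contains NUL bytes, so an API that takes no length stops at the first one.
    • To solve it, the _detect_r_ and _detect_handledata_r_ APIs were added; both take the buffer length as an argument.
    • Added the _CHARDET_BINARY_SAFE_ constant, which indicates whether _detect_r_ and _detect_handledata_r_ are supported.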
    /* One-shot API: detect_r takes the buffer length explicitly.
     * strlen() is valid here only because str[i] is NUL-terminated text;
     * pass the real byte count for UTF-16/32 data. */
    #ifdef CHARDET_BINARY_SAFE
            if ( detect_r (str[i], strlen (str[i]), &obj) == CHARDET_OUT_OF_MEMORY )
    #else
            if ( detect (str[i], &obj) == CHARDET_OUT_OF_MEMORY )
    #endif
            {
                fprintf (stderr, "On handle processing, occurred out of memory\n");
                return CHARDET_OUT_OF_MEMORY;
            }

    /* Handle-based API: the same pattern with detect_handledata_r. */
    #ifdef CHARDET_BINARY_SAFE
            if ( detect_handledata_r (&d, str[i], strlen (str[i]), &obj) == CHARDET_OUT_OF_MEMORY )
    #else
            if ( detect_handledata (&d, str[i], &obj) == CHARDET_OUT_OF_MEMORY )
    #endif
            {
                fprintf (stderr, "On handle processing, occurred out of memory\n");
                return CHARDET_OUT_OF_MEMORY;
            }
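The binary-safety problem can be illustrated without the library itself. The sketch below is only an illustration: `utf16le_sample` and `bytes_before_first_nul` are hypothetical names that are not part of libchardet. It shows how much of a UTF-16LE buffer remains visible when the length is inferred from NUL termination rather than passed explicitly.

```c
#include <stddef.h>
#include <string.h>

/* "Hi" encoded as UTF-16LE with a byte-order mark: FF FE 48 00 69 00.
 * Hypothetical sample data, not taken from the libchardet sources. */
static const char utf16le_sample[] = {
    (char) 0xFF, (char) 0xFE, 'H', 0x00, 'i', 0x00
};

/* Number of bytes in front of the first NUL within the first len bytes.
 * This is what a length-less, NUL-terminated API would get to see.
 * Returns len when the buffer contains no NUL at all. */
size_t bytes_before_first_nul (const char *buf, size_t len)
{
    const char *nul = memchr (buf, 0, len);

    return nul ? (size_t) (nul - buf) : len;
}
```

For this sample the buffer holds 6 bytes, but only 3 of them come before the first NUL, so a detector that relies on NUL termination never sees the second character. With the binary-safe API, the caller would pass the full byte count (here `sizeof utf16le_sample`) as the length argument instead of `strlen()`.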
  • Merged uchardet's improvements

    • #6 Fixed the extended character range for EUC-KR and EUC-TW
      • can detect CP949 (for example, "똠방각하", "뷁")
      • can detect extended EUC-TW ("灣,是,台" and so on)
    • #2, #5 Improved the single-byte charset detection confidence algorithm
    • #4 New single-byte language models
      • Arabic
      • Danish
      • Esperanto
      • German
      • Spanish
      • Turkish
      • Vietnamese
  • #3 Updated the language models for Greek, Hungarian and Thai

  • Fixed a wrong macro in the man pages (martin.gansser@gmail.com)