Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Detect corrupted files, exit on exception (DEV-4457) #459

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

siers
Copy link
Contributor

@siers siers commented Dec 17, 2024

This corruption warning is non-fatal, so we have to check if the error code is set to a specific JWRN after jpeg_read_scanlines. Test added that an exception is created in img.read.


References:

https://github.com/mozilla/mozjpeg/blob/6c9f0897afa1c2738d7222a0a9ab49e8b536a267/jerror.h#L195

JMESSAGE(JWRN_HUFF_BAD_CODE, "Corrupt JPEG data: bad Huffman code")

https://github.com/mozilla/mozjpeg/blob/6c9f0897afa1c2738d7222a0a9ab49e8b536a267/jerror.h#L280-L283

/* Nonfatal errors (we can keep going, but the data is probably corrupt) */
#define WARNMS(cinfo,code)  \
  ((cinfo)->err->msg_code = (code), \
   (*(cinfo)->err->emit_message) ((j_common_ptr) (cinfo), -1))

https://github.com/mozilla/mozjpeg/blob/6c9f0897afa1c2738d7222a0a9ab49e8b536a267/jdphuff.c#L551-L552

        if (s) {
          if (s != 1)           /* size of new coef should always be 1 */
            WARNMS(cinfo, JWRN_HUFF_BAD_CODE);
          CHECK_BIT_BUFFER(br_state, 1, goto undoit);

https://github.com/mozilla/mozjpeg/blob/6c9f0897afa1c2738d7222a0a9ab49e8b536a267/jdhuff.c#L463-L466

  if (l > 16) {
    WARNMS(state->cinfo, JWRN_HUFF_BAD_CODE);
    return 0;                   /* fake a zero as the safest result */
  }

Copy link

linear bot commented Dec 17, 2024

@siers siers force-pushed the feat/DEV-4457-detect-corrupt branch 2 times, most recently from 86469df to e02685a Compare December 18, 2024 14:11
@siers siers force-pushed the feat/DEV-4457-detect-corrupt branch from e02685a to 6ad6dc5 Compare December 18, 2024 14:27
@@ -862,6 +862,7 @@ int main(int argc, char *argv[])
}
} catch (Sipi::SipiImageError &err) {
std::cerr << err << std::endl;
exit(1);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, does that mean that in the past, all SipiImageError resulted in return code 0, but now, they will all result in return code 1?

That would not be the expected behaviour. Because often, an image is only slightly corrupted, but perfectly readable by most viewers. For example, the ca. 66'000 images of LHTT produced ca. 14'000 error logs. But most of them are negligible, because there is no real defect in the image. We'd like that SIPI only aborts with non-0 in these very specific cases:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very odd, but it was also my intention to check whether this new behavior breaks the build. Unfortunately the main build is broken. 🤦

Anyhow, I'll restrict it to the behavior you've described.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants