Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

patterns/tiff: support BigTIFF and tiled TIFF; bugfixes and cleanups #159

Merged
merged 15 commits into from
Sep 24, 2023

Conversation

bgilbert
Copy link
Contributor

@bgilbert bgilbert commented Sep 3, 2023

Major changes:

  • Support BigTIFF (64-bit TIFF)
  • Support tiled TIFFs
  • Clean up strip array generation; don't omit any partial last strip
  • Correctly distinguish between values and offsets based on total size, rather than guessing from the field type
  • Correctly represent field widths for small values in little-endian TIFFs
  • Some display improvements
  • Various code cleanups
  • Add additional test cases

See individual commits for details.

cc: the original authors, @jacksonbarreto and @joelalves

@bgilbert
Copy link
Contributor Author

bgilbert commented Sep 9, 2023

I've split out multiple-test support to #161, and marked this PR as draft until that lands.

Don't require unfolding the array entry to see what tag it contains.
Rationals, uniquely, are primitives with two fields.  Add a struct to
represent this, rather than inlining them.
Small fields are always left-aligned in the 4-byte Value Offset.  On
little-endian TIFFs we currently cheat this by declaring a 32-bit value
and letting little-endianness handle the semantics.  However, this adds
some extra conditionals, and misrepresents the resulting field as 32 bits.
Drop the cheat.
We were using the field type to make assumptions about whether the Value
Offset is a Value or an Offset, which is incorrect.  If the Count
multiplied by the field size is larger than 4, the field is an Offset;
otherwise it's a Value.

Add display sugar for single-element arrays to avoid extra nesting.
get_ifds_offsets() and BIG/LITTLE aren't used at all.  get_total_IFDs()
is only used for declaring the length of TIFFFile.IFDs, and isn't needed
because IFDs are structured as a linked list.
The call in TIFFFile is redundant.  Drop both calls and open-code the
check at the top level, before executing any code.  The BigTIFF check
will eventually be added alongside this one.

Fail if we don't recognize the magic number.
They're redundant with the fields in the DirectoryEntry array.  Also
they're buggy: they assume the field Value Offsets are always offsets,
which isn't true for single-strip IFDs, and they ignore a partial last
strip in multiple-strip IFDs.
We're making extra work for ourselves by avoiding the type system.  Also,
by calculating the number of strips we expect rather than the number of
strips we actually have, we're miscounting and omitting any partial last
strip.

Instead, read the strip offsets and byte counts directly from the
IFDEntry array.
Use 64-bit temporary variables for values that can be 64 bits in BigTIFF.
@bgilbert
Copy link
Contributor Author

Rebased and marked ready for review!

@WerWolv
Copy link
Owner

WerWolv commented Sep 24, 2023

Thanks a lot!

@WerWolv WerWolv merged commit 7ecd6d8 into WerWolv:master Sep 24, 2023
2 checks passed
@bgilbert bgilbert deleted the tiff branch September 24, 2023 20:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants