Skip to content

Seeking multiple times #279

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jul 11, 2025
Merged

Seeking multiple times #279

merged 3 commits into from
Jul 11, 2025

Conversation

197g
Copy link
Member

@197g 197g commented Jul 10, 2025

This fixes an issue with cycle tracking that got introduced when we allowed seeking to previous images. Since the next offset of an IFD had to be unique, seeking back would not allow iterating forwards again and bring the decoder into a rather unfortunate state that was not consistent with actually rewinding as intended.

197g added 3 commits July 10, 2025 23:48
The previous strategy could not keep track of all cycles such as
reflexive ones at the very start. Additionally, it would fail to be able
to iterate after seeking since it was overly cautious and verified the
next IFD position before allowing the read of a current IFD.

The new implementation is furthermore proof against having multiple
independent chains of IFD which may occur when iterating SubIFDs for
thumbnails. (Note that we do not verify the tree-structure, only a
freedom from cycles).
@197g 197g force-pushed the seeking-multiple-times branch from ac575fe to 97e2305 Compare July 10, 2025 21:57
@197g
Copy link
Member Author

197g commented Jul 10, 2025

There's another motivation here: This new cycle tracking implementation will allow the decoder to switch the image chain to another one, that is not in the original IFD chain. This is required for tags such as SubIFD (thumbnails etc.) that refer one image to a separate chain of child images.

@197g 197g merged commit 6275b7a into main Jul 11, 2025
15 checks passed
@197g 197g deleted the seeking-multiple-times branch July 11, 2025 11:21
@fintelia
Copy link
Contributor

I think letting the Decoder struct switch to a different chain would add a bunch of complexity that might be better handled by having a dedicated API for reading IFD's off the main chain. The core problem is that on the main chain the TIFF spec provides guarantees of things like each image having either strips or tiles but not both. These let us check invariants immediately when we seek to an IFD and then assume they hold in the various decoding methods. In sub-IFD's, the tags required for main chain IFDs may be missing. In certain cases, the tags in a sub-IFD could have could have entirely different meanings.

@197g
Copy link
Member Author

197g commented Jul 11, 2025

I'll sketch this out in the PR for it but my impression is the decoder could also have a non-image iteration mode without that becoming overly complex.

For instance, when we first switch to another chain we may set a flag in the decoder to no longer read images ahead-of-time and then only deactivate that flag when explicitly asked. Or, offer a next_directory as an alternative to next_image with the separate behavior and expect the user to advance with the correct interface. For some chains such as EXIF we do not intend to read any image. The open question is how many different types of chains to support and how many to let the user do manually through read_directory_tags and Directory::next (#278).

Also, he current behavior has downsides. We should per specification be able to handle color channels beyond the rgba and simply skip bytes that do not belong to them according to the BitsPerSampe array. However, which bytes to ignore and which bytes to read is something only the user can choose. The current implementation makes it hard to perform that correctly since user's choice may depend on the directory—which they can only inspect after the seek, which already triggers image validation. So there is a bunch of work that may be ultimately pointless and really is a single call away.

next_image being the default with validation is good, but a next_directory alternative would not add much complexity cost while keeping the implementation more usable through Decoder.

@fintelia
Copy link
Contributor

Ok, I think that makes sense.

I could also see adding methods like Decoder::as_image() -> Option<Image> and Decoder::as_directory() -> &mut Directory so that we could enable more sharing and have fewer methods directly on Decoder.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants