Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect upscaling for some files with size=max #680

Closed
mmatela opened this issue Aug 1, 2024 · 20 comments
Closed

Incorrect upscaling for some files with size=max #680

mmatela opened this issue Aug 1, 2024 · 20 comments

Comments

@mmatela
Copy link

mmatela commented Aug 1, 2024

Using cantaloupe 5.0.6, default settings except for FilesystemSource.BasicLookupStrategy.path_prefix. Java 21.0.1 on Windows, but experienced the same on other versions of Java on Linux.

Attaching sample file 2.jpg 2

When trying to get a fragment of size 100x100 with this url: http://localhost:8182/iiif/3/2.jpg/100,100,100,100/max/0/default.jpg i get the fragment upscaled to size 2604x2604 instead.

It works correctly when using size pct:100 instead of max (should there ever be a difference for these two?): http://localhost:8182/iiif/3/2.jpg/100,100,100,100/pct:100/0/default.jpg
It works correctly with image api v2: http://localhost:8182/iiif/2/2.jpg/100,100,100,100/full/0/default.jpg
Somehow it accepts size=max for v2, and also returns upscaled to 2604x2604: http://localhost:8182/iiif/2/2.jpg/100,100,100,100/max/0/default.jpg

I tried some other files in both jpg and other formats and for most of them it correctly returns 100x100, so it must be something inside the content file, but I'm not aware of any properties that should lead to this result.

@DiegoPino
Copy link
Contributor

Hi @mmatela. Could you share the info.json output for that image?
If this was a general issue for all the images I would say there is a problem/order of operation with the Specs implementation for /max but since you reported it happens just in some images it might be something specific maybe to A) wrongly processed by cantaloupe resolution (your image has 400dpi) ? wrongly parsed Exif? or even stale caches?. Still the specs are clear.
Region THEN Size THEN Rotation THEN Quality THEN Format
For the stale caches part (this is just an idea, not stating really a solution but worth trying) you can call the tile URL with

Overriding Cache Behavior
The behavior of the [caches](https://cantaloupe-project.github.io/manual/5.0/caching.html) can be overridden on a per-request basis by supplying a cache URL query argument with one of the following values:

nocache or false
Bypasses the source, derivative, and info caches, and also omits any Cache-Control response header that may be configured.
recache
Bypasses the derivative and info caches for reads but not writes.

I will check the code that sets up the operations and see what I can find out

@mmatela
Copy link
Author

mmatela commented Aug 1, 2024

Thanks for looking into this @DiegoPino

Attaching info from http://localhost:8182/iiif/3/2.jpg/info.json :
info.json

IrfanView shows the file indeed has resolution of 400x400 dpi.

I think we can rule out cache, this is the first time I've started Cantaloupe on this machine and this 2.jpg file has never changed.

@DiegoPino
Copy link
Contributor

DiegoPino commented Aug 1, 2024

hi @mmatela. I tested locally on my self built Cantaloupe (Docker) and I can't reproduce it.
I tested java2D, turbojpeg and Jai processors. Both version 2 and 3
Calling (in my case)
https://localhost:8183/iiif/3/d66%2Fimage-354200823-eca73bfc-7194-4a50-9830-beee020d9e1a-ebce0b81-b3ab-4200-b124-0a95d5202959.jpg/100,100,100,100/max/0/default.jpg

Always generated

image

Not sure if the act of uploading the image to GitHub/me downloading it might have re-encoded the JPEG?

I used this docker container: esmero/cantaloupe-s3:6.0.1-multiarch. It is a year old based on development, so probably closer to
v5.0.5 with JDK 15

If you could give 5.0.5 a try while I figure out what has changed/is different?

I can't try a vanilla 5.0.6 today but will put in my todo list.

@mmatela
Copy link
Author

mmatela commented Aug 1, 2024

I see the same problematic behavior with 5.0.4 and 5.0.5. With the file downloaded from github through this link: https://github.com/user-attachments/assets/eca73bfc-7194-4a50-9830-beee020d9e1a (redirected the browser to amazonaws, then I pressed ctrl+s)

@DiegoPino
Copy link
Contributor

Wonder if java 21 has something to do with this. Do you see any errors/warnings in your cantaloupe log?

@mmatela
Copy link
Author

mmatela commented Aug 1, 2024

I first noticed this problem on a Linux machine with Java 11.

console output during request handling (changed the file name to make sure it's loaded from scratch):

e.i.l.c.r.i.v.ImageResource - Handling GET /iiif/3/354200823-eca73bfc-7194-4a50-9830-beee020.jpg/100,100,100,100/max/0/default.jpg
e.i.l.c.r.i.v.ImageResource - Request headers: Cookie: [...................]
e.i.l.c.i.MetaIdentifier - [Raw path component: 354200823-eca73bfc-7194-4a50-9830-beee020.jpg] -> [decoded: 354200823-eca73bfc-7194-4a50-9830-beee020.jpg] -> [slashes substituted: 354200823-eca73bfc-7194-4a50-9830-beee020.jpg]
e.i.l.c.i.Identifier - [Raw path component: 354200823-eca73bfc-7194-4a50-9830-beee020.jpg] -> [decoded: 354200823-eca73bfc-7194-4a50-9830-beee020.jpg] -> [slashes substituted: 354200823-eca73bfc-7194-4a50-9830-beee020.jpg]
e.i.l.c.c.FilesystemCache - getSourceImageFile(): \var\cache\cantaloupe\source\d5\f7\39\d5f73936f23408fbea6c2d37603e5305
e.i.l.c.s.FilesystemSource - Resolved 354200823-eca73bfc-7194-4a50-9830-beee020.jpg to T:\354200823-eca73bfc-7194-4a50-9830-beee020.jpg
e.i.l.c.p.ProcessorFactory - Failed to initialize TurboJpegProcessor (error: Expecting an absolute path of the library: /opt/libjpeg-turbo/lib/libturbojpeg.so)
e.i.l.c.p.ProcessorFactory - Java2dProcessor selected for format JPEG (AutomaticSelectionStrategy offered TurboJpegProcessor, Java2dProcessor)
e.i.l.c.p.ProcessorConnector - File -> FileProcessor connection between FilesystemSource and Java2dProcessor
e.i.l.c.p.c.j.JPEGImageReader - Using com.sun.imageio.plugins.jpeg.JPEGImageReader
e.i.l.c.c.InfoService - readInfo(): read 354200823-eca73bfc-7194-4a50-9830-beee020.jpg from Java2dProcessor in 12 msec
e.i.l.c.c.InfoService - putInObjectCache(): adding info: 354200823-eca73bfc-7194-4a50-9830-beee020.jpg (new size: 3)
e.i.l.c.c.InfoCache - putInObjectCache(): adding info: 354200823-eca73bfc-7194-4a50-9830-beee020.jpg (new size: 3)
e.i.l.c.r.i.v.ImageResource - Base URI assembled from X-Forwarded headers: http://localhost:8182
e.i.l.c.r.ImageRepresentation - Derivative cache not available; writing directly to the response
e.i.l.c.p.c.j.JPEGImageWriter - ImageIO plugin preferences: com.sun.imageio.plugins.jpeg.JPEGImageWriter
e.i.l.c.p.c.j.JPEGImageWriter - Using com.sun.imageio.plugins.jpeg.JPEGImageWriter
e.i.l.c.p.c.j.JPEGImageReader - Acquiring region 100,100/100x100 from 2781x4100 image
e.i.l.c.p.c.j.JPEGImageWriter - Quality: 80; progressive: true
e.i.l.c.r.ImageRepresentation - Java2dProcessor processed in 242 msec: 354200823-eca73bfc-7194-4a50-9830-beee020.jpg_1:1_cropbypixels:100,100,100,100_scalebypixels:!2604,3839,bicubic_encode:jpg_JPEG_80_interlace_#FFFFFF_8_e95198c48292fa4c8fc0617e850c1f6d
e.i.l.c.r.HandlerServlet - Responded to GET /iiif/3/354200823-eca73bfc-7194-4a50-9830-beee020.jpg/100,100,100,100/max/0/default.jpg with HTTP 200 in 261 msec

There's the suspicious number 2604 in the next-to-last line!

@DiegoPino
Copy link
Contributor

Yes, that is strange. According to the code, max will call scaleByPercent
see

Could you please try enabling your derivative cache? and also force Java2D (manual source selection) to be JPEG processor?
I can't find so far in the code a reason why _scalebypixels:!2604,3839 is called there. Only way that could be is if the cropbypixels failed and the resulting value ends being the original one ....

Need to check more

@DiegoPino
Copy link
Contributor

@mmatela any updates on this? Checking so we keep debugging this or mark as resolved. Thanks

@mmatela
Copy link
Author

mmatela commented Aug 14, 2024

@DiegoPino Oh, I thought you'd be able to run version 5 locally and reproduce the problem, should be more efficient than having this back and forth. Like you said:

I can't try a vanilla 5.0.6 today but will put in my todo list.

@DiegoPino
Copy link
Contributor

@mmatela I have not been successful replicating the issue, maybe because but I don't have a Windows/Java 21 combo to test?
Maybe you could share with me your cantaloupe properties file? (config). If you do so I can at least have the same configs as you and try. I can't also find any reason in the code why max would trigger this. I prefer to test this actually (which I have now here) with a develop build. Thanks

@mmatela
Copy link
Author

mmatela commented Aug 14, 2024

@mmatela I have not been successful replicating the issue, maybe because but I don't have a Windows/Java 21 combo to test?

Like I said, I've seen the same behavior on Linux with Java 11 and Windows with Java 21, so it doesn't seem to be OS/Java dependent. So far you've reported trying a docker with version 6.0.1. Did you try to simply download the latest release and run it with java -Dcantaloupe.config=cantaloupe.properties -Xmx2g -jar cantaloupe-5.0.6.jar?

Maybe you could share with me your cantaloupe properties file? (config). If you do so I can at least have the same configs as you and try.

I just took the sample properties file from the release package and only changed FilesystemSource.BasicLookupStrategy.path_prefix, like I said in the first post.

I can't also find any reason in the code why max would trigger this. I prefer to test this actually (which I have now here) with a develop build. Thanks

OK, but I don't see how trying to run it on my side with different configs will help with that.

@DiegoPino
Copy link
Contributor

@mmatela the issue is we can not fix backwards an existing release. 5.x is done. We can only fix our ongoing/live code base (6.x from the develop branch) and only if the error can be reproduced and we know where/why it is happening. I will give reproducing it another try just for clarity on 5.0.6 but if I can't, then I must assume there is something in your environment/or my own that is different. The fact that this is dependent on a specific file and not a general issue makes all even more complex.

@DiegoPino
Copy link
Contributor

@mmatela update: spent the last 3 hours debugging this. @glenrobson (ping just so you know this is happening)

I ran a vanilla 5.0.6 JAR file on JAVA 22 with your file changing just the filesystem path and I can replicate the issue.
Then transformed your file into 72dpi, TIFF, forced the processor to manual to use JAVA2D and the same. Still scaling by pixel when it should not, and according the code, should be scaling by Percent.

Then retested on a fully functional docker container (will all the dependencies using a fresh develop build) and no issues at all. Returns and processes correctly.

I have so far really no idea, I compared the source classes for size and there are no changes between releases.

The code

Scale toScale(double maxScale) {
if (getPercent() != null) {
return new ScaleByPercent(getPercent() / 100.0);
}
switch (getType()) {
case MAX:
if (maxScale > DELTA) {
return new ScaleByPercent(isUpscalingAllowed() ? maxScale : 1);
} else {
Configuration config = Configuration.getInstance();
final long maxPixels = config.getLong(Key.MAX_PIXELS, 0);
if (maxPixels > 0) {
// Using the square root of max_pixels is not optimal,
// but we don't yet know the source image dimensions in
// order to compute a more accurate size.
final int dims = (int) Math.floor(Math.sqrt(maxPixels));
return new ScaleByPixels(dims, dims, ScaleByPixels.Mode.ASPECT_FIT_INSIDE);
} else {
return new ScaleByPercent(1);
}
}
case ASPECT_FIT_WIDTH:
return new ScaleByPixels(
getWidth(), null, ScaleByPixels.Mode.ASPECT_FIT_WIDTH);
case ASPECT_FIT_HEIGHT:
return new ScaleByPixels(
null, getHeight(), ScaleByPixels.Mode.ASPECT_FIT_HEIGHT);
case ASPECT_FIT_INSIDE:
return new ScaleByPixels(
getWidth(), getHeight(), ScaleByPixels.Mode.ASPECT_FIT_INSIDE);
case NON_ASPECT_FILL:
return new ScaleByPixels(
getWidth(), getHeight(), ScaleByPixels.Mode.NON_ASPECT_FILL);
default:
throw new IllegalArgumentException(
"Unknown scale mode. This is probably a bug.");
}
}

Should return scale by percent because (maxScale being the max_scale property == 1.0) and DELTA == 0.000001

So other maxScale in not being processed or there is a floating point comparison issue? Clueless.

                 return new ScaleByPercent(isUpscalingAllowed() ? maxScale : 1); 
             } else { 
      ... fail and upscale by pixel...

What I will do (tomorrow, 3 hours is my limit) is to do the same test with a develop jar using filesystem and no changes to properties files. This is not a processor issue, it is an argument parsing/constant assignment/comparison issue and might? *have no proof why, but might be even an issue with the actual built. Then if all works I will assume that JAR has issues, if not and develop on such a simplistic setup fails I will add debug statements all over the place to see where parsing MAX, setting max pixels and deciding that scale by percent is no good is happening.

@DiegoPino DiegoPino self-assigned this Aug 14, 2024
@DiegoPino
Copy link
Contributor

@mmatela @glenrobson ok, I got this. This is actually a bug (or an interpretation/math issue). And it still exists in develop. The reason I could not replicate is because none of my cantaloupes (so many) have the default configurations.

Let me explain what I found out:

  • The trigger is max_pixels = 10000000 which is the default. if for the example/failing request/image shared here, you modify this to max_pixels = 40000000, max delivers a percent scaled (1.0) image correctly of 100x100.

  • The actual bug (I believe this is a bug or a miss interpretation of max at least) seems to be related to how the operations/checks and conditionals are chained in specific for max. when the cropped results end in a limbo? of checks based on the actual pixels that a 100x100 square here deliver. I have not yet pinpointed the exact place this happens, but I will try!

@mmatela can you please delete your derivative caches *if you enabled them *, clear your browser cache and modify max_pixels to that other number I shared? And see if 100x100 delivers correctly? Thanks

Debugging where the error(s) is/are and making a pull will take me some time. Some help might be great!

@DiegoPino DiegoPino added the bug label Aug 15, 2024
@DiegoPino DiegoPino added this to the 6.0 milestone Aug 15, 2024
@DiegoPino
Copy link
Contributor

DiegoPino commented Aug 15, 2024

Further debugging. Seems that the issue is happening because of: (for my own reference)

protected void constrainSizeToMaxPixels(Dimension requestedSize,
OperationList opList) {
final var config = Configuration.getInstance();
final int maxPixels = config.getInt(Key.MAX_PIXELS, 0);
if (maxPixels > 0 && requestedSize.intArea() > maxPixels) {
Scale scaleOp = (Scale) opList.getFirst(Scale.class);
// This should be null because the client requested max size...
if (scaleOp != null) {
opList.remove(scaleOp);
}
Dimension scaledSize =
Dimension.ofScaledArea(requestedSize, maxPixels);
// The scale dimensions must be floored because rounding up could
// cause max_pixels to be exceeded.
scaleOp = new ScaleByPixels(
(int) Math.floor(scaledSize.width()),
(int) Math.floor(scaledSize.height()),
ScaleByPixels.Mode.ASPECT_FIT_INSIDE);
if (opList.getFirst(Crop.class) != null) {
opList.addAfter(scaleOp, Crop.class);
} else {
opList.add(0, scaleOp);
}
}
}

@DiegoPino
Copy link
Contributor

And might be also the culprit for
#635

@DiegoPino
Copy link
Contributor

DiegoPino commented Aug 15, 2024

So. I know where this is happening (woho!). There is a chain of calls that happen just after the operation list is built

and invoking then

public void infoAvailable(Info info) {
if (Size.ScaleMode.MAX.equals(params.getSize().getScaleMode())) {
constrainSizeToMaxPixels(info.getSize(), ops);
}
try {
enqueueHeaders(params, info.getSize(pageIndex), disposition);
} catch (IndexOutOfBoundsException e) {
throw new IllegalClientArgumentException(e);
}
}

But the size passed there is unaware of cropping (info.size is the full size of the source). So max is interpreted as a request being done against the full image and any already chained size operation is removed and a new one is appended breaking the contract.

There are two options here:

  • pop/fetch any cropping operation and use the dimensions (if any) of the cropped region inside infoAvailable and call constrainSizeToMaxPixels with those (of the actual image if no cropping) or make the actual check inside constrainSizeToMaxPixels method which already has access to the ops list and is indeed modifying the list to "push" their evil intentions (!width,height) in place.

@jcoyne @ksclarke @glenrobson am I understanding the IIIF Image API specs correctly? max, when in the presence of any cropping (that is what I read so asking again) will deliver the max size allowed for the cropped region, based on the server settings? Right? wrong?

@glenrobson
Copy link
Contributor

Congrats @DiegoPino on finding the issue!

max, when in the presence of any cropping (that is what I read so asking again) will deliver the max size allowed for the cropped region, based on the server settings? Right? wrong?

I think you are correct. max should deliver the full sized image of the crop unless it is large than the server configuered max size (either maxWidth, maxHeight or maxArea). See https://iiif.io/api/image/3.0/#42-size:

"max : The extracted region is returned at the maximum size available, but will not be upscaled. The resulting image will have the pixel dimensions of the extracted region, unless it is constrained to a smaller size by maxWidth, maxHeight, or maxArea as defined in the Technical Properties section."

@DiegoPino
Copy link
Contributor

@mmatela pull #709 is open, tests are passing and MAX is now aware of a previous crop operation, if any.

@DiegoPino
Copy link
Contributor

Resolved via b9d6bae

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants