Skip to content

ETT-686: refactor download; add matomo tags and error handling#184

Open
carylwyatt wants to merge 10 commits intomainfrom
download-with-fetch
Open

ETT-686: refactor download; add matomo tags and error handling#184
carylwyatt wants to merge 10 commits intomainfrom
download-with-fetch

Conversation

@carylwyatt
Copy link
Member

@carylwyatt carylwyatt commented Feb 3, 2026

Ironically, this download-with-fetch branch doesn't actually use the fetch API anywhere after we made the decision to "just use a link" for simple downloads.

The tests that are failing are imgsrv related. Some of the requests I wrote return a 403/Restricted, and I can't figure out why.

It occasionally happens in the browser when I try to download that scan as a full-resolution TIFF:
image
And the related request in my docker apache:

2026-02-19 14:43:50 127.0.0.1 - - [19/Feb/2026:19:43:50 +0000] "GET / HTTP/1.1" 200 9746 "-" "curl/8.14.1"
2026-02-19 14:43:50 172.18.0.1 - - [19/Feb/2026:19:43:50 +0000] "GET /cgi/imgsrv/download/image?id=test.pd_open&attachment=1&tracker=D3&format=image%2Ftiff&target_ppi=0&seq=2 HTTP/1.1" 403 336 "http://localhost:8080/cgi/pt?id=test.pd_open&seq=2" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/144.0.0.0 Safari/537.36"

After I added the download attribute to the download link, it stopped redirecting when I got the restricted error, and started downloading... nothing. In back-to-back attempts at an identical download of that scan, one was successful and one was not 🤷‍♀️
image

I haven't run in to this on any other volume, so I'm wondering if this is a problem with the version of imgsrv in docker? It can't just be my machine if these tests are also failing in github actions.

Following up on "simple" downloads that don't actually download

image

While I was testing the matomo tag for download errors on dev-3, I realized that there is no way to catch errors from broken links. The user makes a selection, clicks "Download," the download modal appears with the second "Download" button. They click the button, it looks like something downloaded... but it's just errors. Do we want to try to mitigate this by reverting back to fetch? It doesn't feel right to assume that as long as the link is correct that imgsrv will respond correctly 100% of the time.

Testing

This is staged on dev-3. Gayathri already approved the UI/a11y changes, but there are multiple versions of downloads you could test:

All the "simple download" options that should give you a download modal without a progress bar:

  1. single-page PDF/JPEG/TIFF/txt
  2. PDF/JPEG/txt "selected scans" of 10 or fewer pages
  3. PDF/JPEG/TIFF/txt "current scan"

everything else should trigger the "build" version of the download modal with the progress ba

  1. whole items
  2. more than 1 selected TIFF
  3. more than 10 selected pages of anything else

Conclusion

  1. what's up with imgsrv tests? maybe pull this down and run tests locally (bin/run_playwright_tests.sh all) to see if this is just a me problem
  2. do we care about not catching errors from "just use a link" method for simple downloads?
  3. please test on dev-3

@carylwyatt carylwyatt force-pushed the download-with-fetch branch 2 times, most recently from 31d0210 to 74717c5 Compare February 19, 2026 20:12
const HT = getContext('HT');

const formatTitle = {};
const formatTitle = $state({});
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated this component to svelte 5 syntax, so any variable that needs to be reactive is declared with the $state() rune.

let flattenedSelection = $state([]);
let clearSelectionLabel = $state('Clear selection');

let emptySelection = $derived(range == 'selected-pages' && selection.pages.length == 0);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I extracted the error messages and their conditionals out of the submission logic so they could be reactive and update the UI as soon as a condition is met (instead of waiting for the "Download" button to be clicked).


let allowDownload = manifest.allowSinglePageDownload || manifest.allowFullDownload;

const _mtm = (window._mtm = window._mtm || []);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added matomo to this component.


const _mtm = (window._mtm = window._mtm || []);

let simpleUrl = $derived.by(() => {
Copy link
Member Author

@carylwyatt carylwyatt Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was once a tangled mess of conditionals inside submitDownload() and now it's a $derived variable that updates whenever any of the reactive/$state() variables change. The conditionals were extracted to their own function, all this does is construct a URL and return it.

status = status;
}

function trackInterval() {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we don't really care about the tracker cookie, I removed this tracker/interval altogether.

}
}

function finalizeDownload() {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function was never called. 🤷‍♀️


let scriptEl = tunnelWindow.document.createElement('script');
scriptEl.type = 'module';
let scriptEl = document.createElement('script');
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the downloads that use the JSONP callback from imgsrv, we don't use an iframe for that request anymore, but we do use a script tag that gets appended to the <body>. It's doing essentially the same thing as before, but without with the outdated and sort-of-shady hidden iframe.


if (selection.pages.length > 0) {
selection.seq = selection.pages;
function isSimpleDownload() {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are the submission conditionals that I extracted from submitDownload().

// start the download in the iframe
let scriptEl = tunnelWindow.document.createElement('script');
scriptEl.type = 'module';
function buildCallbackDownloadUrl() {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the other half of the logic that was originally in submitDownload().

action += 'image';
sizeAttr = 'size';
sizeValue = targetPPI == '0' ? 'full' : `ppi:${targetPPI}`;
function calculateSelection() {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More conditionals that were extracted from submitDownload() and refactored a bit with reactivity in mind.


function isPartialDownload() {
return range == 'selected-pages' || range.startsWith('current-page');
function submitDownload(e) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now this function that was over 100 lines is just 24 lines long 🤩 (and most of that is making sure the spinner on the download button stops spinning after the Save As window disappears/download is finished)

range = 'selected-pages';
}
$: meta = manifest.meta($currentSeq);
$effect(() => {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a lot of new $effect()s in this component. In Svelte 5, the $effect rune is reserved for state mutations made to reactive variables. If any reactive $state() variables referenced inside the $effect change, then the whole effect block runs. This can be handy for this kind of tricky UI/state management that we do in the download form. We have so many variations of which form inputs need to be enabled/disabled, appear/disappear, change their wording/value based on the volume itself, the user's role or permissions, and what the user clicks on in the form itself, which view the user selects... it's diabolical.

Some of these effects were updated from the svelte 4 magic $ syntax (which you can see in the red side of the diff), but I added 4 new ones to handle all the state/UI updates necessary for a reactive form.

Generally, svelte suggests not mutating state inside an $effect as it can lead to complicated state management (yes, indeed), but one of the things the docs suggests to use it for is DOM manipulation, which is what we're doing here.

<label class="form-check-label" for="format-pdf"> Ebook (PDF) </label>
</div>
{#if manifest.allowFullDownload}
{#snippet icon()}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Svelte 5 swaps slots for snippets.

</p>
{/if}
</form>
<form class="d-none" bind:this={tunnelForm} method="GET" {action} target="download-module-xxx">
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Goodbye hidden form!

<!-- do something else -->
{/if}
</form>
<iframe
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Goodbye hidden iframe!

Building your {formatTitle[format]}
{#if $selected.size > 0}
({$selected.size} page{$selected.size > 1 ? 's' : ''})
{#if simpleDownload}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some template conditionals for different wording depending on simple/callback download.

await page.getByLabel('Current page scan (#1)').check();
await downloadButton.click();

const modalHeading = page.getByRole('heading', { name: 'Download your Image (JPEG)' });
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that all downloads see the modal, I added the modal portion of the download to all existing playwright tests.

expect(downloadHeaders['content-type']).toEqual('image/jpeg');
expect(downloadBody.length).toBeGreaterThan(1);
});
test('download single selected page tiff, high resolution', async ({ request, page }) => {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some new imgsrv tests after getting that 403 error for single selection, full-resolution TIFF. I'm still baffled by why this is failing.

@carylwyatt carylwyatt changed the title ETT-686: download with fetch, add error handling ETT-686: refactor download; add matomo tags and error handling Feb 20, 2026
@carylwyatt carylwyatt force-pushed the download-with-fetch branch from bf664e1 to cb2d416 Compare March 3, 2026 21:19
@carylwyatt carylwyatt marked this pull request as ready for review March 3, 2026 21:31
@carylwyatt carylwyatt requested a review from moseshll March 3, 2026 21:31
@carylwyatt carylwyatt force-pushed the download-with-fetch branch from cb2d416 to 2ae94f8 Compare March 3, 2026 21:54
@moseshll
Copy link
Contributor

moseshll commented Mar 4, 2026

@carylwyatt I'm not sure if the "tests/sidebar.spec.js:105:5 › sidebar actions › download scans › download selected scans as tiff" failures are a work in progress but they were the low hanging fruit here. The "TIFF downloads are limited to 10 page scans" notice appears to be removed, hence the failure. Commenting out await expect(page.getByText('Note: TIFF downloads are limited')).toBeVisible(); fixes that set of failures.

On to the trickier one...

@moseshll
Copy link
Contributor

moseshll commented Mar 4, 2026

@carylwyatt The "Restricted" message is originating with imgsrv/lib/SRV/Volume/Image/Bundle.pm in the _authorize routine: the specific condition is if ( $$ses{is_new} && ! $ENV{XYZZY} ) -- there was no existing session.

Does this represent a way forward? Ummm, not sure. I am tremendously ignorant of session management under Playwright.

sub _authorize {
my $self = shift;
my $env = shift;
$self->SUPER::_authorize($env);
unless ( $self->restricted ) {
# technically the user has access but we need to
# limit resources for bundling to users in a current session
# unless you're using XYZZY=1 on the command line
my $C = $$env{'psgix.context'};
my $ses = $C->get_object('Session');
if ( $$ses{is_new} && ! $ENV{XYZZY} ) { $self->restricted(1); }
elsif ( $self->format eq 'image/tiff' && $self->total_pages > 10 ) {
$self->restricted(1);
}
}
}

@carylwyatt
Copy link
Member Author

carylwyatt commented Mar 4, 2026

there was no existing session.

@moseshll What's so weird about this test failure is that it starts exactly like all the other imgsrv tests by going to the browser and accepting the cookies via the cookie banner, which should successfully set up the session. It then makes a GET request to imgsrv and checks the response.

I opened the playwright error trace to double check it is indeed going to pageturner and accepting the cookies, so why does imgsrv think there is no session, but only for this one test and not the others? It's so consistent and very weird!

As for the other test failure, I cleaned up the other tests that referenced that TIFF error but must've forgotten one! Thanks!

@carylwyatt
Copy link
Member Author

carylwyatt commented Mar 4, 2026

@moseshll I used claude to help me understand the perl from that routine (thanks for the breadcrumb!), and I see now that it's expecting to not have a new/not-empty session. What fixed the tests was to change how the request was made. Playwright has a browser context and an HTTP context. The cookie banner click that accepts the cookies/creates new cookies is in the browser context, but the imgsrv request was being made with the HTTP context. I updated the imgsrv requests to use the page/browser context where the cookies were accepted, and this whole issue went away. I updated the other tests that I added to diagnose this problem, and they all pass now, too. 🥳

@moseshll
Copy link
Contributor

moseshll commented Mar 4, 2026

@carylwyatt I was about to mention a discovery to you, and then I got caught up in the whole Jira API token discussion with Ryan... yes I had discovered that if you commented out the test.beforeEach it had no effect, so I was muddling along on sorta the right track. Glad you got the tests sorted. I'll have more time to devote to following up on the changes tomorrow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants