@chaban-mb
Code and description are AI-generated

This PR fixes issues when importing releases from the Internet Archive (Wayback Machine), specifically targeting Chrome compatibility and URL sanitization. #466

Note

For this to work reliably in Chrome with Tampermonkey, the Content Script API setting needs to be set to "UserScripts API Dynamic".

The Problem

  1. Chrome Compatibility: The script previously relied on the beforescriptexecute event to patch the Wayback Machine's rewriter (wombat.js). This event is non-standard: only Firefox implements it, and it was removed from the HTML5 spec. On Chrome/Chromium the listener never fired, so the patch failed silently and the "Import" button broke on archived pages.
  2. Dirty URLs: When importing from an archive, the generated MusicBrainz data often retained the web.archive.org prefix in the Release URL, Label URL, and License URL (e.g., https://web.archive.org/web/.../https://...).

The Solution

  1. Wombat Patch
    Replaced the Firefox-only beforescriptexecute listener with an Object.defineProperty hook. This intercepts the creation of the global _WBWombat object, allowing us to inject no_rewrite_prefixes (excluding MusicBrainz URLs) safely on all browsers, including Chrome, before the archiver initializes.

  2. URL Cleaning

    • Updated String.prototype.fix_bandcamp_url with a regex to detect and strip Wayback Machine prefixes (/web/YYYYMMDD.../), ensuring all imported URLs are "clean" and use HTTPS.
    • Applied fix_bandcamp_url() to the License (Creative Commons) link and Label back-links, which were previously extracting the raw, "dirty" DOM attributes.
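
The prefix stripping described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `stripWaybackPrefix` is a hypothetical standalone helper (the PR extends `String.prototype.fix_bandcamp_url` instead), and the regex assumes Wayback prefixes of the form `/web/<digits>/`, optionally followed by a two-letter modifier such as `id_`.

```javascript
// Hypothetical helper for illustration; not the PR's actual implementation.
function stripWaybackPrefix(url) {
  // Assumed prefix shapes:
  //   https://web.archive.org/web/20200101000000/
  //   //web.archive.org/web/20200101000000id_/
  //   /web/20200101000000/            (relative link on an archived page)
  const waybackPrefix =
    /^(?:(?:https?:)?\/\/web\.archive\.org)?\/web\/\d+(?:[a-z]{2}_)?\//i;

  return url
    .replace(waybackPrefix, '')          // drop the archive wrapper, if any
    .replace(/^http:\/\//i, 'https://'); // force HTTPS on the original URL
}

const archived =
  'https://web.archive.org/web/20200101000000/http://artist.bandcamp.com/album/x';
console.log(stripWaybackPrefix(archived));
// https://artist.bandcamp.com/album/x
```

URLs that are not wrapped by the archive pass through unchanged apart from the HTTPS upgrade, so the same helper could be applied to release, label, and license links alike.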

Changes

  • Updated fix_bandcamp_url to strip archive prefixes.
  • Applied URL cleaning to ccIcons (License) and labelbacklink extraction.
  • Replaced the window.location.hostname === 'web.archive.org' logic block with the new Object.defineProperty implementation.
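
The Object.defineProperty approach from the last item can be sketched as below. This is an illustration under stated assumptions rather than the PR's code: `installWombatHook`, `target`, and `extraPrefixes` are made-up names, and the constructor arguments `(win, wbinfo)` and the `wbinfo.wombat_opts` shape are assumptions about how wombat.js is initialized.

```javascript
// Sketch only: names and the wombat.js argument shape are assumptions.
function installWombatHook(target, extraPrefixes) {
  let current; // value handed back to anyone reading target._WBWombat

  Object.defineProperty(target, '_WBWombat', {
    configurable: true,
    enumerable: true,
    get() {
      return current;
    },
    set(value) {
      if (typeof value !== 'function') {
        current = value; // nothing to wrap
        return;
      }
      // Wrap the constructor so the options are amended before it runs.
      current = function (win, wbinfo, ...rest) {
        if (wbinfo && wbinfo.wombat_opts) {
          const opts = wbinfo.wombat_opts;
          opts.no_rewrite_prefixes = [
            ...(opts.no_rewrite_prefixes || []),
            ...extraPrefixes,
          ];
        }
        // An object returned from a constructor call becomes the result of `new`.
        return new value(win, wbinfo, ...rest);
      };
    },
  });
}
```

In a userscript this would presumably be called with the page's `window` and a MusicBrainz prefix, for example `installWombatHook(window, ['https://musicbrainz.org/'])`. The hook has to be installed before the archive's script assigns the global, otherwise the setter never sees the assignment.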

Testing

  • Tested on Chrome (Tampermonkey) via Wayback Machine.
  • Verified that the "Import" button successfully opens the MusicBrainz editor.
  • Verified that the Release URL, Label URL, and License URL in the editor are the original Bandcamp URLs, not Archive.org links.

…g URLs

- Replace Firefox-specific `beforescriptexecute` listener with `Object.defineProperty` to correctly patch `_WBWombat` on Chrome and Chromium-based browsers.
- Update `fix_bandcamp_url` with regex to detect and remove Wayback Machine prefixes (e.g. `web.archive.org/...`) from release, label, and license URLs.
- Ensure the "Import" button redirects correctly and that all imported metadata uses clean, original Bandcamp URLs instead of archive links.