Skip to content

Refactor HREF normalization and models #358

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 29 commits into from
Nov 20, 2023
Merged

Refactor HREF normalization and models #358

merged 29 commits into from
Nov 20, 2023

Conversation

mickael-menu
Copy link
Member

@mickael-menu mickael-menu commented Nov 10, 2023

Changelog

Changed

  • Many APIs now expect one of the new URL types (RelativeURL, AbsoluteURL, HTTPURL and FileURL). This is helpful because:
    • It validates at compile time that we provide a URL that is supported.
    • The API's capabilities are better documented, e.g. a download API could look like this : download(url: HTTPURL) -> FileURL.

Shared

  • Link and Locator's href are normalized as valid URLs to improve interoperability with the Readium Web toolkits.
    • You MUST migrate your database if you were persisting HREFs and Locators. Take a look at the migration guide for guidance.
  • Links are not resolved to the self URL of a manifest anymore. However, you can still normalize the HREFs yourselves by calling Manifest.normalizeHREFsToSelf().
  • Publication.localizedTitle is now optional, as we cannot guarantee a publication will always have a title (e.g. a CBZ).

Migration guide

Migration of HREFs and Locators (bookmarks, annotations, etc.)

⚠️ This requires a database migration in your application, if you were persisting Locator objects.

In Readium v2.x, a Link or Locator's href could be either:

  • a valid absolute URL for a streamed publication, e.g. https://domain.com/isbn/dir/my%20chapter.html,
  • a percent-decoded path for a local archive such as an EPUB, e.g. /dir/my chapter.html.
    • Note that it was relative to the root of the archive (/).

To improve the interoperability with other Readium toolkits (in particular the Readium Web Toolkits, which only work in a streaming context) Readium v3 now generates and expects valid URLs for Locator and Link's href.

  • https://domain.com/isbn/dir/my%20chapter.html is left unchanged, as it was already a valid URL.
  • /dir/my chapter.html becomes the relative URL path dir/my%20chapter.html
    • We dropped the / prefix to avoid issues when resolving to a base URL.
    • Special characters are percent-encoded.

You must migrate the HREFs or Locators stored in your database when upgrading to Readium 3. To assist you, two helpers are provided: AnyURL(legacyHREF:) and Locator(legacyJSONString:).

Here's an example of a GRDB migration that can serve as inspiration:

migrator.registerMigration("normalizeHREFs") { db in
   let normalizedRows: [(id: Int, href: String, locator: String)] =
       try Row.fetchAll(db, sql: "SELECT id, href, locator FROM bookmarks")
           .compactMap { row in
               guard
                   let normalizedHREF = AnyURL(legacyHREF: row["href"])?.string,
                   let normalizedLocator = try Locator(legacyJSONString: row["locator"])?.jsonString
               else {
                   return nil
               }
               return (row["id"], normalizedHREF, normalizedLocator)
           }

   let updateStmt = try db.makeStatement(sql: "UPDATE bookmarks SET href = :href, locator = :locator WHERE id = :id")
   for (id, href, locator) in normalizedRows {
       try updateStmt.execute(arguments: [
           "id": id,
           "href": href
           "locator": locator
       ])
   }
}

Kotlin PR: readium/kotlin-toolkit#387

@mickael-menu mickael-menu marked this pull request as ready for review November 20, 2023 10:40
@CodiumAI-Agent
Copy link

PR Analysis

  • 🎯 Main theme: Refactoring HREF normalization and models
  • 📝 PR summary: This PR refactors the HREF normalization and models in the codebase. It introduces new URL types (RelativeURL, AbsoluteURL, HTTPURL and FileURL) and modifies the href in Link and Locator to be normalized as valid URLs. It also includes changes to the tests to reflect these modifications.
  • 📌 Type of PR: Refactoring
  • 🧪 Relevant tests added: True
  • ⏱️ Estimated effort to review [1-5]: 4, because the PR includes a significant amount of changes across multiple files, including core components like URL handling and href normalization. The changes are not trivial and require a good understanding of the existing codebase to review effectively.
  • 🔒 Security concerns: No security concerns found

PR Feedback

  • 💡 General suggestions: The changes in this PR are quite extensive and touch on core components of the codebase. It would be beneficial to ensure that these changes are thoroughly tested, not just through unit tests but also through integration tests and manual testing, to ensure that they do not introduce any regressions.

  • 🤖 Code feedback:

    • relevant file: Tests/SharedTests/Publication/ManifestTests.swift
      suggestion: Ensure that the changes to the hrefs in the tests reflect the intended behavior of the new URL types and href normalization. [important]
      relevant line: ["href": "manifest.json", "rel": "self"],

    • relevant file: Sources/Navigator/PDF/PDFNavigatorViewController.swift
      suggestion: It would be beneficial to add error handling for the case where the server fails to remove the endpoint. This could prevent potential issues down the line. [important]
      relevant line: try? server?.remove(at: endpoint)

    • relevant file: Sources/Navigator/PDF/PDFNavigatorViewController.swift
      suggestion: It might be beneficial to add a comment explaining why the setupPDFView method is now unavailable and how the PDFNavigatorDelegate should be used instead. This could help future developers understand the changes. [medium]
      relevant line: @available(*, unavailable, message: "Override PDFNavigatorDelegate instead")

    • relevant file: Sources/Navigator/PDF/PDFNavigatorViewController.swift
      suggestion: Consider adding a comment explaining the reasoning behind the isPDFFile check. This could help future developers understand the code. [medium]
      relevant line: private lazy var isPDFFile: Bool =

How to use

Instructions

Tag me in a comment '@CodiumAI-Agent' and add one of the following commands:
/review: Request a review of your Pull Request.
/describe: Update the PR title and description based on the contents of the PR.
/improve [--extended]: Suggest code improvements. Extended mode provides a higher quality feedback.
/ask <QUESTION>: Ask a question about the PR.
/update_changelog: Update the changelog based on the PR's contents.
/add_docs: Generate docstring for new components introduced in the PR.
/generate_labels: Generate labels for the PR based on the PR's contents.
see the tools guide for more details.

To edit any configuration parameter from the configuration.toml, add --config_path=new_value.
For example: /review --pr_reviewer.extra_instructions="focus on the file: ..."
To list the possible configuration parameters, add a /config comment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants