Skip to content

Fix saxon configuration issue in dspace oai module#1041

Merged
milanmajchrak merged 9 commits intodtq-devfrom
fix-saxon-configuration-issue-in-dspace-oai-module
Aug 18, 2025
Merged

Fix saxon configuration issue in dspace oai module#1041
milanmajchrak merged 9 commits intodtq-devfrom
fix-saxon-configuration-issue-in-dspace-oai-module

Conversation

@Paurikova2
Copy link
Collaborator

@Paurikova2 Paurikova2 commented Aug 14, 2025

Phases MP MM MB MR JM Total
ETA 3 0 0 0 0 0
Developing 16 0 0 0 0 0
Review 0 0 0 0 0 0
Total - - - - - 0
ETA est. 0
ETA cust. - - - - - 0

Problem description

After migration DSpace from version 7.6.1 top 7.6.5 there is error:
Error: I migrated Dspace from version 7.6.1 to 7.6.5 and in docker i got error> 2025-08-13 06:33:14,849 http-nio-8080-exec-141 ERROR unknown 2dea7e02-70e3-4eb6-b72d-0702df93d559 org.dspace.xoai.controller.DSpaceOAIDataProvider @ Node returned by extension function was created with an incompatible Configuration
com.lyncode.xoai.dataprovider.exceptions.OAIException: Node returned by extension function was created with an incompatible Configuration
at com.lyncode.xoai.dataprovider.handlers.GetRecordHandler.handle(GetRecordHandler.java:72)
at com.lyncode.xoai.dataprovider.OAIDataProvider.handle(OAIDataProvider.java:207)
at org.dspace.xoai.controller.DSpaceOAIDataProvider.contextAction(DSpaceOAIDataProvider.java:131)
at jdk.internal.reflect.GeneratedMethodAccessor636.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205)
at
.....

Summary by CodeRabbit

  • Bug Fixes

    • XSLT/XDM functions are more resilient: nulls and per-parameter errors now produce empty results and no longer break processing.
  • Refactor

    • XML/XSLT processing consolidated on a shared processor for consistent extension support and safer document handling.
    • Improved debug/warning/error logging for clearer diagnostics and fault tolerance.
  • Chores

    • Build script updated to deploy the OAI module version 7.6.5.

@coderabbitai
Copy link

coderabbitai bot commented Aug 14, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds a SharedSaxonProcessor for centralized Saxon Processor and TransformerFactory initialization and access; updates DSpaceResourceResolver to initialize and use it; refactors ListXslFunction, NodeListXslFunction, and NodeXslFunction for safer parameter handling, logging, and Xdm-based results; bumps a build script JAR version.

Changes

Cohort / File(s) Summary
Shared Saxon infrastructure
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/SharedSaxonProcessor.java, dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/DSpaceResourceResolver.java
Adds SharedSaxonProcessor with synchronized initialize(TransformerFactory), cached Processor and SaxonTransformerFactory, and accessors; updates DSpaceResourceResolver to call SharedSaxonProcessor.initialize(...), register extension functions via SharedSaxonProcessor.getProcessor(), and create Templates via SharedSaxonProcessor.getTransformerFactory().newTemplates(...) (removes direct SaxonTransformerFactory import/usage).
List function hardening & logging
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/ListXslFunction.java
Adds Log4j logger, stronger null/empty checks, per-parameter try/catch with warnings, StringBuilder accumulation skipping nulls, debug summary logging; method signature and annotations standardized.
Node list returns Xdm sequence
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeListXslFunction.java
Replaces DOM assembly with stream-based XdmValue sequence of XdmAtomicValue items; returns XdmEmptySequence on empty/error; tightens input checks and logging; removes DOM imports; updates method declaration order/visibility.
Node function uses shared builder
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeXslFunction.java
Builds XdmNode via SharedSaxonProcessor.getProcessor().newDocumentBuilder().build(new DOMSource(node)); returns XdmEmptySequence on empty/error; changes result occurrence to ZERO_OR_ONE; adds logging; removes per-call DocumentBuilderFactory usage; updates method declaration order/visibility.
Build script jar bump
scripts/fast-build/oai-pmh-package-update.bat
Changes copied JAR from dspace-oai-7.6.1.jar to dspace-oai-7.6.5.jar.

Sequence Diagram(s)

sequenceDiagram
  participant Resolver as DSpaceResourceResolver
  participant Shared as SharedSaxonProcessor
  participant Ext as ExtensionFunctionRegistry

  Resolver->>Shared: initialize(transformerFactory)
  Shared-->>Resolver: Processor & TransformerFactory ready
  loop register extensions
    Resolver->>Shared: getProcessor()
    Shared-->>Resolver: Processor
    Resolver->>Ext: registerExtensionFunction(...) using Processor
  end
  Resolver->>Shared: getTransformerFactory().newTemplates(src)
  Shared-->>Resolver: Templates
Loading
sequenceDiagram
  participant XSLT as XSLT Runtime
  participant NodeF as NodeXslFunction
  participant Shared as SharedSaxonProcessor

  XSLT->>NodeF: call(params)
  alt params empty/invalid
    NodeF-->>XSLT: XdmEmptySequence
  else valid
    NodeF->>Shared: getProcessor() -> newDocumentBuilder().build(DOMSource)
    Shared-->>NodeF: XdmNode
    NodeF-->>XSLT: XdmNode
  end
Loading
sequenceDiagram
  participant XSLT as XSLT Runtime
  participant ListF as NodeListXslFunction

  XSLT->>ListF: call(param)
  alt no items / error
    ListF-->>XSLT: XdmEmptySequence
  else items found
    ListF-->>XSLT: XdmValue (sequence of strings)
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~40 minutes

Poem

I nibble through code with whiskered grace,
One SharedSaxon burrow finds its place.
Lists and nodes now hop safe and sound,
Empty sequences when no carrots found.
The jar hops up — 7.6.5 is crowned. 🐇


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these settings in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 9d33d55 and f7e1505.

📒 Files selected for processing (2)
  • dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/SharedSaxonProcessor.java (1 hunks)
  • dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/ListXslFunction.java (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/ListXslFunction.java
  • dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/SharedSaxonProcessor.java
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: dspace / docker-build (linux/amd64, ubuntu-latest, true)
  • GitHub Check: dspace-cli / docker-build (linux/amd64, ubuntu-latest, true)
  • GitHub Check: dspace-test / docker-build (linux/amd64, ubuntu-latest, true)
  • GitHub Check: Run Integration Tests
  • GitHub Check: Run Unit Tests
✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix-saxon-configuration-issue-in-dspace-oai-module

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Nitpick comments (3)
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/SharedSaxonProcessor.java (1)

24-26: Consider making fields final after initialization

While the current implementation ensures thread-safe initialization through synchronization, the static fields could benefit from additional protection against accidental modification.

Consider using a holder pattern or volatile fields with double-checked locking for better thread safety:

-    private static SaxonTransformerFactory saxonTransformerFactory;
-    private static Processor sharedProcessor;
-    private static DocumentBuilder sharedDocumentBuilder;
+    private static volatile SaxonTransformerFactory saxonTransformerFactory;
+    private static volatile Processor sharedProcessor;
+    private static volatile DocumentBuilder sharedDocumentBuilder;
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/DSpaceResourceResolver.java (1)

48-50: Consider handling initialization failures

The static initializer could fail if the transformer factory is not Saxon-compatible. Consider adding error handling.

     static {
-        // Initialize the shared processor first
-        SharedSaxonProcessor.initialize(transformerFactory);
+        // Initialize the shared processor first
+        try {
+            SharedSaxonProcessor.initialize(transformerFactory);
+        } catch (Exception e) {
+            throw new ExceptionInInitializerError("Failed to initialize SharedSaxonProcessor: " + e.getMessage());
+        }
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeXslFunction.java (1)

56-65: Tighten result/argument types to reflect actual behavior (optional)

This function returns at most one node (or empty). Consider narrowing the types to improve static checking in Saxon.

     @Override
     final public SequenceType getResultType() {
-        return SequenceType.makeSequenceType(ItemType.ANY_NODE, OccurrenceIndicator.ZERO_OR_MORE);
+        return SequenceType.makeSequenceType(ItemType.ANY_NODE, OccurrenceIndicator.ZERO_OR_ONE);
     }
@@
     @Override
     final public SequenceType[] getArgumentTypes() {
         return new SequenceType[]{
                 SequenceType.makeSequenceType(
-                        ItemType.STRING, OccurrenceIndicator.ZERO_OR_MORE)};
+                        ItemType.STRING, OccurrenceIndicator.ZERO_OR_ONE)};
     }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these settings in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 2b093a0 and 2de4cbc.

📒 Files selected for processing (6)
  • dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/DSpaceResourceResolver.java (3 hunks)
  • dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/SharedSaxonProcessor.java (1 hunks)
  • dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/ListXslFunction.java (3 hunks)
  • dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeListXslFunction.java (3 hunks)
  • dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeXslFunction.java (3 hunks)
  • scripts/fast-build/oai-pmh-package-update.bat (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/DSpaceResourceResolver.java (1)
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/SharedSaxonProcessor.java (1)
  • SharedSaxonProcessor (22-74)
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeXslFunction.java (1)
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/SharedSaxonProcessor.java (1)
  • SharedSaxonProcessor (22-74)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: dspace-cli / docker-build (linux/amd64, ubuntu-latest, true)
  • GitHub Check: dspace-test / docker-build (linux/amd64, ubuntu-latest, true)
  • GitHub Check: dspace / docker-build (linux/amd64, ubuntu-latest, true)
  • GitHub Check: Run Integration Tests
  • GitHub Check: Run Unit Tests
🔇 Additional comments (12)
scripts/fast-build/oai-pmh-package-update.bat (1)

11-11: POMs updated to 7.6.5 — please verify the built JAR exists

Short check: root and module POMs were updated to 7.6.5 (pom.xml, dspace/modules/pom.xml, dspace-oai/pom.xml, dspace/pom.xml, dspace-server-webapp/pom.xml); no 7.6.1 references found. The script still expects the dspace-oai JAR — confirm the build produces it.

Files to verify:

  • scripts/fast-build/oai-pmh-package-update.bat — line copying the JAR:
xcopy /e /h /i /q /y %dspace_source%\dspace-oai\target\dspace-oai-7.6.5.jar %tomcat%\webapps\server\WEB-INF\lib\
  • pom.xml (root) and module poms listed above — already show 7.6.5.

Action required: run the build and confirm dspace-oai/target/dspace-oai-7.6.5.jar exists. If the produced artifact has a different finalName, update the batch script accordingly.

dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/ListXslFunction.java (3)

23-23: LGTM! Good addition of logging infrastructure

The addition of the logger is appropriate for debugging XSL function execution.

Also applies to: 32-32


57-60: Good defensive programming with null checks

The null/empty parameter handling with appropriate debug logging is a solid improvement.


62-85: Excellent error handling and logging improvements

The refactored implementation with StringBuilder and multi-level try-catch blocks provides:

  • Better performance than string concatenation
  • Resilient per-parameter error handling that doesn't abort the entire operation
  • Comprehensive logging at appropriate levels (debug, warn, error)
  • Graceful failure with empty string return instead of propagating exceptions

This is a significant improvement in robustness for XSLT transformations.

dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/SharedSaxonProcessor.java (1)

46-51: Good defensive programming with state validation

The consistent use of IllegalStateException with clear error messages for uninitialized access is excellent.

Also applies to: 57-62, 68-73

dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeListXslFunction.java (3)

26-26: Good use of XdmEmptySequence for empty results

The addition of XdmEmptySequence import aligns well with the new error handling approach.


49-52: Proper result type declaration

The getResultType() method correctly declares the function returns zero or more string items, which matches the implementation.


63-100: Excellent refactoring to XDM-based approach

The transformation from DOM-based document construction to XDM value sequences is a significant improvement:

  • More efficient memory usage
  • Better Saxon API alignment
  • Comprehensive error handling with appropriate logging
  • Consistent use of XdmEmptySequence for empty/error cases
  • Null-safe item filtering

This aligns perfectly with the centralized Saxon processor architecture.

dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/DSpaceResourceResolver.java (3)

48-50: Good integration with SharedSaxonProcessor

The initialization of SharedSaxonProcessor in the static block ensures it's ready before extension function registration.


63-66: Proper use of shared processor for extension registration

Using SharedSaxonProcessor.getProcessor().registerExtensionFunction() ensures all extension functions are registered with the same Saxon processor instance, preventing configuration conflicts.


91-94: Consistent use of shared transformer factory

Using SharedSaxonProcessor.getTransformerFactory().newTemplates() ensures templates are created with the same Saxon configuration that has the extension functions registered.

dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeXslFunction.java (1)

105-110: Robust error handling and graceful fallback

Catching and logging errors, then returning XdmEmptySequence to keep XSLT processing resilient, looks good.

@Paurikova2 Paurikova2 requested a review from Copilot August 14, 2025 09:53

This comment was marked as outdated.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (3)
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeXslFunction.java (3)

22-28: Fix import order (build is failing) and remove BouncyCastle Arrays; use JDK Arrays

  • Checkstyle failure: XdmEmptySequence must be imported before XdmValue.
  • Remove org.bouncycastle.util.Arrays (nonstandard here) and use java.util.Arrays instead.

Apply:

-import java.util.Objects;
+import java.util.Objects;
+import java.util.Arrays;
-import net.sf.saxon.s9api.XdmNode;
-import net.sf.saxon.s9api.XdmValue;
-import net.sf.saxon.s9api.XdmEmptySequence;
+import net.sf.saxon.s9api.XdmEmptySequence;
+import net.sf.saxon.s9api.XdmNode;
+import net.sf.saxon.s9api.XdmValue;
-import org.bouncycastle.util.Arrays;

61-65: Harden argument validation; handle empty args and drop BC dependency

Current check can NPE on empty args and relies on BC’s Arrays. Use JDK and explicitly handle empty/zero-length inputs.

-        if (Objects.isNull(xdmValues) || Arrays.isNullOrContainsNull(xdmValues)) {
+        if (xdmValues == null
+                || xdmValues.length == 0
+                || Arrays.stream(xdmValues).anyMatch(Objects::isNull)
+                || xdmValues[0].size() == 0) {
             log.debug("Null or empty parameters passed to {}, returning empty sequence", getFnName());
             return XdmEmptySequence.getInstance();
         }

84-89: Do not reuse a shared Saxon DocumentBuilder (not thread-safe); create a fresh builder per call

Saxon’s s9api DocumentBuilder is not thread-safe. Using SharedSaxonProcessor.getDocumentBuilder() (which returns a shared instance) can trigger race conditions under concurrent XSLT calls.

-            // Use the shared document builder to ensure configuration compatibility
-            // and wrap in try-catch to handle any remaining configuration issues
-            XdmNode xdmNode = SharedSaxonProcessor.getDocumentBuilder().build(new DOMSource(node));
-            log.debug("Function {} successfully processed parameter '{}' and returned node", getFnName(), val);
+            // Create a fresh DocumentBuilder per call; Saxon's DocumentBuilder is not thread-safe
+            XdmNode xdmNode = SharedSaxonProcessor.getProcessor()
+                .newDocumentBuilder()
+                .build(new DOMSource(node));
+            log.debug(
+                "Function {} successfully processed parameter '{}' and returned node",
+                getFnName(), val);
             return xdmNode;

Follow-up (outside this file): consider changing SharedSaxonProcessor to return a new DocumentBuilder each time (or expose a factory) to make this pattern standard across call sites.

🧹 Nitpick comments (2)
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeXslFunction.java (2)

67-74: Avoid exception-driven control flow for parameter extraction

Given the strengthened precondition, you can simplify and avoid try/catch here.

-        String val;
-        try {
-            val = xdmValues[0].itemAt(0).getStringValue();
-        } catch (Exception e) {
-            // e.g. when no parameter is passed and xdmValues[0] ends with index error
-            log.warn("Empty value in call of function {}, returning empty sequence", getFnName());
-            return XdmEmptySequence.getInstance();
-        }
+        // At this point, xdmValues[0] is guaranteed non-null and non-empty by the guard above
+        String val = xdmValues[0].itemAt(0).getStringValue();

90-95: Attach exception to error log and drop redundant debug

Logging the exception directly on the error entry captures the stack trace without needing a separate debug call.

-        } catch (Exception e) {
-            log.error("Error in function {} processing parameter '{}': {}", getFnName(), val, e.getMessage());
-            log.debug("Full stack trace for function {} error:", getFnName(), e);
-            // Return empty sequence on any error to prevent XSLT processing failure
-            return XdmEmptySequence.getInstance();
-        }
+        } catch (Exception e) {
+            log.error("Error in function {} processing parameter '{}'", getFnName(), val, e);
+            // Return empty sequence on any error to prevent XSLT processing failure
+            return XdmEmptySequence.getInstance();
+        }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these settings in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 2de4cbc and 23a0781.

📒 Files selected for processing (1)
  • dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeXslFunction.java (3 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeXslFunction.java (1)
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/SharedSaxonProcessor.java (1)
  • SharedSaxonProcessor (22-74)
🪛 GitHub Actions: Build
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeXslFunction.java

[error] 24-24: CustomImportOrder: Wrong lexicographical order for 'net.sf.saxon.s9api.XdmEmptySequence' import. Should be before 'net.sf.saxon.s9api.XdmValue'.


[error] 80-80: LineLength: Line is longer than 120 characters.

@Paurikova2 Paurikova2 requested a review from Copilot August 14, 2025 11:04

This comment was marked as outdated.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (3)
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/SharedSaxonProcessor.java (1)

36-38: Type validation on TransformerFactory — good addition.

The instanceof check with a clear IllegalArgumentException resolves the prior risk of ClassCastException during initialization.

dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeXslFunction.java (2)

14-14: Replaced BC Arrays with JDK Arrays and hardened null/empty checks — looks good.

Using java.util.Arrays with a proper null/empty guard is the right move and removes an unnecessary dependency.

Also applies to: 62-65


80-81: Fix line-length violation in debug log (break into multiple lines).

This exceeds the 120-char limit and will break Checkstyle. Split the call across lines.

-                log.debug("Function {} returned null node for parameter '{}', returning empty sequence", getFnName(), val);
+                log.debug(
+                    "Function {} returned null node for parameter '{}', returning empty sequence",
+                    getFnName(), val
+                );
🧹 Nitpick comments (4)
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/SharedSaxonProcessor.java (1)

49-49: Fix minor typo: double space in method signature.

Minor cleanup for readability.

-    public static  Processor getProcessor() {
+    public static Processor getProcessor() {
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeXslFunction.java (3)

49-51: Tighten function signature: return and arg cardinality should be ZERO_OR_ONE.

The function returns either a single node or empty sequence; likewise it reads only the first string value if provided. Reflect this in the declared types.

     final public SequenceType getResultType() {
-        return SequenceType.makeSequenceType(ItemType.ANY_NODE, OccurrenceIndicator.ZERO_OR_MORE);
+        return SequenceType.makeSequenceType(ItemType.ANY_NODE, OccurrenceIndicator.ZERO_OR_ONE);
     }
@@
     final public SequenceType[] getArgumentTypes() {
         return new SequenceType[]{
                 SequenceType.makeSequenceType(
-                        ItemType.STRING, OccurrenceIndicator.ZERO_OR_MORE)};
+                        ItemType.STRING, OccurrenceIndicator.ZERO_OR_ONE)};
     }

Also applies to: 55-58


84-87: Clarify comment: we now create a fresh DocumentBuilder per call.

The comment block mentions “shared document builder” but the implementation correctly creates a new builder each time. Align the comment with the code.

-            // Use the shared document builder to ensure configuration compatibility
-            // and wrap in try-catch to handle any remaining configuration issues
-            // Create a fresh DocumentBuilder per call; Saxon's DocumentBuilder is not thread-safe
+            // Create a fresh DocumentBuilder per call (Saxon's DocumentBuilder is not thread-safe)
+            // Using the shared Processor from SharedSaxonProcessor to ensure consistent configuration

61-65: Optional: avoid exception-driven control flow for empty/missing args.

Current try/catch is fine, but you can pre-check size to avoid exceptions on common empty cases.

-        String val;
-        try {
-            val = xdmValues[0].itemAt(0).getStringValue();
-        } catch (Exception e) {
-            // e.g. when no parameter is passed and xdmValues[0] ends with index error
-            log.warn("Empty value in call of function {}, returning empty sequence", getFnName());
-            return XdmEmptySequence.getInstance();
-        }
+        String val;
+        if (xdmValues[0].size() == 0) {
+            log.warn("Empty value in call of function {}, returning empty sequence", getFnName());
+            return XdmEmptySequence.getInstance();
+        }
+        val = xdmValues[0].itemAt(0).getStringValue();

Also applies to: 69-74

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these settings in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 23a0781 and c172bd7.

📒 Files selected for processing (3)
  • dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/SharedSaxonProcessor.java (1 hunks)
  • dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeListXslFunction.java (3 hunks)
  • dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeXslFunction.java (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeListXslFunction.java
🧰 Additional context used
🧬 Code Graph Analysis (1)
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeXslFunction.java (1)
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/SharedSaxonProcessor.java (1)
  • SharedSaxonProcessor (22-77)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: dspace-test / docker-build (linux/amd64, ubuntu-latest, true)
  • GitHub Check: dspace / docker-build (linux/amd64, ubuntu-latest, true)
  • GitHub Check: dspace-cli / docker-build (linux/amd64, ubuntu-latest, true)
  • GitHub Check: Run Unit Tests
  • GitHub Check: Run Integration Tests
🔇 Additional comments (2)
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeXslFunction.java (2)

77-92: LGTM on using SharedSaxonProcessor and per-call builder.

This avoids per-call Processor creation and the non-thread-safe builder reuse. Good error handling and logging.


1-100: Verified: SharedSaxonProcessor is initialized in DSpaceResourceResolver static block

rg search shows SharedSaxonProcessor.initialize(...) is invoked in:

  • dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/DSpaceResourceResolver.java:49 (static initializer calls SharedSaxonProcessor.initialize(transformerFactory))

Relevant usage:

  • dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeXslFunction.java — calls SharedSaxonProcessor.getProcessor() in call(...)

Additional checks:

  • No direct "new Processor(" occurrences found.
  • No matches for SharedSaxonProcessor.getProcessor/getDocumentBuilder/getTransformerFactory were found elsewhere.

Conclusion: initialization is handled in DSpaceResourceResolver's static block; no change required.

@Paurikova2 Paurikova2 requested a review from Copilot August 14, 2025 12:10

This comment was marked as outdated.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🔭 Outside diff range comments (1)
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/ListXslFunction.java (1)

13-24: Drop org.bouncycastle.util.Arrays import — use JDK Arrays.

Avoid unnecessary crypto dependency for simple array utilities.

 import java.util.Objects;
+import java.util.Arrays;
@@
-import org.apache.logging.log4j.Logger;
-
-import org.bouncycastle.util.Arrays;
+import org.apache.logging.log4j.Logger;
♻️ Duplicate comments (3)
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/SharedSaxonProcessor.java (1)

28-30: Ensure safe publication: mark shared fields volatile (avoid stale reads across threads).

Getters are unsynchronized, so other threads may observe stale values without volatile. Initialize remains synchronized; volatile on both fields guarantees visibility after initialization.

-    private static SaxonTransformerFactory saxonTransformerFactory;
-    private static Processor sharedProcessor;
+    private static volatile SaxonTransformerFactory saxonTransformerFactory;
+    private static volatile Processor sharedProcessor;
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeXslFunction.java (2)

5-20: Remove Bouncy Castle Arrays and use JDK Arrays for null/empty checks.

Use standard JDK utilities; drop an unnecessary dependency.

 import java.util.Objects;
+import java.util.Arrays;
 import javax.xml.transform.dom.DOMSource;
@@
 import net.sf.saxon.s9api.XdmValue;
 import org.apache.logging.log4j.Logger;
-import org.bouncycastle.util.Arrays;
 import org.dspace.xoai.services.impl.resources.SharedSaxonProcessor;

50-54: Harden arguments check and handle empty-args explicitly (no BC dependency).

Current check misses empty array case and relies on BC Arrays. Replace with JDK-based guard.

-    public final XdmValue call(XdmValue[] xdmValues) throws SaxonApiException {
-        if (Objects.isNull(xdmValues) || Arrays.isNullOrContainsNull(xdmValues)) {
-            log.debug("Null or empty parameters passed to {}, returning empty sequence", getFnName());
-            return XdmEmptySequence.getInstance();
-        }
+    public final XdmValue call(XdmValue[] xdmValues) throws SaxonApiException {
+        if (xdmValues == null || xdmValues.length == 0
+                || Arrays.stream(xdmValues).anyMatch(Objects::isNull)) {
+            log.debug("Null or empty parameters passed to {}, returning empty sequence", getFnName());
+            return XdmEmptySequence.getInstance();
+        }
🧹 Nitpick comments (6)
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/SharedSaxonProcessor.java (2)

26-27: Prevent instantiation of utility class.

Add a private constructor to make the intent explicit and avoid accidental instantiation.

 public class SharedSaxonProcessor {
 
+    private SharedSaxonProcessor() {
+        // utility class
+    }

74-79: Minor: reuse getProcessor() guard in getDocumentBuilder().

Keeps the null-check centralized and consistent.

-    public static DocumentBuilder getDocumentBuilder() {
-        if (sharedProcessor == null) {
-            throw new IllegalStateException("SharedSaxonProcessor has not been initialized. Call initialize() first.");
-        }
-        return sharedProcessor.newDocumentBuilder();
-    }
+    public static DocumentBuilder getDocumentBuilder() {
+        return getProcessor().newDocumentBuilder();
+    }
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/ListXslFunction.java (1)

77-80: Log actual number of processed items (not xdmValues.length).

After flattening, xdmValues.length no longer reflects processed items. Track and log paramCount.

-        String finalResponse = response.toString();
-        log.debug("Function {} processed {} parameters and returned response of length {}",
-                getFnName(), xdmValues.length, finalResponse.length());
-        return new XdmAtomicValue(finalResponse);
+        String finalResponse = response.toString();
+        log.debug("Function {} processed {} parameters and returned response of length {}",
+                getFnName(), paramCount, finalResponse.length());
+        return new XdmAtomicValue(finalResponse);
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeXslFunction.java (3)

64-68: Fix line-length violation in debug log.

This log line likely exceeds 120 chars and will trip Checkstyle. Wrap arguments across lines.

-        if (node == null) {
-            log.debug("Function {} returned null node for parameter '{}', returning empty sequence", getFnName(), val);
-            return XdmEmptySequence.getInstance();
-        }
+        if (node == null) {
+            log.debug(
+                "Function {} returned null node for parameter '{}', returning empty sequence",
+                getFnName(), val);
+            return XdmEmptySequence.getInstance();
+        }

72-76: Optional: use SharedSaxonProcessor.getDocumentBuilder() for consistency.

Semantics are identical (it creates a new builder per call) and centralizes the builder creation policy.

-            XdmNode xdmNode = SharedSaxonProcessor.getProcessor()
-                    .newDocumentBuilder()
-                    .build(new DOMSource(node));
+            XdmNode xdmNode = SharedSaxonProcessor.getDocumentBuilder()
+                    .build(new DOMSource(node));

75-76: Fix potential line-length issue in success log.

Wrap to satisfy 120-char limit.

-            log.debug("Function {} successfully processed parameter '{}' and returned node", getFnName(), val);
+            log.debug(
+                "Function {} successfully processed parameter '{}' and returned node",
+                getFnName(), val);
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these settings in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between c172bd7 and 80fbeee.

📒 Files selected for processing (4)
  • dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/SharedSaxonProcessor.java (1 hunks)
  • dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/ListXslFunction.java (3 hunks)
  • dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeListXslFunction.java (3 hunks)
  • dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeXslFunction.java (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeListXslFunction.java
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: dspace-cli / docker-build (linux/amd64, ubuntu-latest, true)
  • GitHub Check: dspace / docker-build (linux/amd64, ubuntu-latest, true)
  • GitHub Check: dspace-test / docker-build (linux/amd64, ubuntu-latest, true)
  • GitHub Check: Run Integration Tests
  • GitHub Check: Run Unit Tests

@Paurikova2 Paurikova2 requested a review from Copilot August 14, 2025 12:55

This comment was marked as outdated.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (4)
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/SharedSaxonProcessor.java (1)

27-29: Ensure safe publication: declare shared fields volatile

Getters are unsynchronized; without volatile, other threads may observe stale values. Mark these fields volatile to guarantee visibility after initialize() completes.

-    private static SaxonTransformerFactory saxonTransformerFactory;
-    private static Processor sharedProcessor;
+    private static volatile SaxonTransformerFactory saxonTransformerFactory;
+    private static volatile Processor sharedProcessor;
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeXslFunction.java (3)

71-75: Fix line-length violation by wrapping the debug log

Keep within the 120-char limit.

-            log.debug("Function {} returned null node for parameter '{}', returning empty sequence", getFnName(), val);
+            log.debug(
+                "Function {} returned null node for parameter '{}', returning empty sequence",
+                getFnName(),
+                val
+            );

12-27: Remove Bouncy Castle Arrays; use JDK Arrays and fix imports

BC Arrays isn’t appropriate here and introduces an unnecessary dependency. Switch to java.util.Arrays.

 import java.util.Objects;
+import java.util.Arrays;
 import javax.xml.transform.dom.DOMSource;
@@
-import org.bouncycastle.util.Arrays;

58-61: Harden argument guard and handle empty array

Handle empty-args safely and remove BC utility usage.

-        if (Objects.isNull(xdmValues) || Arrays.isNullOrContainsNull(xdmValues)) {
-            log.debug("Null or empty parameters passed to {}, returning empty sequence", getFnName());
+        if (xdmValues == null || xdmValues.length == 0
+                || Arrays.stream(xdmValues).anyMatch(Objects::isNull)) {
+            log.debug("Null or empty parameters passed to {}, returning empty sequence", getFnName());
             return XdmEmptySequence.getInstance();
         }
🧹 Nitpick comments (3)
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/SharedSaxonProcessor.java (1)

25-25: Make the utility class non-instantiable and final

Prevent accidental instantiation and subclassing of this static utility.

-public class SharedSaxonProcessor {
+public final class SharedSaxonProcessor {
+    private SharedSaxonProcessor() {
+        // utility class
+    }
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/ListXslFunction.java (2)

56-85: Count processed items accurately in logs

The current log uses xdmValues.length, which counts argument containers, not sequence items. Track itemCount to reflect the real number processed.

-        StringBuilder response = new StringBuilder();
+        StringBuilder response = new StringBuilder();
+        int itemCount = 0;
@@
-        for (XdmValue arg : xdmValues) {
-            if (arg == null || arg.size() == 0) continue;
+        for (XdmValue arg : xdmValues) {
+            if (arg == null || arg.size() == 0) continue;
@@
-                    if (result != null) {
-                        response.append(result);
-                    }
+                    if (result != null) {
+                        response.append(result);
+                    }
+                    itemCount++;
@@
-        String finalResponse = response.toString();
-        log.debug("Function {} processed {} parameters and returned response of length {}",
-                getFnName(), xdmValues.length, finalResponse.length());
+        String finalResponse = response.toString();
+        log.debug("Function {} processed {} items across {} argument value(s) and returned response of length {}",
+                getFnName(), itemCount, xdmValues.length, finalResponse.length());
         return new XdmAtomicValue(finalResponse);

39-49: Unify modifier order for consistency (“public final”)

Other classes in this PR use “public final”. Align this file for consistency.

-    final public QName getName() {
+    public final QName getName() {
@@
-    final public SequenceType getResultType() {
+    public final SequenceType getResultType() {
@@
-    final public SequenceType[] getArgumentTypes() {
+    public final SequenceType[] getArgumentTypes() {
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these settings in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 80fbeee and 2fbc99b.

📒 Files selected for processing (3)
  • dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/SharedSaxonProcessor.java (1 hunks)
  • dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/ListXslFunction.java (3 hunks)
  • dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeXslFunction.java (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeXslFunction.java (1)
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/SharedSaxonProcessor.java (1)
  • SharedSaxonProcessor (25-77)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: dspace-test / docker-build (linux/amd64, ubuntu-latest, true)
  • GitHub Check: dspace-cli / docker-build (linux/amd64, ubuntu-latest, true)
  • GitHub Check: dspace / docker-build (linux/amd64, ubuntu-latest, true)
  • GitHub Check: Run Integration Tests
  • GitHub Check: Run Unit Tests
🔇 Additional comments (4)
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/SharedSaxonProcessor.java (1)

39-50: LGTM: robust initialization guard and type validation

Null and type checks before caching the factory are correct; synchronized initialize() prevents concurrent init. Nice.

dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/ListXslFunction.java (1)

56-61: LGTM: hardened null/empty guard using JDK Arrays

The new guard correctly handles null, empty, and null elements without the BC dependency.

dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeXslFunction.java (2)

77-84: LGTM: per-call DocumentBuilder from shared Processor

Creating a new DocumentBuilder per call avoids thread-safety issues and uses the shared Processor correctly.


50-53: Argument type: restrict to a single optional string — please verify XSL callers

Only the first item is used; tightening the declared argument to a single optional string is correct, but it may be breaking if any stylesheet actually passes a multi-item sequence. I scanned the repo and found NodeXslFunction-derived calls only in the Lindat crosswalk — please confirm those expressions yield at most one item before changing the signature.

Files to check:

  • Change target:
    • dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeXslFunction.java
  • XSL callers (inspect these and ensure each argument produces ≤1 item):
    • dspace/config/crosswalks/oai/metadataFormats/lindat_cmdi.xsl
      • line 33: <xsl:variable name="uploaded_md" select="fn:getUploadedMetadata($handle)"/>
      • line 169: <xsl:copy-of select="fn:getAuthor(.)"/>
      • line 183: <xsl:copy-of select="fn:getFunding(.)"/>
      • line 187: <xsl:copy-of select="fn:getContact(doc:metadata/doc:element[@name='local']/doc:element[@name='contact']/doc:element[@name='person']/doc:element/doc:field[@name='value'])"/>
      • line 239: <xsl:copy-of select="fn:getSize(.)"/>

Suggested diff (as originally proposed):

-        return new SequenceType[]{
-                SequenceType.makeSequenceType(ItemType.STRING, OccurrenceIndicator.ZERO_OR_MORE)
-        };
+        return new SequenceType[]{
+                SequenceType.makeSequenceType(ItemType.STRING, OccurrenceIndicator.ZERO_OR_ONE)
+        };

Action required: verify the Lindat selectors above cannot return multiple nodes. If any can, either (a) keep ZERO_OR_MORE, (b) change the XSL to pass a single item (e.g., use [1] or string(...)), or (c) update the function implementation to explicitly handle multi-item input while keeping the declared type.

@Paurikova2 Paurikova2 requested a review from Copilot August 14, 2025 13:05
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR addresses Saxon configuration issues in the DSpace OAI module by implementing a unified Saxon processor approach to ensure consistent XSLT processing and extension function compatibility. The changes improve error handling, thread safety, and debugging capabilities while updating the deployment script to version 7.6.5.

  • Introduced a shared Saxon processor singleton to avoid configuration conflicts
  • Enhanced error handling and logging in XSLT extension functions with more robust null parameter handling
  • Updated build script to deploy OAI module version 7.6.5

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
scripts/fast-build/oai-pmh-package-update.bat Updates JAR version from 7.6.1 to 7.6.5 for deployment
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/SharedSaxonProcessor.java New utility class providing singleton Saxon processor for configuration consistency
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/DSpaceResourceResolver.java Modified to use shared Saxon processor for extension function registration
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeXslFunction.java Enhanced error handling and logging, uses shared processor for thread-safe document building
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeListXslFunction.java Simplified implementation with better error handling and stream-based list processing
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/ListXslFunction.java Improved parameter processing and error handling with enhanced logging

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeXslFunction.java (1)

21-27: Replace BouncyCastle Arrays usages with JDK Arrays and explicitly handle empty args

rg shows org.bouncycastle.util.Arrays is imported in multiple functions — replace the import and harden the null/empty checks to avoid using exceptions for control flow.

Files to update:

  • dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeXslFunction.java
  • dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/StringXSLFunction.java
  • dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeListXslFunction.java
  • dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/LogMissingFn.java
  • dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/LogMissingMsgFn.java

Apply this change in each file (adjust variable names if different):

@@
-import org.bouncycastle.util.Arrays;
+import java.util.Arrays;
@@
-        if (Objects.isNull(xdmValues) || Arrays.isNullOrContainsNull(xdmValues)) {
+        if (xdmValues == null || xdmValues.length == 0
+                || Arrays.stream(xdmValues).anyMatch(Objects::isNull)) {
             log.debug("Null or empty parameters passed to {}, returning empty sequence", getFnName());
             return XdmEmptySequence.getInstance();
         }

After updating all occurrences, re-run the grep to confirm no remaining org.bouncycastle.util.Arrays imports before removing the dependency from the build.

🧹 Nitpick comments (5)
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/ListXslFunction.java (1)

83-87: Fix misleading “processed N parameters” metric in debug log

The log currently reports xdmValues.length, which is the number of argument slots (likely 1 for a string* signature), not the number of items actually processed. Track the real item count instead.

Apply this diff:

-        String finalResponse = response.toString();
-        log.debug("Function {} processed {} parameters and returned response of length {}",
-                getFnName(), xdmValues.length, finalResponse.length());
+        String finalResponse = response.toString();
+        // Log the actual number of items processed across all sequences
+        log.debug(
+                "Function {} processed {} item(s) and returned response of length {}",
+                getFnName(), response.length() == 0 ? 0 : finalResponse.split("", -1).length - 1, finalResponse.length());
+        // If you prefer precise counting, introduce a paramCount variable and increment it per item.

If you want precise counting without the string-split hack, introduce a dedicated counter:

-        StringBuilder response = new StringBuilder();
+        StringBuilder response = new StringBuilder();
+        int paramCount = 0;
@@
-                    if (result != null) {
+                    if (result != null) {
                         response.append(result);
                     }
+                    paramCount++;
@@
-        log.debug("Function {} processed {} parameters and returned response of length {}",
-                getFnName(), xdmValues.length, finalResponse.length());
+        log.debug(
+                "Function {} processed {} item(s) and returned response of length {}",
+                getFnName(), paramCount, finalResponse.length());
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/SharedSaxonProcessor.java (2)

15-24: Javadoc mentions DocumentBuilder, but the class doesn’t expose one

The comments reference DocumentBuilder instances being created per call, but this class only exposes Processor and TransformerFactory. Update the wording to avoid confusion.

Apply this diff:

- * This class maintains a singleton Processor and TransformerFactory.
- * Synchronization ensures thread-safety if multiple threads attempt initialization concurrently.
- * DocumentBuilder instances are created per call because they are not thread-safe.
+ * This class maintains a singleton Processor and TransformerFactory.
+ * Synchronization ensures thread-safety if multiple threads attempt initialization concurrently.
+ * Downstream code should create DocumentBuilder instances per call (they are not thread-safe).

42-53: Improve error messages with actual type info

When rejecting a non-Saxon TransformerFactory (or null), include the runtime type in the message to aid diagnostics.

Apply this diff:

-        if (saxonTransformerFactory == null) {
+        if (saxonTransformerFactory == null) {
             if (transformerFactory == null) {
-                throw new IllegalArgumentException("TransformerFactory cannot be null");
+                throw new IllegalArgumentException("TransformerFactory cannot be null");
             }
             if (!(transformerFactory instanceof SaxonTransformerFactory)) {
-                throw new IllegalArgumentException("TransformerFactory must be an instance of SaxonTransformerFactory");
+                throw new IllegalArgumentException(
+                    "TransformerFactory must be an instance of SaxonTransformerFactory, but was: "
+                        + transformerFactory.getClass().getName());
             }
             saxonTransformerFactory = (SaxonTransformerFactory) transformerFactory;
             sharedProcessor = saxonTransformerFactory.getProcessor();
         }
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeXslFunction.java (2)

71-75: Split long debug log for style and readability

This single-line debug message is lengthy; split across lines to avoid exceeding typical 120-char limits and improve readability.

-        if (node == null) {
-            log.debug("Function {} returned null node for parameter '{}', returning empty sequence", getFnName(), val);
-            return XdmEmptySequence.getInstance();
-        }
+        if (node == null) {
+            log.debug(
+                "Function {} returned null node for parameter '{}', returning empty sequence",
+                getFnName(), val);
+            return XdmEmptySequence.getInstance();
+        }

85-88: Log the Throwable directly (optional simplification)

You currently log the error message at ERROR and the stack trace at DEBUG. If acceptable, simplify by logging once with the Throwable at ERROR (or keep both if you prefer the current separation).

Option A (single error with stack trace):

-            log.error("Error in function {} processing parameter '{}': {}", getFnName(), val, e.getMessage());
-            log.debug("Full stack trace for function {} error:", getFnName(), e);
+            log.error("Error in function {} processing parameter '{}'", getFnName(), val, e);

Option B (keep ERROR message, but ensure clear DEBUG stack line):

-            log.error("Error in function {} processing parameter '{}': {}", getFnName(), val, e.getMessage());
-            log.debug("Full stack trace for function {} error:", getFnName(), e);
+            log.error("Error in function {} processing parameter '{}': {}", getFnName(), val, e.getMessage());
+            log.debug("Full stack trace for function {} error", getFnName(), e);
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these settings in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between d914148 and 9d33d55.

📒 Files selected for processing (3)
  • dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/SharedSaxonProcessor.java (1 hunks)
  • dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/ListXslFunction.java (3 hunks)
  • dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeXslFunction.java (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/NodeXslFunction.java (1)
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/SharedSaxonProcessor.java (1)
  • SharedSaxonProcessor (25-80)
🔇 Additional comments (2)
dspace-oai/src/main/java/org/dspace/xoai/services/impl/resources/functions/ListXslFunction.java (2)

56-61: Robust JDK-based guard looks good

Good switch to JDK Arrays/Objects with a correct short-circuit null/empty check before streaming. Prevents NPEs and removes the BC dependency.


65-81: Correctly iterates sequence items within each XdmValue

The nested iteration fixes the “first item only” bug and safely skips null/empty args while continuing on per-item exceptions. Good resilience.

@milanmajchrak milanmajchrak merged commit e338cda into dtq-dev Aug 18, 2025
11 of 12 checks passed
@Paurikova2 Paurikova2 requested a review from vidiecan September 2, 2025 09:15
@Paurikova2
Copy link
Collaborator Author

Paurikova2 commented Sep 9, 2025

@vidiecan
Problem:
In the OAI-PMH subsystem, the NodeXslFunction class was creating XdmNodes using new Processor instances:

DocumentBuilder db = new Processor(false).newDocumentBuilder();
var res = db.build(new DOMSource(node));
return res;

Each Processor has its own Configuration.
The main XSLT transformer used a different Processor, so Saxon rejected the nodes:
Node returned by extension function was created with an incompatible Configuration

Why it worked before:
In DSpace 7.6.1 (Saxon 9.8.0-14), Saxon tolerated nodes created by different Processor instances. After upgrading to DSpace 7.6.5 (Saxon 9.9.1-8), Saxon enforces strict checks, causing this error.

https://www.saxonica.com/html/documentation12/changes/v9.9/s9api.html

milanmajchrak pushed a commit that referenced this pull request Sep 10, 2025
… module (#1041)

* used static, shared isntance of the sazon processor and document builder

* fix problem xith saxon, works

* remvoed updated and stashed notes

* removed synchronization, checkstyle

* more readable code

* fix array usagage, checkstyle

* added volatile to shared sazon processor

* checkstyle violations, created final class from saxon

* fix doc for saxon, check if array exists before if it contains values
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants