-
Notifications
You must be signed in to change notification settings - Fork 0
Feat: Implementations now pull schemas #7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WalkthroughSchema management moved from embedded/local files to configurable remote ZIP sources; Java and Rust loaders now fetch and cache schemas at startup. CI workflows switched to implementation-driven versioning and dynamic changelog generation. Many local JSON schema files and Changes
Sequence DiagramsequenceDiagram
participant App as Application
participant Loader as SchemaLoader
participant Remote as Remote Source (GitHub ZIP)
participant Cache as Schema Cache
participant FS as Filesystem (Java only)
App->>Loader: new(schemaRoot, domain, version)
Loader->>Loader: loadSourcesConfig()
Loader->>Remote: fetch ZIP(s)
Remote-->>Loader: ZIP stream(s)
Loader->>Loader: parse ZIP entries (in-memory)
Loader->>Cache: populate with JSON schemas
Loader-->>App: ready
App->>Loader: loadSchema(category, name)
alt Cache hit
Loader->>Cache: lookup
Cache-->>Loader: schema
Loader-->>App: schema
else Cache miss (Java)
Loader->>FS: attempt filesystem read
alt File found
FS-->>Loader: schema
Loader-->>App: schema
else
Loader-->>App: null
end
else Cache miss (Rust)
Loader->>Loader: panic / fail fast
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~70 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 11
🤖 Fix all issues with AI agents
In @.github/workflows/ci-java.yml:
- Around line 64-66: The mv commands reference a nonexistent step output
(steps.version.outputs.impl_version) causing empty substitutions; update those
expressions to use the correct step id (steps.impl.outputs.impl_version) or
simply remove the entire rename block since the combined-version build was
removed and the mv operations are no-ops (renaming files to the same name).
Locate the three mv lines (and duplicates later around lines referenced in the
review) that move target/pacts-${{ steps.impl.outputs.impl_version }}*.jar and
either replace steps.version with steps.impl in all occurrences or delete the mv
block entirely to avoid failing the workflow.
In @.github/workflows/ci-rust.yml:
- Around line 56-71: The workflow uses the wrong step id: references to
steps.version.outputs.impl_version should be steps.impl.outputs.impl_version;
update all occurrences of steps.version.outputs.impl_version (used in tag_name,
name, body, and files for libpacts artifact) to steps.impl.outputs.impl_version
so the GitHub Action reads the implementation version from the actual step id
`impl`.
In `@java/src/main/java/net/hydrius/pacts/core/SchemaLoader.java`:
- Around line 106-141: The hand-rolled YAML parsing in loadRemoteSchemas
incorrectly treats every line starting with "- " as a source; replace it with a
proper YAML parse using SnakeYAML or Jackson's YAML module to load
application.yml into a JsonNode/Map and then safely extract the "sources" array
(use objectMapper.convertValue or similar to turn the parsed YAML into
JsonNode), updating the ObjectNode settings only with the actual sources node;
ensure you reference loadRemoteSchemas, settings, objectMapper, and the
"sources" key when locating the change and add appropriate null-checks and
exception logging for malformed or missing "sources".
- Around line 150-152: The HttpURLConnection in SchemaLoader's constructor
(where URL url = new URL(source); HttpURLConnection connection =
(HttpURLConnection) url.openConnection(); connection.setRequestMethod("GET");)
lacks connect/read timeouts and can block indefinitely; set reasonable timeouts
(e.g., connection.setConnectTimeout(...) and connection.setReadTimeout(...))
before connecting (or before any read) to prevent startup hangs—prefer using
sensible defaults or pulling values from config if available and ensure these
calls occur on the same HttpURLConnection instance created in the constructor.
- Around line 148-214: The ZIP/HTTP streams and connection leak on exceptions in
the schema-loading loop; wrap the URL/connection/input streams in
try-with-resources so InputStream `in` and `ZipInputStream zipIn` are closed
automatically (e.g., create the connection, then use try (InputStream in =
connection.getInputStream(); ZipInputStream zipIn = new ZipInputStream(in)) {
... } ), move the per-entry processing inside that block, and ensure the
HttpURLConnection `connection` is disconnected in a finally block (or call
connection.disconnect() after the try-with-resources) so `connection`, `in`, and
`zipIn` are always cleaned up even on ZipException/IOException in the code that
reads entries and populates `cache`.
In `@java/src/test/java/net/hydrius/pacts/core/SchemaLoaderTest.java`:
- Around line 11-19: The test currently invokes SchemaLoader(...) which calls
loadRemoteSchemas() and performs real network I/O on every `@BeforeEach` run under
`@TestInstance`(PER_CLASS); to fix, refactor so tests avoid network by either (A)
changing the test to use `@BeforeAll` instead of `@BeforeEach` if you intend
PER_CLASS, or (B) instantiate SchemaLoader with a test-friendly
constructor/parameter or dependency injection (e.g., a Supplier/Provider or
boolean skipRemote flag) so the constructor does not call loadRemoteSchemas(),
or (C) create a test subclass or mock of SchemaLoader that overrides
loadRemoteSchemas() to no-op; update SchemaLoader's constructors (and the test
setup method) accordingly and ensure tests call the non-networking constructor
or override method instead of triggering real downloads.
In `@java/src/test/resources/application.yml`:
- Around line 9-10: Tests currently depend on a live GitHub URL in
application.yml which causes flaky/failed tests offline or when remote is
unavailable; update tests to be hermetic by either (a) bundling a small test
schema into test resources and point the test config to a local file URL instead
of "https://..." or (b) mock/stub the network fetch in SchemaLoader (e.g., stub
the method that fetches the ZIP or inject a test SchemaLoader implementation) so
tests use the packaged test schema; change the test application.yml value and/or
add a test-only SchemaLoader mock and ensure test setup uses that mock.
In `@rust/build.rs`:
- Around line 14-19: The code currently walks three parent() levels from OUT_DIR
to compute target_dir which is fragile; stop traversing parents and use OUT_DIR
directly (the out_dir variable from env::var("OUT_DIR")). Replace any use of the
computed target_dir with out_dir and write build artifacts into OUT_DIR, and
update runtime usage to load those artifacts via include_str! or
env!("OUT_DIR")/concat! as appropriate so the runtime reads the file from the
canonical build-script output location instead of relying on Cargo's internal
layout.
In `@rust/Cargo.toml`:
- Line 15: Update the zip dependency in Cargo.toml from version "0.6" to "7.4.0"
and then audit usages of the crate (look for ZipWriter, ZipArchive, ZipOptions
and any functions that construct/read archives) to accommodate breaking changes
in 7.x: change any removed or renamed method calls, adjust error types and
Result handling, and update feature names if necessary (ensure the equivalent
compression feature to "deflate" is enabled); run cargo build/tests and fix API
mismatches accordingly.
In `@rust/src/core/schema_loader.rs`:
- Around line 119-167: The load_sources_config function currently opens a
hardcoded "rust/resources/sources.yaml" and parses it line-by-line; instead,
embed or properly locate the YAML and use a real YAML parser: replace the manual
file-open + line parsing in load_sources_config with one of two approaches — (A)
embed the config at compile time via include_str!("resources/sources.yaml") (or
include_bytes!) and parse that string using serde_yaml::from_str to deserialize
into Vec<String>, or (B) locate the runtime copy from build.rs by reading the
path from OUT_DIR (or a well-known config path) and then parse with
serde_yaml::from_reader; remove the fragile ad-hoc trimming logic and ensure
errors propagate (return Err) instead of silently falling back to the hardcoded
URL so callers can fail fast if configuration is invalid.
- Around line 86-110: In load_remote_schemas, currently a successful return from
load_schemas_from_zip_url sets sources_loaded = true even if no schemas were
actually added; change the logic so after Ok(_) you verify the schema cache is
non-empty (inspect whatever in-memory store is used by the loader, e.g.,
self.schemas or self.cache) and only set sources_loaded = true and break when at
least one .json schema was loaded; if the cache remains empty, treat that Ok as
a failure (log and continue) so partial/empty loads do not silently mark success
for load_schemas_from_zip_url and load_remote_schemas.
🧹 Nitpick comments (8)
rust/resources/sources.yaml (1)
1-2: Consider pinning to a specific tag or commit instead ofmain.Fetching from
refs/heads/mainmeans any push to the Schemas repo immediately affects all consumers. A tagged release or commit SHA would give you reproducible builds and explicit upgrade control.java/src/main/resources/application.yml (1)
5-6: Same pinning concern as the Rustsources.yaml.Using
refs/heads/mainmakes builds non-reproducible. Consider pinning to a tagged release. Additionally,sourcesas a top-level Spring config key is quite generic — a namespaced key likepacts.sourceswould reduce the risk of collisions with other Spring configuration.java/src/test/java/net/hydrius/pacts/impl/PactsServiceTest.java (1)
23-27: Test setup performs live network I/O on every test method.
@BeforeEachcombined with@TestInstance(PER_CLASS)meanssetup()runs before each@Test. If theSchemaLoaderconstructor fetches remote schemas, this issues a network call per test. Consider using@BeforeAll(alreadyPER_CLASSlifecycle) to initialize once, or caching/mocking the loader.rust/Cargo.toml (1)
17-19: Commented-out dependency left in the manifest.If
tempfileisn't needed now, remove it entirely rather than leaving commented-out code. It can be re-added when needed — version control preserves the history.rust/build.rs (1)
8-8: Unused variableproject_root.
project_rootis computed on line 8 but never referenced. This is dead code.Proposed fix
let manifest_dir = env::var("CARGO_MANIFEST_DIR").expect("CARGO_MANIFEST_DIR not set"); - let project_root = Path::new(&manifest_dir).parent().expect("Failed to get project root");rust/src/core/schema_loader.rs (1)
176-188: Shelling out tocurlis fragile and non-portable.
Command::new("curl")requirescurlto be installed on the host, is not available on all platforms (notably vanilla Windows), and bypasses Rust's HTTP ecosystem (connection pooling, TLS configuration, proper error types). Consider using a lightweight HTTP client crate likeureq(blocking, minimal dependencies) orreqwest(blocking feature).Additionally, if the
curlprocess is killed or the system is under memory pressure,output()captures all ofstdoutinto memory with no size bound — a malicious or misconfigured URL could return an arbitrarily large response.java/src/main/java/net/hydrius/pacts/core/SchemaLoader.java (1)
106-106:loadRemoteSchemas()ispublic— consider restricting visibility.This method mutates internal cache state and is called automatically from the constructor. Exposing it publicly allows callers to re-trigger remote loading at any time, potentially introducing concurrency issues or inconsistent state. If external re-loading is not a requirement, reduce visibility to
privateor package-private.java/pom.xml (1)
71-75:maven-surefire-plugin3.0.0-M7 is a pre-release milestone from June 2022.Consider upgrading to a stable release. The latest stable version is 3.5.4 (September 2025).
| sources: | ||
| - "https://github.com/Project-Hydrius/Schemas/archive/refs/heads/main.zip" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests depend on a live network call to GitHub.
If SchemaLoader fetches from this URL during test setup, tests will fail when offline, rate-limited, or if the remote repo is unavailable. Consider providing a test-specific mock/stub or bundling a small test schema locally for test isolation.
🤖 Prompt for AI Agents
In `@java/src/test/resources/application.yml` around lines 9 - 10, Tests currently
depend on a live GitHub URL in application.yml which causes flaky/failed tests
offline or when remote is unavailable; update tests to be hermetic by either (a)
bundling a small test schema into test resources and point the test config to a
local file URL instead of "https://..." or (b) mock/stub the network fetch in
SchemaLoader (e.g., stub the method that fetches the ZIP or inject a test
SchemaLoader implementation) so tests use the packaged test schema; change the
test application.yml value and/or add a test-only SchemaLoader mock and ensure
test setup uses that mock.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
16 issues found across 35 files
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="java/src/main/java/net/hydrius/pacts/core/SchemaLoader.java">
<violation number="1" location="java/src/main/java/net/hydrius/pacts/core/SchemaLoader.java:63">
P1: Constructor unconditionally calls `loadRemoteSchemas()`, which throws `IOException` if remote sources are unavailable. This makes `SchemaLoader` impossible to instantiate for local/file-system-only usage and breaks resilience when the remote server is unreachable. Consider making remote loading optional (e.g., only attempt if `application.yml` has sources configured, and log a warning instead of throwing).</violation>
<violation number="2" location="java/src/main/java/net/hydrius/pacts/core/SchemaLoader.java:92">
P2: `loadSchema` now silently returns `null` instead of throwing `IOException` on failure. Callers that relied on the exception contract may now get unexpected `NullPointerException`s. Consider throwing an exception for missing schemas, or at minimum documenting the null return clearly so callers add null checks.</violation>
<violation number="3" location="java/src/main/java/net/hydrius/pacts/core/SchemaLoader.java:126">
P1: Naive YAML parsing: every line starting with `"- "` in the entire file is treated as a source URL, not just those under the `sources:` key. This will break if the YAML file has any other list-typed values. Consider using a proper YAML parser (e.g., SnakeYAML, which is already common in Java projects) or at least track whether the current line is within the `sources:` block.</violation>
<violation number="4" location="java/src/main/java/net/hydrius/pacts/core/SchemaLoader.java:151">
P1: No connect/read timeouts configured on the HTTP connection. This can cause the application to hang indefinitely during startup if the remote server is slow or unresponsive. Add `setConnectTimeout()` and `setReadTimeout()` calls.</violation>
<violation number="5" location="java/src/main/java/net/hydrius/pacts/core/SchemaLoader.java:153">
P1: Resource leak: `HttpURLConnection`, `InputStream`, and `ZipInputStream` are not closed when an exception is caught. Use try-with-resources to ensure all streams and connections are properly closed on both success and failure paths. Additionally, `connection.disconnect()` is never called.</violation>
</file>
<file name="java/src/main/resources/application.yml">
<violation number="1" location="java/src/main/resources/application.yml:6">
P2: Avoid pulling schemas from a mutable branch head. Pin the schema source to a specific tag or commit to keep builds reproducible and prevent unexpected schema changes.</violation>
</file>
<file name=".github/workflows/ci-java.yml">
<violation number="1" location=".github/workflows/ci-java.yml:64">
P1: `steps.version` no longer exists after removing the "Create Combined Version" step, so all references to `steps.version.outputs.impl_version` will resolve to empty and break artifact renames and release metadata. Use `steps.impl.outputs.impl_version` directly (or reintroduce a `version` step).</violation>
</file>
<file name="rust/resources/sources.yaml">
<violation number="1" location="rust/resources/sources.yaml:2">
P2: Pin schema sources to a tag or commit SHA instead of the moving `main` branch to keep builds reproducible and avoid unexpected schema changes.</violation>
</file>
<file name="java/src/test/resources/application.yml">
<violation number="1" location="java/src/test/resources/application.yml:10">
P2: Pin the schema source to an immutable tag or commit so test runs are reproducible and don’t change when the upstream main branch moves.</violation>
</file>
<file name="rust/src/core/schema_loader.rs">
<violation number="1" location="rust/src/core/schema_loader.rs:27">
P2: The doc comment for `schema_root` says "kept for API compatibility (not used)", but the constructor still panics if it's empty. This is contradictory — if the parameter is unused, the validation should either be removed, or the doc should be updated to explain why a non-empty value is still required.</violation>
<violation number="2" location="rust/src/core/schema_loader.rs:121">
P1: Hardcoded relative path `rust/resources/sources.yaml` depends on the working directory being the repo root. In production or CI environments, the binary will typically be run from a different directory, causing the config to silently not be found and falling back to the hardcoded default URL. Consider using an environment variable, embedding the config at compile time, or resolving relative to the binary's location.</violation>
<violation number="3" location="rust/src/core/schema_loader.rs:128">
P2: Hand-rolled YAML parsing is fragile and will break on many valid YAML constructs. Use the `serde_yaml` crate with a proper deserialization struct instead. For example:
```rust
#[derive(Deserialize)]
struct SourcesConfig {
sources: Vec<String>,
}
let config: SourcesConfig = serde_yaml::from_reader(file)?;
This handles quoting, comments, edge cases, and is far more maintainable.
P2: JSON files with fewer than 5 path components in the ZIP are silently skipped. If the ZIP structure changes, schemas will be missing from cache with no diagnostic information. Add an `eprintln!` warning (or proper log) for skipped files so that debugging failed schema lookups is possible. P3: The PR description says the version bump is minor, but the Cargo.toml change is a patch bump (0.0.4 → 0.0.5). Please confirm the intended release level and align the version or update the PR description to avoid ambiguity.(Based on your team's feedback about handling version/description mismatches.) [FEEDBACK_USED]
(Based on your team's feedback about handling PR description/version mismatches without assuming which is correct.) [FEEDBACK_USED]
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
| logger.warning("Failed to load schema from file system: " + e.getMessage()); | ||
| } | ||
|
|
||
| return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P2: loadSchema now silently returns null instead of throwing IOException on failure. Callers that relied on the exception contract may now get unexpected NullPointerExceptions. Consider throwing an exception for missing schemas, or at minimum documenting the null return clearly so callers add null checks.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At java/src/main/java/net/hydrius/pacts/core/SchemaLoader.java, line 92:
<comment>`loadSchema` now silently returns `null` instead of throwing `IOException` on failure. Callers that relied on the exception contract may now get unexpected `NullPointerException`s. Consider throwing an exception for missing schemas, or at minimum documenting the null return clearly so callers add null checks.</comment>
<file context>
@@ -44,91 +59,164 @@ public SchemaLoader(String schemaRoot, String domain, String version) throws Ill
+ logger.warning("Failed to load schema from file system: " + e.getMessage());
}
+
+ return null;
}
</file context>
| [package] | ||
| name = "pacts" | ||
| version = "0.0.4" | ||
| version = "0.0.5" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P3: The PR description says the version bump is minor, but the Cargo.toml change is a patch bump (0.0.4 → 0.0.5). Please confirm the intended release level and align the version or update the PR description to avoid ambiguity.
(Based on your team's feedback about handling version/description mismatches.)
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At rust/Cargo.toml, line 3:
<comment>The PR description says the version bump is minor, but the Cargo.toml change is a patch bump (0.0.4 → 0.0.5). Please confirm the intended release level and align the version or update the PR description to avoid ambiguity.
(Based on your team's feedback about handling version/description mismatches.) </comment>
<file context>
@@ -1,8 +1,7 @@
[package]
name = "pacts"
-version = "0.0.4"
+version = "0.0.5"
edition = "2021"
-build = "build.rs"
</file context>
| <groupId>net.hydrius</groupId> | ||
| <artifactId>pacts</artifactId> | ||
| <version>1.0.0</version> | ||
| <version>1.0.1</version> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P3: The PR description says this is a minor version bump, but the pom.xml change is a patch bump (1.0.1). Please confirm intent and update either the version or the PR description to avoid release confusion.
(Based on your team's feedback about handling PR description/version mismatches without assuming which is correct.)
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At java/pom.xml, line 8:
<comment>The PR description says this is a minor version bump, but the pom.xml change is a patch bump (1.0.1). Please confirm intent and update either the version or the PR description to avoid release confusion.
(Based on your team's feedback about handling PR description/version mismatches without assuming which is correct.) </comment>
<file context>
@@ -5,7 +5,7 @@
<groupId>net.hydrius</groupId>
<artifactId>pacts</artifactId>
- <version>1.0.0</version>
+ <version>1.0.1</version>
<packaging>jar</packaging>
</file context>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
rust/src/core/schema_loader.rs (1)
230-232:⚠️ Potential issue | 🟡 Minor
replace("v", "")removes all occurrences ofv, not just the prefix.For a version like
"v1"this works, but any version string containingvelsewhere (e.g., hypothetically"v12dev3") would be mangled. Usetrim_start_matchesfor correctness.🐛 Proposed fix
pub fn get_parsed_version(&self) -> i32 { - self.version.replace("v", "").parse().unwrap_or(1) + self.version.trim_start_matches('v').parse().unwrap_or(1) }
🤖 Fix all issues with AI agents
In `@java/pom.xml`:
- Around line 71-75: Update the Maven Surefire plugin declaration to use the
latest stable release instead of the pre-release milestone; locate the plugin
block for org.apache.maven.plugins:maven-surefire-plugin (artifactId
maven-surefire-plugin) and change its <version> from 3.0.0-M7 to 3.5.4, then run
a quick mvn test to verify compatibility and adjust any configuration if new
defaults changed.
In `@rust/src/core/schema_loader.rs`:
- Around line 140-152: The load_schemas_from_zip_url function currently shells
out to curl; replace that with a Rust HTTP client (use ureq for blocking
simplicity or reqwest blocking) by adding the dependency and using e.g.
ureq::get(url).call() (or reqwest::blocking::get) to fetch the bytes, check the
HTTP status and return an error on non-success, then use the response
bytes/reader in place of Command::output() output handling; update error
messages to include the HTTP status/body when available and remove the curl
Command usage. Ensure you preserve existing behavior (redirects/timeouts by
configuring the client) and adjust imports and Result error conversions
accordingly.
🧹 Nitpick comments (11)
.github/workflows/ci-rust.yml (1)
50-62: Changelog generation looks solid, minor edge-case note.The heredoc-with-random-delimiter approach for multiline output is good practice for GitHub Actions. The tag-existence check and fallback to last 20 commits is reasonable.
One thing to be aware of: if the tag
rust-v<version>already exists (e.g., a re-run without bumping the version), the changelog will be empty ($TAG..HEADwith no new commits), and the release step may fail or create a duplicate. Consider guarding against an empty$LOGor adding anif:condition on the release step to skip when there are no new changes.rust/src/core/schema_loader.rs (4)
66-78:load_schemapanics on cache miss — consider returningResultinstead.Panicking at runtime for a missing schema is harsh, especially since the caller (
Validator) may want to handle a missing schema gracefully. ReturningResult<Value>(orOption<Value>) would be more idiomatic Rust and give callers the choice of how to handle the error.That said, if this is an intentional "fail-fast" design decision (schemas must always be pre-loaded), this is acceptable — but it should be documented as a contract at the call sites too.
186-189: Minor: avoid an unnecessarycloneby reordering the log and insert.
insertconsumes the key, so the clone exists only to keepcache_keyalive for theeprintln!. Swapping the order eliminates the allocation.♻️ Proposed fix
- let cache_key = format!("{}/{}/{}/{}", entry_domain, entry_version, entry_category, schema_name); - self.schema_cache.insert(cache_key.clone(), schema); - eprintln!("Loaded schema into cache: {}", cache_key); + let cache_key = format!("{}/{}/{}/{}", entry_domain, entry_version, entry_category, schema_name); + eprintln!("Loaded schema into cache: {}", cache_key); + self.schema_cache.insert(cache_key, schema);
160-192: ZIP entry path parsing is brittle — silently skips entries that don't match the expected depth.Lines 177–190 require at least 5 path components and index from the end. If the ZIP structure changes (e.g., no top-level directory, or additional nesting), entries are silently ignored. Consider logging a warning for skipped
.jsonentries so mismatches are observable.♻️ Suggested improvement
if path_parts.len() >= 5 { // ... existing logic ... + } else { + eprintln!("Skipping JSON entry with unexpected path structure: {}", entry_path); }
37-53: Constructor panics on both validation and load failure — document or soften.
new()can panic in two places: empty field validation (Line 39) and.expect()on remote load (Line 50). Sinceservice.rscallsSchemaLoader::new(...)directly, any network failure at startup will crash the process with a bare panic message. Consider returningResult<Self>to let the caller decide on the error strategy (retry, fallback, graceful shutdown).If panicking is the intended contract, this is fine — just noting it propagates to
Service::newas well.java/src/main/java/net/hydrius/pacts/core/SchemaLoader.java (6)
33-34:HashMapcache is not thread-safe for concurrent reads/writes.If
SchemaLoaderis used as a singleton (e.g., a Spring bean),loadSchemacan be called from multiple request threads concurrently, each potentially reading and writing to thecache. A plainHashMapunder concurrent access can lead to corrupted state or infinite loops (on resize). Consider usingConcurrentHashMap.♻️ Suggested fix
-import java.util.HashMap; -import java.util.Map; +import java.util.Map; +import java.util.concurrent.ConcurrentHashMap;- private final Map<String, JsonNode> cache; + private final Map<String, JsonNode> cache;- this.cache = new HashMap<>(); + this.cache = new ConcurrentHashMap<>();Also applies to: 60-60
79-98: ReplacecontainsKey+getwith a singleget+ null-check.The
containsKeyfollowed bygetis two lookups and becomes a TOCTOU issue if the map is replaced withConcurrentHashMap. Use a singleget:♻️ Suggested fix
public JsonNode loadSchema(String category, String name) { String cacheKey = domain + "/" + version + "/" + category + "/" + name; - if (cache.containsKey(cacheKey)) { - return cache.get(cacheKey); + JsonNode cached = cache.get(cacheKey); + if (cached != null) { + return cached; }
112-141: Unnecessary intermediateObjectNodewrapping — usesourcesNodedirectly.The code parses
sourcesNodefrom YAML (line 124), wraps it into a newObjectNodecalledsettings(lines 128–129), then immediately unwraps it on line 139 withsettings.get("sources"). This round-trip adds complexity without benefit.♻️ Suggested simplification
private void loadRemoteSchemas() throws IOException { - JsonNode settings = null; + JsonNode sourcesArray = null; try (InputStream stream = getClass().getClassLoader().getResourceAsStream("application.yml")) { if (stream == null) { logger.warning("Application Settings not found"); } else { ObjectMapper yamlMapper = new ObjectMapper(new YAMLFactory()); JsonNode parsed = yamlMapper.readTree(stream); if (parsed == null || !parsed.has("sources")) { logger.warning("application.yml is missing 'sources' key"); } else { JsonNode sourcesNode = parsed.get("sources"); if (!sourcesNode.isArray()) { logger.warning("'sources' key in application.yml is not an array"); } else { - settings = objectMapper.createObjectNode(); - ((ObjectNode) settings).set("sources", objectMapper.convertValue(sourcesNode, JsonNode.class)); + sourcesArray = sourcesNode; } } } } catch (Exception e) { logger.warning(() -> "Failed to parse application settings: " + e.getMessage()); } boolean sourcesLoaded = false; - if (settings != null) { - JsonNode sourcesNode = settings.get("sources"); - for (JsonNode node : sourcesNode) { + if (sourcesArray != null) { + for (JsonNode node : sourcesArray) {
156-188: Silently discarding ZIP entries that don't match the expected path structure.When
pathParts.length < 3(line 178), the entry is silently skipped — no log, no warning. If the remote ZIP layout changes or has unexpected nesting, schemas will silently fail to load, making debugging difficult.♻️ Add a debug/warning log for skipped entries
String[] pathParts = categoryName.split("/"); if (pathParts.length >= 3) { // ... existing cache logic ... + } else { + logger.warning(() -> "Skipping ZIP entry with unexpected path structure: " + entryPath); }
157-167: Consider adding a cap on total decompressed bytes across all ZIP entries.The per-entry limit (
MAX_ENTRY_SIZE = 10 MB) is good, but a malicious or oversized ZIP could still contain thousands of entries just under the limit. A total decompressed size cap (e.g., 100 MB) would provide more robust protection against zip bombs.
145-145:new URL(String)is deprecated since Java 20.While the project targets Java 17, it's worth noting for forward compatibility. When upgrading, switch to
URI.create(source).toURL().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
rust/src/core/schema_loader.rs (1)
232-234:⚠️ Potential issue | 🟡 Minor
replace("v", "")strips all occurrences ofv, not just the prefix.For a version string like
"v12"this works, butreplaceis overly broad — e.g., a hypothetical"preview1"would become"preiew1". Usestrip_prefixfor correctness:🐛 Proposed fix
pub fn get_parsed_version(&self) -> i32 { - self.version.replace("v", "").parse().unwrap_or(1) + self.version + .strip_prefix('v') + .unwrap_or(&self.version) + .parse() + .unwrap_or(1) }
🤖 Fix all issues with AI agents
In `@java/src/main/java/net/hydrius/pacts/core/SchemaLoader.java`:
- Around line 25-29: The class Javadoc for SchemaLoader incorrectly states
remote ZIP loading is "available through a separate method" even though
loadRemoteSchemas() is private and invoked by the constructor; either change the
Javadoc to accurately say remote loading is handled internally by the private
loadRemoteSchemas() method and is auto-invoked by the constructor, or if the
intention is to expose remote loading to callers, change loadRemoteSchemas() to
public (and update its Javadoc) so the comment matches the API; update
references to SchemaLoader and loadRemoteSchemas() accordingly.
In `@rust/src/core/schema_loader.rs`:
- Around line 162-194: The loop currently uses `?` on `entry.read_to_string(&mut
content)?` and `serde_json::from_str(&content)?`, which aborts the whole
function on a single malformed entry; change per-entry error handling in the
`for` loop inside `load_schemas_from_zip_url` (the block that iterates `for i in
0..zip.len()` and uses `zip.by_index(i)`) to catch and log errors instead of
propagating them: replace the `?` calls with match/if-let that on Err logs the
error (including `entry.name()` or the index) and continues to the next entry,
so `self.schema_cache.insert(...)` is only reached for successfully
read-and-parsed JSON files and the loop completes even if some entries fail
(keep the existing cache-key creation using `entry_domain`, `entry_version`,
`entry_category`, `schema_name`).
🧹 Nitpick comments (6)
java/src/main/java/net/hydrius/pacts/core/SchemaLoader.java (4)
40-42: Consider adding a total ZIP size limit alongside the per-entry limit.
MAX_ENTRY_SIZEguards individual entries, but a ZIP with many entries each just under 10 MB could still exhaust heap. A cumulative byte counter across all entries with a reasonable cap (e.g., 50–100 MB) would provide stronger protection against ZIP bombs.Also applies to: 157-167
112-135: Simplify: usesourcesNodedirectly instead of wrapping in an intermediateObjectNode.The
settingsObjectNode (line 128–129) wrapssourcesNodeonly to unwrap it again at line 139. TheobjectMapper.convertValue(sourcesNode, JsonNode.class)call is also a no-op (JsonNode → JsonNode). You can simplify by storingsourcesNodedirectly in a local variable and iterating over it.♻️ Proposed simplification
- JsonNode settings = null; + JsonNode sourcesArray = null; try (InputStream stream = getClass().getClassLoader().getResourceAsStream("application.yml")) { if (stream == null) { logger.warning("Application Settings not found"); } else { ObjectMapper yamlMapper = new ObjectMapper(new YAMLFactory()); JsonNode parsed = yamlMapper.readTree(stream); if (parsed == null || !parsed.has("sources")) { logger.warning("application.yml is missing 'sources' key"); } else { JsonNode sourcesNode = parsed.get("sources"); if (!sourcesNode.isArray()) { logger.warning("'sources' key in application.yml is not an array"); } else { - settings = objectMapper.createObjectNode(); - ((ObjectNode) settings).set("sources", objectMapper.convertValue(sourcesNode, JsonNode.class)); + sourcesArray = sourcesNode; } } } } catch (Exception e) { logger.warning(() -> "Failed to parse application settings: " + e.getMessage()); } boolean sourcesLoaded = false; - if (settings != null) { - JsonNode sourcesNode = settings.get("sources"); - for (JsonNode node : sourcesNode) { + if (sourcesArray != null) { + for (JsonNode node : sourcesArray) {Also applies to: 137-141
30-34:HashMapcache is not thread-safe ifloadSchemais called from multiple threads.If
SchemaLoaderinstances are shared across threads (common in Spring-managed beans), concurrent reads/writes onHashMapcan cause data races. Consider usingConcurrentHashMapor documenting that the class is not thread-safe.♻️ Proposed fix
-import java.util.HashMap; -import java.util.Map; +import java.util.Map; +import java.util.concurrent.ConcurrentHashMap; ... - private final Map<String, JsonNode> cache; + private final Map<String, JsonNode> cache; ... - this.cache = new HashMap<>(); + this.cache = new ConcurrentHashMap<>();Also applies to: 79-99
53-70: Constructor performs blocking network I/O.Calling
loadRemoteSchemas()from the constructor makes object creation dependent on network availability and latency (up to 15s connect + 30s read per source). In a Spring context, this blocks bean initialization. Consider deferring remote loading to an explicitinit()method or using@PostConstruct/ lazy initialization so consumers can control when the network call happens.rust/src/core/schema_loader.rs (2)
154-155: No upper bound on response body size — a malicious or misconfigured source could exhaust memory.
read_to_endwill consume the entire HTTP response without limit. Consider capping the read (e.g., viatake()) or checkingContent-Lengthbefore reading:🛡️ Example: cap download size
+ const MAX_ZIP_SIZE: u64 = 50 * 1024 * 1024; // 50 MB + let mut bytes = Vec::new(); - response.body_mut().as_reader().read_to_end(&mut bytes)?; + response.body_mut().as_reader().take(MAX_ZIP_SIZE).read_to_end(&mut bytes)?; + + if bytes.len() as u64 >= MAX_ZIP_SIZE { + return Err(anyhow::anyhow!("ZIP from {} exceeds maximum allowed size", url)); + }
36-52: Constructor performs network I/O and panics on failure — this is difficult to test.
new()eagerly fetches remote ZIPs and panics (via.expect()) if loading fails. This makes unit testing and offline usage impossible. Consider a two-phase initialization (e.g., a fallibleSchemaLoader::load()→Result<Self>) or accepting pre-loaded schemas for testing. This is a design trade-off and acceptable if the current approach meets your needs, but worth keeping in mind.
| * SchemaLoader class that loads schemas that are bundled with Pacts. | ||
| * | ||
| * This class supports loading schemas from the file system or embedded resources. | ||
| * Remote ZIP loading functionality is available through a separate method. | ||
| */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Javadoc says "available through a separate method" but loadRemoteSchemas() is private.
The class-level doc (line 28) states remote loading is "available through a separate method," implying a public API, but the method is private and auto-invoked by the constructor. Consider updating the doc to reflect the actual behavior.
🤖 Prompt for AI Agents
In `@java/src/main/java/net/hydrius/pacts/core/SchemaLoader.java` around lines 25
- 29, The class Javadoc for SchemaLoader incorrectly states remote ZIP loading
is "available through a separate method" even though loadRemoteSchemas() is
private and invoked by the constructor; either change the Javadoc to accurately
say remote loading is handled internally by the private loadRemoteSchemas()
method and is auto-invoked by the constructor, or if the intention is to expose
remote loading to callers, change loadRemoteSchemas() to public (and update its
Javadoc) so the comment matches the API; update references to SchemaLoader and
loadRemoteSchemas() accordingly.
| for i in 0..zip.len() { | ||
| let mut entry = zip.by_index(i)?; | ||
|
|
||
| // Process only JSON files | ||
| if !entry.is_dir() && entry.name().ends_with(".json") { | ||
| // Read the entry content | ||
| let mut content = String::new(); | ||
| entry.read_to_string(&mut content)?; | ||
|
|
||
| // Parse the JSON schema | ||
| let schema: Value = serde_json::from_str(&content)?; | ||
|
|
||
| // Extract path information to create cache key | ||
| let entry_name = entry.name().to_string(); | ||
| let path_parts: Vec<&str> = entry_name.split('/').collect(); | ||
|
|
||
| // We expect paths like: Schemas-main/bees/v1/inventory/item.json | ||
| if path_parts.len() >= 5 { | ||
| let entry_domain = path_parts[path_parts.len() - 4]; | ||
| let entry_version = path_parts[path_parts.len() - 3]; | ||
| let entry_category = path_parts[path_parts.len() - 2]; | ||
| let file_name = path_parts[path_parts.len() - 1]; | ||
|
|
||
| let schema_name = file_name.trim_end_matches(".json"); | ||
|
|
||
| let cache_key = format!("{}/{}/{}/{}", entry_domain, entry_version, entry_category, schema_name); | ||
| self.schema_cache.insert(cache_key.clone(), schema); | ||
| eprintln!("Loaded schema into cache: {}", cache_key); | ||
| } else { | ||
| eprintln!("Skipping JSON file with unexpected path structure (< 5 components): {}", entry_name); | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A single malformed JSON entry aborts processing of the remaining ZIP entries and leaves the cache partially populated.
The ? on lines 169 and 172 propagates errors out of the for loop, so if entry N of M fails parsing, entries already inserted into schema_cache remain but entries N+1..M are never processed. load_schemas_from_zip_url then returns Err, and load_remote_schemas falls through to the next source — but the cache is already partially filled from the failed source. This creates inconsistent state.
Consider catching per-entry errors and logging them rather than aborting the entire ZIP:
♻️ Suggested change: skip malformed entries
for i in 0..zip.len() {
let mut entry = zip.by_index(i)?;
// Process only JSON files
if !entry.is_dir() && entry.name().ends_with(".json") {
// Read the entry content
let mut content = String::new();
- entry.read_to_string(&mut content)?;
-
- // Parse the JSON schema
- let schema: Value = serde_json::from_str(&content)?;
+ if let Err(e) = entry.read_to_string(&mut content) {
+ eprintln!("Failed to read ZIP entry {}: {}", entry.name(), e);
+ continue;
+ }
+
+ // Parse the JSON schema
+ let schema: Value = match serde_json::from_str(&content) {
+ Ok(v) => v,
+ Err(e) => {
+ eprintln!("Failed to parse JSON in {}: {}", entry.name(), e);
+ continue;
+ }
+ };🤖 Prompt for AI Agents
In `@rust/src/core/schema_loader.rs` around lines 162 - 194, The loop currently
uses `?` on `entry.read_to_string(&mut content)?` and
`serde_json::from_str(&content)?`, which aborts the whole function on a single
malformed entry; change per-entry error handling in the `for` loop inside
`load_schemas_from_zip_url` (the block that iterates `for i in 0..zip.len()` and
uses `zip.by_index(i)`) to catch and log errors instead of propagating them:
replace the `?` calls with match/if-let that on Err logs the error (including
`entry.name()` or the index) and continues to the next entry, so
`self.schema_cache.insert(...)` is only reached for successfully read-and-parsed
JSON files and the loop completes even if some entries fail (keep the existing
cache-key creation using `entry_domain`, `entry_version`, `entry_category`,
`schema_name`).
For the sake of less cohesion, schemas and the implementations have split repositories. With this change, implementations must pull schemas from sources as zip files. Version bump marked as patch, as this needs proper testing and improvements.
Summary by cubic
Implementations now load JSON schemas from remote ZIP sources configured in the app, replacing bundled schemas. CI builds publish packages using implementation versions only, and workflows now watch implementation paths.
New Features
Migration
Written for commit b105e30. Summary will update on new commits.
Summary by CodeRabbit
New Features
Refactor
Chores
Tests