fix(prefect-gcp): prevent double-nesting in GcsBucket._resolve_path by tomerqodo · Pull Request #18 · qodo-benchmark/prefect-combined-coderabbit

tomerqodo · 2026-01-22T05:21:32Z

Summary by CodeRabbit

Bug Fixes
- Resolved redundant prefixing of bucket folders in storage paths.
- Added validation to prevent bucket folder from matching the bucket name.
Tests
- Added regression test for path handling correctness.
Chores
- Updated test environment warning filters and timeout configuration.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

When storage_block_id is null (e.g., context serialized to Ray workers), create_result_record() adds bucket_folder to storage_key via _resolve_path. Then write_path() calls _resolve_path again, causing double-nested paths like "results/results/abc123" instead of "results/abc123". Add duplicate-prefix check to _resolve_path, matching the existing check in _join_bucket_folder. Fixes PrefectHQ#20174 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- fix .prefectignore content in test fixture (remove leading whitespace) - add filterwarnings to pyproject.toml to suppress upstream deprecation warnings 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

remove extra disable_logging fixture and temporary_settings wrapper to match pattern used by prefect-aws and other integrations 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

fixes ephemeral server startup timeouts on Python 3.11+ with xdist workers see PrefectHQ#16397 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

coderabbitai · 2026-01-22T05:22:00Z

Walkthrough

This PR adds cross-field validation to the GcsBucket class ensuring bucket_folder differs from bucket, implements a guard in _resolve_path to prevent duplicate path prefixing, updates test configuration by removing automatic logging disablement and adding server timeout settings, and introduces a regression test for the path nesting fix.

Changes

Cohort / File(s)	Summary
Cross-field validation and path resolution `src/integrations/prefect-gcp/prefect_gcp/cloud_storage.py`, `src/integrations/prefect-gcp/tests/test_cloud_storage.py`	Added `model_validator` import and cross-field validation to `bucket_folder` validator to enforce that bucket_folder ≠ bucket using info.data. Updated `_bucket_folder_suffix` method signature to accept `info` parameter. Introduced guard in `_resolve_path` to skip re-prefixing when bucket_folder already present in path. Added regression test `test_resolve_path_no_double_nesting` validating path resolution prevents double-nesting.
Test configuration updates `src/integrations/prefect-gcp/tests/conftest.py`, `src/integrations/prefect-gcp/pyproject.toml`	Removed `disable_logging` fixture and related imports. Updated `prefect_db` fixture to invoke `prefect_test_harness` with `server_timeout=60`. Replaced single DeprecationWarning ignore rule with two broader ignores targeting httplib2 and pathspec deprecation warnings.
Test file reformatting `src/integrations/prefect-gcp/tests/projects/test_steps.py`	Reformatted multi-line `.prefectignore` string to single-line format with newline separators; no functional change.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 A hop through validation's garden fair,
Where bucket_folder paths take extra care,
No double-nesting leads us astray,
Cross-field checks keep bugs at bay!
With tests to prove the guard works tight, ✨

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The pull request description is minimal and provides only a reference to a benchmark PR without substantial detail about the changes, testing, or rationale.	Expand the description to include an overview of proposed changes, explain the problem being fixed, note which tests were added, and ensure all checklist items are addressed.
Docstring Coverage	⚠️ Warning	Docstring coverage is 42.86% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and specifically describes the main change: preventing double-nesting in GcsBucket._resolve_path, which is the primary fix across the changeset.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 4

🤖 Fix all issues with AI agents

In `@src/integrations/prefect-gcp/prefect_gcp/cloud_storage.py`:
- Around line 726-735: The current check uses "self.bucket_folder in path" which
matches substrings anywhere; change it to a path-prefix-aware check: normalize
leading/trailing slashes and then test whether the path equals the bucket_folder
or startswith bucket_folder + "/" (e.g., normalized_path ==
normalized_bucket_folder or normalized_path.startswith(normalized_bucket_folder
+ "/")). Update the logic in the same method where self.bucket_folder and path
are used (the _resolve_path/write_path area) to use this prefix check so only
true prefixes are skipped.
- Line 10: The import model_validator is unused because the cross-field
validator _bucket_folder_suffix currently uses `@field_validator`; change that to
a model-level validator: replace the `@field_validator` decorator on
_bucket_folder_suffix with `@model_validator`(mode="after") and adjust the
function signature to accept (cls, values) (or (self, values) depending on
classmethod/instance semantics) and return the validated values so the
cross-field logic runs correctly; alternatively, if you opt not to convert the
validator, simply remove model_validator from the import list to fix the lint
error.

In `@src/integrations/prefect-gcp/tests/conftest.py`:
- Around line 14-19: The fixture prefect_db is passing an invalid keyword
argument to prefect_test_harness (server_timeout) which will raise a TypeError;
change the argument name to server_startup_timeout when calling
prefect_test_harness in the prefect_db fixture so it matches the function
signature and increases the startup timeout for CI.

In `@src/integrations/prefect-gcp/tests/test_cloud_storage.py`:
- Around line 143-165: The test fixture builds an already-prefixed path with an
extra slash because bucket_folder contains a trailing '/', so update
test_resolve_path_no_double_nesting to construct already_prefixed_path without
adding another '/' (e.g., concatenate bucket_folder and "abc123" directly or
strip/normalize the trailing slash) before calling gcs_bucket._resolve_path;
ensure assertions still check equality to that normalized already_prefixed_path
and that it does not start with bucket_folder repeated.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bda7a19 and 9077e3a.

📒 Files selected for processing (5)

src/integrations/prefect-gcp/prefect_gcp/cloud_storage.py
src/integrations/prefect-gcp/pyproject.toml
src/integrations/prefect-gcp/tests/conftest.py
src/integrations/prefect-gcp/tests/projects/test_steps.py
src/integrations/prefect-gcp/tests/test_cloud_storage.py

🧰 Additional context used

🧬 Code graph analysis (3)

src/integrations/prefect-gcp/tests/projects/test_steps.py (1)

src/prefect/utilities/dockerutils.py (1)

write_text (298-307)

src/integrations/prefect-gcp/tests/test_cloud_storage.py (1)

src/integrations/prefect-gcp/prefect_gcp/cloud_storage.py (1)

_resolve_path (715-743)

src/integrations/prefect-gcp/tests/conftest.py (1)

src/prefect/testing/utilities.py (1)

prefect_test_harness (114-197)

🪛 GitHub Actions: Integrations Packages Tests

src/integrations/prefect-gcp/tests/conftest.py

[error] 19-19: TypeError: prefect_test_harness() got an unexpected keyword argument 'server_timeout'. This prevented test harness setup for Prefect integration tests.

🪛 GitHub Actions: Static analysis

src/integrations/prefect-gcp/prefect_gcp/cloud_storage.py

[error] 1-1: ruff-check: 1 error fixed (unused import). Removed unused 'model_validator' import; file modified by pre-commit hook. Command: pre-commit run --show-diff-on-failure --color=always --all-files.

🪛 Ruff (0.14.13)

src/integrations/prefect-gcp/prefect_gcp/cloud_storage.py

711-711: Avoid specifying long messages outside the exception class

(TRY003)

🔇 Additional comments (2)

src/integrations/prefect-gcp/pyproject.toml (1)

103-105: Warning filter updates look fine.

src/integrations/prefect-gcp/tests/projects/test_steps.py (1)

57-57: LGTM — ignore patterns remain covered.

_{✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.}

coderabbitai · 2026-01-22T05:31:38Z

src/integrations/prefect-gcp/prefect_gcp/cloud_storage.py

 from typing import Any, BinaryIO, Dict, List, Optional, Tuple, Union

-from pydantic import Field, field_validator
+from pydantic import Field, field_validator, model_validator


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # First, let's check the file exists and read the import statement around line 10 echo "=== Checking import statement at line 10 ===" sed -n '1,20p' src/integrations/prefect-gcp/prefect_gcp/cloud_storage.py # Check for any usage of model_validator in the file echo -e "\n=== Searching for model_validator usage ===" rg -n 'model_validator' src/integrations/prefect-gcp/prefect_gcp/cloud_storage.py # Check what's at lines 698-713 (the mentioned additional location) echo -e "\n=== Checking lines 698-713 ===" sed -n '698,713p' src/integrations/prefect-gcp/prefect_gcp/cloud_storage.py # Count total lines in file for context echo -e "\n=== File info ===" wc -l src/integrations/prefect-gcp/prefect_gcp/cloud_storage.py

Repository: qodo-benchmark/prefect-combined-coderabbit

Length of output: 1902

Fix unused model_validator import (CI lint failure).

The model_validator import on line 10 is unused. The cross-field validation in _bucket_folder_suffix (lines 698-713) currently uses @field_validator but should use @model_validator for proper Pydantic validation semantics. Either remove the unused import or refactor the validation to actually use @model_validator.

Option 1: Remove unused import

-from pydantic import Field, field_validator, model_validator +from pydantic import Field, field_validator

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

from pydantic import Field, field_validator, model_validator

from pydantic import Field, field_validator

🤖 Prompt for AI Agents

In `@src/integrations/prefect-gcp/prefect_gcp/cloud_storage.py` at line 10, The import model_validator is unused because the cross-field validator _bucket_folder_suffix currently uses `@field_validator`; change that to a model-level validator: replace the `@field_validator` decorator on _bucket_folder_suffix with `@model_validator`(mode="after") and adjust the function signature to accept (cls, values) (or (self, values) depending on classmethod/instance semantics) and return the validated values so the cross-field logic runs correctly; alternatively, if you opt not to convert the validator, simply remove model_validator from the import list to fix the lint error.

coderabbitai · 2026-01-22T05:31:38Z

src/integrations/prefect-gcp/prefect_gcp/cloud_storage.py

        # If bucket_folder provided, it means we won't write to the root dir of
        # the bucket. So we need to add it on the front of the path.
+        #
+        # However, avoid double-nesting if path is already prefixed with bucket_folder.
+        # This can happen when storage_block_id is null (e.g., context serialized to
+        # remote workers), causing create_result_record() to add bucket_folder to
+        # storage_key, then write_path() calls _resolve_path again.
+        # See https://github.com/PrefectHQ/prefect/issues/20174
+        if self.bucket_folder and self.bucket_folder in path:
+            return path


⚠️ Potential issue | 🟠 Major

Avoid substring matches when checking for existing prefixes.
self.bucket_folder in path will skip prefixing when the folder name appears later in the path (e.g., archive/results/file). Use a prefix/path-aware check instead.

🛠️ Proposed fix

- if self.bucket_folder and self.bucket_folder in path: + if self.bucket_folder and path.startswith(self.bucket_folder): return path

🤖 Prompt for AI Agents

In `@src/integrations/prefect-gcp/prefect_gcp/cloud_storage.py` around lines 726 - 735, The current check uses "self.bucket_folder in path" which matches substrings anywhere; change it to a path-prefix-aware check: normalize leading/trailing slashes and then test whether the path equals the bucket_folder or startswith bucket_folder + "/" (e.g., normalized_path == normalized_bucket_folder or normalized_path.startswith(normalized_bucket_folder + "/")). Update the logic in the same method where self.bucket_folder and path are used (the _resolve_path/write_path area) to use this prefix check so only true prefixes are skipped.

coderabbitai · 2026-01-22T05:31:39Z

src/integrations/prefect-gcp/tests/conftest.py

 @pytest.fixture(scope="session", autouse=True)
 def prefect_db():
-    with prefect_test_harness():
-        yield
-
-
-@pytest.fixture(scope="session", autouse=True)
-def disable_logging():
-    with temporary_settings({PREFECT_LOGGING_TO_API_ENABLED: False}):
+    # Increase timeout for CI environments where multiple xdist workers
+    # start servers simultaneously, which can be slower on Python 3.11+
+    # See https://github.com/PrefectHQ/prefect/issues/16397
+    with prefect_test_harness(server_timeout=60):


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Find the definition of prefect_test_harness rg -n "def prefect_test_harness" -g "*.py" -B 2 -A 10 # Find all usages to understand the pattern rg -n "prefect_test_harness\(" -g "*.py" -B 1 -A 1

Repository: qodo-benchmark/prefect-combined-coderabbit

Length of output: 7663

Use the correct prefect_test_harness parameter name.

server_timeout is not a valid argument. The correct parameter is server_startup_timeout as defined in the function signature. Using an invalid keyword argument will cause a TypeError at runtime.

🛠️ Proposed fix

- with prefect_test_harness(server_timeout=60): + with prefect_test_harness(server_startup_timeout=60):

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

@pytest.fixture(scope="session", autouse=True)

def prefect_db():

with prefect_test_harness():

yield

@pytest.fixture(scope="session", autouse=True)

def disable_logging():

with temporary_settings({PREFECT_LOGGING_TO_API_ENABLED: False}):

# Increase timeout for CI environments where multiple xdist workers

# start servers simultaneously, which can be slower on Python 3.11+

# See https://github.com/PrefectHQ/prefect/issues/16397

with prefect_test_harness(server_timeout=60):

`@pytest.fixture`(scope="session", autouse=True)

def prefect_db():

# Increase timeout for CI environments where multiple xdist workers

# start servers simultaneously, which can be slower on Python 3.11+

# See https://github.com/PrefectHQ/prefect/issues/16397

with prefect_test_harness(server_startup_timeout=60):

🧰 Tools

🪛 GitHub Actions: Integrations Packages Tests

[error] 19-19: TypeError: prefect_test_harness() got an unexpected keyword argument 'server_timeout'. This prevented test harness setup for Prefect integration tests.

🤖 Prompt for AI Agents

In `@src/integrations/prefect-gcp/tests/conftest.py` around lines 14 - 19, The fixture prefect_db is passing an invalid keyword argument to prefect_test_harness (server_timeout) which will raise a TypeError; change the argument name to server_startup_timeout when calling prefect_test_harness in the prefect_db fixture so it matches the function signature and increases the startup timeout for CI.

coderabbitai · 2026-01-22T05:31:39Z

src/integrations/prefect-gcp/tests/test_cloud_storage.py

+    def test_resolve_path_no_double_nesting(self, gcs_bucket):
+        """
+        Regression test for https://github.com/PrefectHQ/prefect/issues/20174
+
+        When storage_block_id is null (e.g., context serialized to Ray workers),
+        create_result_record() adds bucket_folder to storage_key via _resolve_path.
+        Then write_path() calls _resolve_path again. Without the duplicate check,
+        this causes double-nested paths like "results/results/abc123".
+        """
+        bucket_folder = gcs_bucket.bucket_folder
+        if not bucket_folder:
+            pytest.skip("Test only applies when bucket_folder is set")
+
+        # Simulate path that already has bucket_folder prefix
+        # (as would happen when create_result_record calls _resolve_path)
+        already_prefixed_path = f"{bucket_folder}/abc123"
+
+        # When write_path calls _resolve_path again, it should NOT double-nest
+        result = gcs_bucket._resolve_path(already_prefixed_path)
+
+        # Should return the same path, not bucket_folder/bucket_folder/abc123
+        assert result == already_prefixed_path
+        assert not result.startswith(f"{bucket_folder}{bucket_folder}")


⚠️ Potential issue | 🟡 Minor

Avoid double slash in the already‑prefixed path fixture.
bucket_folder already includes a trailing /, so interpolating /{...} produces // and isn’t representative of actual paths.

🛠️ Proposed fix

- already_prefixed_path = f"{bucket_folder}/abc123" + already_prefixed_path = str(PurePosixPath(bucket_folder) / "abc123")

🤖 Prompt for AI Agents

In `@src/integrations/prefect-gcp/tests/test_cloud_storage.py` around lines 143 - 165, The test fixture builds an already-prefixed path with an extra slash because bucket_folder contains a trailing '/', so update test_resolve_path_no_double_nesting to construct already_prefixed_path without adding another '/' (e.g., concatenate bucket_folder and "abc123" directly or strip/normalize the trailing slash) before calling gcs_bucket._resolve_path; ensure assertions still check equality to that normalized already_prefixed_path and that it does not start with bucket_folder repeated.

zzstoatzz and others added 6 commits January 21, 2026 15:46

increase test harness timeout to 60s for CI stability

0544677

fixes ephemeral server startup timeouts on Python 3.11+ with xdist workers see PrefectHQ#16397 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

chore(prefect-gcp): trim warning filters

ae8cd6a

update pr

9077e3a

coderabbitai bot reviewed Jan 22, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(prefect-gcp): prevent double-nesting in GcsBucket._resolve_path#18

tomerqodo commented Jan 22, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Jan 22, 2026 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Jan 22, 2026

Uh oh!

coderabbitai bot Jan 22, 2026

Uh oh!

coderabbitai bot Jan 22, 2026

Uh oh!

coderabbitai bot Jan 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

	from pydantic import Field, field_validator, model_validator
	from pydantic import Field, field_validator

Conversation

tomerqodo commented Jan 22, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Jan 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 22, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 22, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 22, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 22, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

tomerqodo commented Jan 22, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 22, 2026 •

edited

Loading