Skip to content
This repository was archived by the owner on Jan 23, 2026. It is now read-only.

Conversation

@michalskrivanek
Copy link
Contributor

@michalskrivanek michalskrivanek commented Sep 30, 2025

When lease is not available we try for 300s to acquire it. Users can also iterrupt this with Ctrl+C while waiting. Let's remove such lease requests since after termination the shell is gone and it leaves behind a mostly useless lease.

Fixes-Issue: jumpstarter-dev/jumpstarter#119
Fixes-Issue: jumpstarter-dev/jumpstarter#120

Summary by CodeRabbit

  • New Features

    • Configurable lease acquisition timeout in client settings (default 2 hours) and reflected in saved config.
    • Signal-aware shell orchestration with cancellation-aware exit codes and continued exporter log streaming.
    • Automatic re-authentication for expired credentials with unified error reporting.
  • Bug Fixes

    • More reliable lease acquisition with invalid-state detection and reduced polling frequency.
    • Improved cleanup on exit with guarded deletion and timeout warnings.
  • Documentation

    • Clarified shell launch context docstring.

@netlify
Copy link

netlify bot commented Sep 30, 2025

Deploy Preview for jumpstarter-docs ready!

Name Link
🔨 Latest commit f20fd63
🔍 Latest deploy log https://app.netlify.com/projects/jumpstarter-docs/deploys/68dd1de09230b600081fc553
😎 Deploy Preview https://deploy-preview-671--jumpstarter-docs.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Sep 30, 2025

Walkthrough

Adds async orchestration for CLI shell execution with signal-aware cancellation, lease acquisition with configurable timeout and exporter tracking, exporter-log streaming integration, expanded exception handling for BaseExceptionGroup and reauthentication helpers, client config for lease timeouts, and corresponding test updates.

Changes

Cohort / File(s) Summary
CLI shell orchestration
packages/jumpstarter-cli/jumpstarter_cli/shell.py
Reworks shell entry to an anyio-driven async flow with signal handling; adds _shell_with_signal_handling and _run_shell_with_lease; coordinates lease acquisition, optional exporter-log streaming, remote shell launch, and maps cancellation/timeout to exit codes.
Lease acquisition and lifecycle
packages/jumpstarter/jumpstarter/client/lease.py
Adds acquisition_timeout and exporter_name fields; uses fail_after(self.acquisition_timeout) during acquire with 5s poll interval; handles Invalid/Unsatisfiable states, raises LeaseError on timeout/invalid, eagerly initializes via request_async, and shields cleanup with a 30s fail_after.
Exception handling and reauth
packages/jumpstarter-cli-common/jumpstarter_cli_common/exceptions.py
Extends handlers to support BaseExceptionGroup; adds helpers _handle_connection_error_with_reauth, _handle_single_exception_with_reauth, _handle_exception_group_with_reauth; exposes handle_exceptions_with_reauthentication; updates leaf_exceptions signature and traceback combination logic.
Client config for leases
packages/jumpstarter/jumpstarter/config/client.py
Adds ClientConfigV1Alpha1Lease model with acquisition_timeout (default 7200) and nests leases into ClientConfigV1Alpha1; passes configured acquisition_timeout into Lease construction.
Config serialization tests
packages/jumpstarter/jumpstarter/config/client_config_test.py
Updates three tests’ expected YAML to include shell.leases.acquisition_timeout: 7200.
Docs / small tweak
packages/jumpstarter/jumpstarter/common/utils.py
Clarifies launch_shell docstring that context can be an exporter name; no logic changes.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant CLI as CLI command
  participant AnyIO as anyio.run
  participant Portal as BlockingPortal / thread
  participant Lease as Lease manager
  participant Exporter as Exporter logs
  participant Shell as Remote shell

  User->>CLI: invoke shell
  CLI->>AnyIO: start _shell_with_signal_handling()
  AnyIO->>Portal: enter portal to run blocking lease task
  AnyIO->>Lease: acquire (timeout = config.leases.acquisition_timeout)
  alt lease acquired
    par stream logs (optional)
      Portal->>Exporter: stream exporter logs
    and launch shell
      Portal->>Shell: start remote shell session
      Shell-->>Portal: exit status
    end
  else invalid / timeout
    Lease-->>Portal: raises LeaseError / TimeoutError
  end
  AnyIO->>AnyIO: signal handler -> cancel task group
  AnyIO->>Lease: cleanup (shielded, 30s)
  AnyIO-->>CLI: return mapped exit code (0 / 2 / error)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • jumpstarter-dev/jumpstarter#293 — Overlaps Lease acquisition logic and error handling changes (LeaseError, timeout behavior).
  • jumpstarter-dev/jumpstarter#365 — Related updates to lease wiring and client config for lease parameters.
  • jumpstarter-dev/jumpstarter#559 — Touches exception-handling and reauthentication paths similar to the new helpers and BaseExceptionGroup handling.

Suggested reviewers

  • mangelajo

Poem

hop-hop, I hold the lease in queue,
I watch for signals—soft and true.
logs whisper low while shells take flight,
cancel scopes cuddle through the night.
configs bloom a timeout seed—quick nibble, tidy deed. 🐇✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 44.44% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
Title Check ❓ Inconclusive The title “Lease handling user enhancements” is related to lease handling but is overly generic and does not clearly convey the primary change of cleaning up temporary lease requests when a user interrupts the operation. It does not highlight the key fix—removing persistent lease requests on Ctrl+C—and uses non-descriptive phrasing that may not inform a reviewer scanning history of the main improvement. Consider renaming the pull request to clearly reflect the core change, such as “Clean up temporary lease requests on user interrupt” or “Remove persistent lease requests on Ctrl+C,” to make the purpose and impact of the changes immediately apparent.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 32d728b and f20fd63.

📒 Files selected for processing (5)
  • packages/jumpstarter-cli-common/jumpstarter_cli_common/exceptions.py (4 hunks)
  • packages/jumpstarter-cli/jumpstarter_cli/shell.py (3 hunks)
  • packages/jumpstarter/jumpstarter/client/lease.py (4 hunks)
  • packages/jumpstarter/jumpstarter/config/client.py (3 hunks)
  • packages/jumpstarter/jumpstarter/config/client_config_test.py (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
packages/jumpstarter/jumpstarter/client/lease.py (3)
packages/jumpstarter/jumpstarter/common/condition.py (4)
  • condition_true (26-27)
  • condition_message (16-23)
  • condition_false (30-31)
  • condition_present_and_equal (6-13)
packages/jumpstarter/jumpstarter/client/exceptions.py (1)
  • LeaseError (4-5)
packages/jumpstarter/jumpstarter/client/grpc.py (1)
  • DeleteLease (383-389)
packages/jumpstarter-cli-common/jumpstarter_cli_common/exceptions.py (1)
packages/jumpstarter/jumpstarter/common/exceptions.py (3)
  • JumpstarterException (4-39)
  • get_config (32-33)
  • ConnectionError (42-45)
packages/jumpstarter-cli/jumpstarter_cli/shell.py (3)
packages/jumpstarter-cli-common/jumpstarter_cli_common/signal.py (1)
  • signal_handler (9-22)
packages/jumpstarter/jumpstarter/config/client.py (2)
  • lease (135-143)
  • lease_async (247-284)
packages/jumpstarter/jumpstarter/common/utils.py (1)
  • launch_shell (49-127)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Redirect rules - jumpstarter-docs
  • GitHub Check: Header rules - jumpstarter-docs
  • GitHub Check: pytest-matrix (macos-15, 3.11)
  • GitHub Check: e2e
  • GitHub Check: Pages changed - jumpstarter-docs
  • GitHub Check: build
  • GitHub Check: pytest-matrix (macos-15, 3.13)
  • GitHub Check: pytest-matrix (ubuntu-24.04, 3.12)
  • GitHub Check: pytest-matrix (ubuntu-24.04, 3.11)
  • GitHub Check: pytest-matrix (ubuntu-24.04, 3.13)
  • GitHub Check: pytest-matrix (macos-15, 3.12)
🔇 Additional comments (5)
packages/jumpstarter/jumpstarter/client/lease.py (5)

12-19: LGTM! Imports align with new timeout and cancellation handling.

The added imports (CancelScope, fail_after, sleep) are correctly used in the refactored _acquire method and cleanup logic.


50-51: LGTM! New fields support configurable timeout and exporter tracking.

The acquisition_timeout field enables flexible lease acquisition timeouts (default 2 hours), and exporter_name captures the exporter during acquisition. Both fields are properly initialized and documented.


69-69: LGTM! Log message accurately reflects the acquisition phase.

The updated message "Acquiring lease" is more precise, as it logs after the lease request is created and before polling begins.


119-156: LGTM! Acquisition timeout and condition handling implemented correctly.

The refactored _acquire method now:

  • Enforces a configurable timeout using fail_after(self.acquisition_timeout)
  • Handles the new "Invalid" condition (lines 136-139)
  • Polls every 5 seconds (increased from 1s for reduced server load)
  • Captures exporter_name on successful acquisition (line 127)
  • Converts TimeoutError to LeaseError with clear messaging (lines 152-156)

The condition checks are comprehensive (Ready, Unsatisfiable, Invalid, Pending, Released) and align with the PR objective of cleaning up temporary lease requests on user interruption.


164-174: LGTM! Cleanup properly guards against cancellation and missing lease names.

The cleanup logic now:

  • Guards deletion with if self.release and self.name: (line 164), preventing attempts to delete a lease before self.name is set (addresses past review comment)
  • Uses CancelScope(shield=True) (line 167) to ensure cleanup completes even if the user interrupts with Ctrl+C
  • Enforces a 30-second timeout (line 169) to prevent hanging
  • Logs a warning if cleanup times out (line 174)

This implementation aligns with the PR objective of cleaning up temporary lease requests when users interrupt the operation.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@michalskrivanek
Copy link
Contributor Author

fixes jumpstarter-dev/jumpstarter#119

@michalskrivanek
Copy link
Contributor Author

@coderabbitai review

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Sep 30, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
packages/jumpstarter/jumpstarter/client/lease.py (2)

148-149: Consider moving CancelScope to top-level imports.

While the local import works correctly, moving CancelScope to the top-level anyio import block (lines 12-18) would improve consistency since other anyio symbols are already imported there.

Apply this diff to move the import:

 from anyio import (
     AsyncContextManagerMixin,
     ContextManagerMixin,
+    CancelScope,
     create_task_group,
     fail_after,
     sleep,
 )

And remove the local import:

     @asynccontextmanager
     async def __asynccontextmanager__(self) -> AsyncGenerator[Self]:
-        from anyio import CancelScope
-
         try:

156-160: Consider adding a timeout within the cancellation shield.

While the cancellation shield ensures cleanup completes, if DeleteLease hangs indefinitely, it could delay application shutdown. Consider adding a timeout within the shielded scope for robustness.

Apply this diff to add a timeout:

                 # Shield cleanup from cancellation to ensure it completes
                 with CancelScope(shield=True):
-                    await self.svc.DeleteLease(
-                        name=self.name,
-                    )
+                    with fail_after(30):  # Timeout for cleanup
+                        await self.svc.DeleteLease(
+                            name=self.name,
+                        )
packages/jumpstarter-cli/jumpstarter_cli/shell.py (1)

49-63: Consider using async context managers directly.

Since lease_acquisition is already async, you could simplify by using async context managers (serve_unix_async, monitor_async, connect_async) directly instead of creating a BlockingPortal and running sync wrappers in a thread. This would reduce complexity and eliminate the thread boundary.

Example refactor:

async def lease_acquisition():
    nonlocal exit_code
    try:
        async with config.lease_async(selector, lease_name, duration) as lease:
            async with lease.serve_unix_async() as path:
                async with lease.monitor_async():
                    if exporter_logs:
                        from contextlib import AsyncExitStack
                        async with AsyncExitStack() as stack:
                            async with lease.connect_async(stack) as client:
                                async with client.log_stream():
                                    exit_code = await anyio.to_thread.run_sync(shell_launcher, path)
                    else:
                        exit_code = await anyio.to_thread.run_sync(shell_launcher, path)
    except BaseExceptionGroup as eg:
        for exc in eg.exceptions:
            if isinstance(exc, TimeoutError):
                raise exc from None
        raise
    except get_cancelled_exc_class():
        exit_code = 2

Note: This requires verifying that config.lease_async doesn't require a portal parameter, or adjusting the signature accordingly.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 748be71 and ebed8db.

📒 Files selected for processing (2)
  • packages/jumpstarter-cli/jumpstarter_cli/shell.py (3 hunks)
  • packages/jumpstarter/jumpstarter/client/lease.py (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: pytest-matrix (macos-15, 3.11)
  • GitHub Check: pytest-matrix (macos-15, 3.13)
🔇 Additional comments (8)
packages/jumpstarter/jumpstarter/client/lease.py (2)

12-18: LGTM! Import formatting improved.

The anyio imports are properly formatted and all are used within the file.


154-160: Cancellation shield correctly ensures cleanup completes.

The addition of CancelScope(shield=True) around DeleteLease ensures that lease cleanup completes even when the task is cancelled, which directly addresses the PR objective of cleaning up leases on timeout and termination.

packages/jumpstarter-cli/jumpstarter_cli/shell.py (6)

4-4: LGTM! New imports are properly used.

The new anyio and signal handling imports are correctly utilized in the refactored async flow.

Also applies to: 6-6, 9-9


18-29: LGTM! Helper function correctly encapsulates shell launching.

The closure properly captures configuration and command parameters for later execution.


32-41: LGTM! Context manager nesting is correct.

The function properly orchestrates lease context managers and conditionally enables log streaming.


44-76: Async orchestration correctly implements signal-driven cancellation.

The function properly coordinates lease acquisition, shell execution, signal handling, and cleanup with appropriate exit codes for different termination scenarios.


62-63: LGTM! Cancellation exception handling is correct.

The handler properly catches backend-specific cancellation exceptions via get_cancelled_exc_class() and sets the appropriate exit code for signal-driven termination.


105-108: LGTM! Integration with async flow is correct.

The shell command properly delegates to the new async orchestration and exits with the appropriate code.

@michalskrivanek
Copy link
Contributor Author

@coderabbitai review

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Sep 30, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
packages/jumpstarter-cli/jumpstarter_cli/shell.py (1)

39-39: The cancellation exception handling is correct; past review comment is incorrect.

Lines 39 and 56-57 correctly handle cancellation:

  • Line 39: get_cancelled_exc_class() returns the exception class itself (either asyncio.CancelledError or trio.Cancelled depending on backend)
  • Line 56: The stored class is used directly in the except clause

This pattern is valid and equivalent to calling get_cancelled_exc_class() directly in the except clause. The past review suggesting get_cancelled_exc_class()() is incorrect—the function returns a class, not a factory function.

Based on learnings

Also applies to: 56-57

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ebed8db and 33dc02b.

📒 Files selected for processing (2)
  • packages/jumpstarter-cli/jumpstarter_cli/shell.py (3 hunks)
  • packages/jumpstarter/jumpstarter/client/lease.py (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: e2e
  • GitHub Check: pytest-matrix (macos-15, 3.13)
  • GitHub Check: pytest-matrix (ubuntu-24.04, 3.11)
  • GitHub Check: pytest-matrix (ubuntu-24.04, 3.12)
  • GitHub Check: pytest-matrix (ubuntu-24.04, 3.13)
  • GitHub Check: pytest-matrix (macos-15, 3.11)
🔇 Additional comments (6)
packages/jumpstarter/jumpstarter/client/lease.py (2)

12-19: LGTM!

The added imports (CancelScope, fail_after) correctly support the cancellation-shielded cleanup logic introduced in the __asynccontextmanager__ method.


147-163: Excellent cancellation-safe cleanup implementation!

The changes ensure robust lease cleanup:

  1. Line 150: Moving request_async() inside the try block guarantees cleanup runs even if acquisition fails
  2. Lines 155-163: CancelScope(shield=True) correctly prevents outer cancellation from interrupting the DeleteLease operation
  3. Line 158: The 30-second fail_after timeout prevents indefinite blocking during cleanup
  4. Lines 162-163: Appropriate warning on timeout—the server will eventually clean up orphaned leases

This aligns with anyio best practices for resource cleanup under cancellation.

Based on learnings

packages/jumpstarter-cli/jumpstarter_cli/shell.py (4)

4-4: LGTM!

The new imports correctly support the async orchestration: anyio for the event loop, create_task_group and get_cancelled_exc_class for structured concurrency and cancellation handling, and signal_handler for graceful shutdown.

Also applies to: 6-6, 9-9


18-33: LGTM!

The function correctly extracts the synchronous lease context management chain, preserving the original behavior while enabling integration with the async orchestration. The nested launch_remote_shell helper appropriately captures the path parameter.


36-64: Excellent async orchestration with proper signal handling!

The function correctly implements structured concurrency with three exit paths:

  1. Normal exit (code 0): Shell completes successfully
  2. Timeout (code 1): Lease acquisition timeout (300s in _acquire) extracted from BaseExceptionGroup and caught by except* TimeoutError
  3. Cancellation (code 2): Signal handler cancels the scope, raising a cancellation exception

The nested try-finally structure ensures the task group is always cancelled, preventing the signal handler task from lingering after normal completion.

Based on learnings


93-96: LGTM!

The integration point correctly launches the async orchestration via anyio.run and propagates the exit code to sys.exit, ensuring proper process termination with timeout (1), cancellation (2), or success (0) codes.

@michalskrivanek michalskrivanek force-pushed the lease-cleanup branch 2 times, most recently from c2f233a to 44038b4 Compare October 1, 2025 07:02
@michalskrivanek michalskrivanek changed the title Clean up leases on timeout and termination Lease handling user enhancements Oct 1, 2025
@michalskrivanek
Copy link
Contributor Author

@coderabbitai review

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 1, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 33dc02b and 70d3175.

📒 Files selected for processing (4)
  • packages/jumpstarter-cli-common/jumpstarter_cli_common/exceptions.py (2 hunks)
  • packages/jumpstarter-cli/jumpstarter_cli/shell.py (3 hunks)
  • packages/jumpstarter/jumpstarter/client/lease.py (5 hunks)
  • packages/jumpstarter/jumpstarter/common/utils.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • packages/jumpstarter/jumpstarter/common/utils.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: e2e
  • GitHub Check: pytest-matrix (ubuntu-24.04, 3.12)
  • GitHub Check: pytest-matrix (ubuntu-24.04, 3.11)
  • GitHub Check: pytest-matrix (ubuntu-24.04, 3.13)

@mangelajo
Copy link
Member

(jumpstarter) ➜  jumpstarter git:(lease-cleanup) ✗ uv run jmp shell -l device=nxp-s32g-vnp-rdb3-01                                                                                                    
Installed 4 packages in 3ms
[10/01/25 08:04:19] INFO     INFO:jumpstarter.client.lease:Acquiring lease 01999ecd-02b7-7896-b6bd-fcab66555594 for selector device=nxp-s32g-vnp-rdb3-01 for duration 0:30:00                                                                lease.py:68
^CSIGINT pressed, terminating
[10/01/25 08:04:23] INFO     INFO:jumpstarter.client.lease:Releasing Lease 01999ecd-02b7-7896-b6bd-fcab66555594                                                                                                                             lease.py:158
(jumpstarter) ➜  jumpstarter git:(lease-cleanup) ✗ uv run jmp get leases
Installed 4 packages in 2ms
NAME                                  SELECTOR                                   DURATION          CLIENT           EXPORTER                       
01990c47-87ef-701e-abf1-f2db333f7ef7  board-type=qc8775,enabled=true             31 days, 0:00:00  sberg            qti-snapdragon-ride4-sa8775p-03
019981ac-5505-73de-a856-daba35ea64a4  device=ti-jacinto-j784s4xevm-04            37 days, 0:00:00  bzlotnik         ti-jacinto-j784s4xevm-04       
01999656-5308-7c31-8c7f-edef76ad50b1  board-type=nxp-s32g-vnp-rdb3,enabled=true  2 days, 0:00:00   rrendec          nxp-s32g-vnp-rdb3-02           
01999aaf-2d2e-73e0-b5e3-b22c55ef621d  device=renesas-rcar-s4-01                  37 days, 0:00:00  test-console-ci  renesas-rcar-s4-01             
01999ba1-1f2d-7aa9-8b30-a58490538657  board-type=j784s4evm,enabled=true          1 day, 0:00:00    sbertram         ti-jacinto-j784s4xevm-09       
01999be1-d92d-77b8-9234-5326d02ce7f2  board-type=nxp-s32g-vnp-rdb3,enabled=true  1 day, 0:00:00    sbertram         nxp-s32g-vnp-rdb3-01           
01999ecc-cf00-7a8b-b68f-d0d4f6c29e56  device=ti-jacinto-j784s4xevm-22            0:30:00           mskrivan         ti-jacinto-j784s4xevm-22       
(jumpstarter) ➜  jumpstarter git:(lease-cleanup) ✗ uv run jmp shell -l device=nxp-s32g-vnp-rdb3-01-doesntexist
Installed 4 packages in 2ms
[10/01/25 08:04:46] INFO     INFO:jumpstarter.client.lease:Acquiring lease 01999ecd-69da-7aa7-a3d8-407859f35b07 for selector device=nxp-s32g-vnp-rdb3-01-doesntexist for duration 0:30:00                                                    lease.py:68
                    INFO     INFO:jumpstarter.client.lease:Releasing Lease 01999ecd-69da-7aa7-a3d8-407859f35b07                                                                                                                             lease.py:158
Error: the lease cannot be satisfied: There are no online exporter matching the selector, but there are 0 matching offline exporters

Manual verification of your patch since it's a bit complex to add to E2E..
it looks great!

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
packages/jumpstarter-cli/jumpstarter_cli/shell.py (1)

50-57: Plain TimeoutError will crash CLI; map it to exit code 1.

A timeout during lease acquisition can raise a plain TimeoutError (not in an ExceptionGroup). It bypasses current handlers and propagates. Catch it explicitly before the cancellation branch.

             except BaseExceptionGroup as eg:
                 for exc in eg.exceptions:
                     if isinstance(exc, TimeoutError):
                         raise exc from None
                 raise
+            except TimeoutError:
+                exit_code = 1
             except cancelled_exc_class:
                 exit_code = 2
🧹 Nitpick comments (6)
packages/jumpstarter/jumpstarter/config/client.py (1)

89-99: Set an explicit env prefix for nested Lease settings (consistency).

Other nested settings (e.g., Drivers) define an env prefix. Consider adding one here so users can override via env (e.g., JMP_LEASES_ACQUISITION_TIMEOUT) and to avoid accidental global env binding.

 class ClientConfigV1Alpha1Lease(BaseSettings):
     """Configuration for lease operations."""
 
+    model_config = SettingsConfigDict(env_prefix="JMP_LEASES_")
+
     acquisition_timeout: int = Field(
         default=300,
         description="Timeout in seconds for lease acquisition",
         ge=5,  # Must be at least 5 seconds (polling interval)
         le=43200,  # Maximum 12 hours
     )
packages/jumpstarter-cli/jumpstarter_cli/shell.py (1)

18-34: Optional: avoid portal by using async lease APIs and offload only the shell.

You could run serve_unix_async/monitor_async/log streaming directly in the event loop and use to_thread.run_sync just for launch_shell. Reduces cross-thread complexity.

Example sketch:

async with lease.serve_unix_async() as path, lease.monitor_async():
    if exporter_logs:
        async with lease.connect_async(ExitStack()) as client, client.log_stream_async():
            exit_code = await anyio.to_thread.run_sync(launch_remote_shell, path)
    else:
        exit_code = await anyio.to_thread.run_sync(launch_remote_shell, path)
packages/jumpstarter/jumpstarter/client/lease.py (4)

50-52: LGTM: configurable acquisition timeout and exporter name.

Good defaults and clear semantics. Minor: keep "remote" if exporter is empty.

-                        self.exporter_name = result.exporter
+                        if result.exporter:
+                            self.exporter_name = result.exporter

119-157: Robust acquisition loop with clear failure modes.

Covers Ready/Unsatisfiable/Invalid/Released and times out via fail_after. Consider adaptive sleep to reduce overshoot near timeout (nice-to-have).

-                    await sleep(5)
+                    # Sleep up to 5s but do not overshoot the remaining timeout budget
+                    await sleep(min(5, max(0.1, self.acquisition_timeout)))

166-175: Don’t let cleanup errors (other than timeout) escape.

DeleteLease errors during teardown could mask the original exception or interrupt shutdown. Log and continue after the shielded timeout.

                 with CancelScope(shield=True):
                     try:
                         with fail_after(30):
                             await self.svc.DeleteLease(
                                 name=self.name,
                             )
                     except TimeoutError:
                         logger.warning("Timeout while deleting lease %s during cleanup", self.name)
+                    except Exception as e:
+                        logger.warning("Error while deleting lease %s during cleanup: %s", self.name, e)

153-156: Minor: prefer lazy logging formatting for consistency.

Avoid eager f-strings in debug to skip formatting when disabled.

-            logger.debug(f"Lease {self.name} acquisition timed out after {self.acquisition_timeout} seconds")
+            logger.debug("Lease %s acquisition timed out after %s seconds", self.name, self.acquisition_timeout)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 70d3175 and ec1a83b.

📒 Files selected for processing (5)
  • packages/jumpstarter-cli-common/jumpstarter_cli_common/exceptions.py (2 hunks)
  • packages/jumpstarter-cli/jumpstarter_cli/shell.py (3 hunks)
  • packages/jumpstarter/jumpstarter/client/lease.py (4 hunks)
  • packages/jumpstarter/jumpstarter/common/utils.py (1 hunks)
  • packages/jumpstarter/jumpstarter/config/client.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/jumpstarter/jumpstarter/common/utils.py
🧰 Additional context used
🧬 Code graph analysis (3)
packages/jumpstarter/jumpstarter/client/lease.py (3)
packages/jumpstarter/jumpstarter/common/condition.py (4)
  • condition_true (26-27)
  • condition_message (16-23)
  • condition_false (30-31)
  • condition_present_and_equal (6-13)
packages/jumpstarter/jumpstarter/client/exceptions.py (1)
  • LeaseError (4-5)
packages/jumpstarter/jumpstarter/client/grpc.py (1)
  • DeleteLease (383-389)
packages/jumpstarter-cli/jumpstarter_cli/shell.py (4)
packages/jumpstarter-cli-common/jumpstarter_cli_common/signal.py (1)
  • signal_handler (9-22)
packages/jumpstarter/jumpstarter/config/client.py (2)
  • lease (136-144)
  • lease_async (248-285)
packages/jumpstarter/jumpstarter/common/utils.py (1)
  • launch_shell (49-127)
packages/jumpstarter/jumpstarter/client/lease.py (3)
  • serve_unix (237-239)
  • monitor (242-244)
  • connect (231-234)
packages/jumpstarter-cli-common/jumpstarter_cli_common/exceptions.py (1)
packages/jumpstarter/jumpstarter/common/exceptions.py (3)
  • JumpstarterException (4-39)
  • get_config (32-33)
  • ConnectionError (42-45)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Redirect rules - jumpstarter-docs
  • GitHub Check: Header rules - jumpstarter-docs
  • GitHub Check: Pages changed - jumpstarter-docs
  • GitHub Check: e2e
🔇 Additional comments (5)
packages/jumpstarter/jumpstarter/config/client.py (2)

122-123: LGTM: new leases section on client config.

Adds a clear place for lease-related knobs; defaults are sane.


274-275: LGTM: timeout plumbed into Lease.

Timeout is taken from config and enforced by Lease. Boundaries are already validated.

packages/jumpstarter-cli/jumpstarter_cli/shell.py (3)

45-49: LGTM: BlockingPortal usage to bridge the worker thread.

Creating a portal bound to the running loop and passing it into lease_async is the right pattern for wrapping async CMs used from a worker thread.


41-43: LGTM: signal-driven cancellation via TaskGroup.

signal_handler cancels the group; finally-block ensures cleanup. Good structured concurrency.


90-93: LGTM: anyio.run entry point and exit code propagation.

Clean mapping of async result to process exit.


await sleep(5)
except TimeoutError:
logger.debug(f"Lease {self.name} acquisition timed out after {self.acquisition_timeout} seconds")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
logger.debug(f"Lease {self.name} acquisition timed out after {self.acquisition_timeout} seconds")
logger.error(f"Lease {self.name} acquisition timed out after {self.acquisition_timeout} seconds")

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it will be obvious by the exception anyway

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i was trying to compare both debug and normal output and it makes IMHO more sense like this:

$ jmp --log-level=DEBUG shell -l device=ti-jacinto-j784s4xevm-22; echo $?
[10/01/25 13:16:34] DEBUG    DEBUG:asyncio:Using selector: KqueueSelector                                                                                                                                       selector_events.py:64
                    DEBUG    DEBUG:grpc._cython.cygrpc:Using AsyncIOEngine.POLLER as I/O engine                                                                                                                       _channel.py:368
                    DEBUG    DEBUG:jumpstarter.client.lease:Creating lease request for selector device=ti-jacinto-j784s4xevm-22 for duration 0:30:00                                                                      lease.py:61
[10/01/25 13:16:35] INFO     INFO:jumpstarter.client.lease:Acquiring lease 01999f7d-066e-7538-8fff-b56da6c65d9e for selector device=ti-jacinto-j784s4xevm-22 for duration 0:30:00                                         lease.py:69
                    DEBUG    DEBUG:jumpstarter.client.lease:Polling Lease 01999f7d-066e-7538-8fff-b56da6c65d9e                                                                                                           lease.py:122
[10/01/25 13:16:40] DEBUG    DEBUG:jumpstarter.client.lease:Lease 01999f7d-066e-7538-8fff-b56da6c65d9e acquisition timed out after 5 seconds                                                                             lease.py:153
                    INFO     INFO:jumpstarter.client.lease:Releasing Lease 01999f7d-066e-7538-8fff-b56da6c65d9e                                                                                                          lease.py:165
Error: lease 01999f7d-066e-7538-8fff-b56da6c65d9e acquisition timed out after 5 seconds
1

and without debug:
$ jmp shell -l device=ti-jacinto-j784s4xevm-22; echo $?
[10/01/25 13:17:25] INFO     INFO:jumpstarter.client.lease:Acquiring lease 01999f7d-ca41-78d7-a75f-ea5b15af5f6b for selector device=ti-jacinto-j784s4xevm-22 for duration 0:30:00                                         lease.py:69
[10/01/25 13:17:30] INFO     INFO:jumpstarter.client.lease:Releasing Lease 01999f7d-ca41-78d7-a75f-ea5b15af5f6b                                                                                                          lease.py:165
Error: lease 01999f7d-ca41-78d7-a75f-ea5b15af5f6b acquisition timed out after 5 seconds
1

where the "Error: lease...." is red. I.e. the end-user error message is already there, just through the LeaseError handler

Args:
host: The jumpstarter host path
context: The context of the shell ("local" or "remote")
context: The context of the shell (e.g. "local" or exporter name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has this really haned?

I think it.s local/remote

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but it could make sense to set the exporter name instead of remote :D that would be cool

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been ideated by AI! 🌊

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so that's all contained in fairly small jumpstarter-dev/jumpstarter@62d1c9e
using the exporter name from when the lease becomes ready, propagated through so we do not have to fetch it again.

or do you mean anything else?

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ec1a83b and 86d8885.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (3)
  • packages/jumpstarter-cli/jumpstarter_cli/shell.py (3 hunks)
  • packages/jumpstarter/jumpstarter/client/lease.py (4 hunks)
  • packages/jumpstarter/jumpstarter/config/client.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/jumpstarter/jumpstarter/config/client.py
🧰 Additional context used
🧬 Code graph analysis (2)
packages/jumpstarter-cli/jumpstarter_cli/shell.py (4)
packages/jumpstarter-cli-common/jumpstarter_cli_common/signal.py (1)
  • signal_handler (9-22)
packages/jumpstarter/jumpstarter/config/client.py (2)
  • lease (135-143)
  • lease_async (247-284)
packages/jumpstarter/jumpstarter/common/utils.py (1)
  • launch_shell (49-127)
packages/jumpstarter/jumpstarter/client/lease.py (3)
  • serve_unix (237-239)
  • monitor (242-244)
  • connect (231-234)
packages/jumpstarter/jumpstarter/client/lease.py (3)
packages/jumpstarter/jumpstarter/common/condition.py (4)
  • condition_true (26-27)
  • condition_message (16-23)
  • condition_false (30-31)
  • condition_present_and_equal (6-13)
packages/jumpstarter/jumpstarter/client/exceptions.py (1)
  • LeaseError (4-5)
packages/jumpstarter/jumpstarter/client/grpc.py (1)
  • DeleteLease (383-389)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Redirect rules - jumpstarter-docs
  • GitHub Check: Header rules - jumpstarter-docs
  • GitHub Check: Pages changed - jumpstarter-docs
  • GitHub Check: e2e

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
packages/jumpstarter/jumpstarter/client/lease.py (1)

164-174: Guard cleanup when no lease name was allocated.

The cleanup unconditionally attempts to release the lease even when self.name is None (e.g., if lease creation fails or is cancelled before name assignment). This will log "Releasing Lease None" and call DeleteLease(name=None), potentially masking the original error.

Based on past review comments, apply this guard to skip cleanup when no lease was created:

         if self.release:
-            logger.info("Releasing Lease %s", self.name)
-            # Shield cleanup from cancellation to ensure it completes
-            with CancelScope(shield=True):
-                try:
-                    with fail_after(30):
-                        await self.svc.DeleteLease(
-                            name=self.name,
-                        )
-                except TimeoutError:
-                    logger.warning("Timeout while deleting lease %s during cleanup", self.name)
+            if not self.name:
+                logger.debug("Skipping lease cleanup; no lease was created")
+            else:
+                logger.info("Releasing Lease %s", self.name)
+                # Shield cleanup from cancellation to ensure it completes
+                with CancelScope(shield=True):
+                    try:
+                        with fail_after(30):
+                            await self.svc.DeleteLease(name=self.name)
+                    except TimeoutError:
+                        logger.warning("Timeout while deleting lease %s during cleanup", self.name)
packages/jumpstarter-cli-common/jumpstarter_cli_common/exceptions.py (1)

87-87: Good fix: re‑raise the original ExceptionGroup.

This addresses the earlier “bare raise outside except” concern noted in a past review.

🧹 Nitpick comments (5)
pyproject.toml (1)

27-30: Keep workspace sources alphabetized

Please keep the driver list alphabetized to ease scanning. Swapping the two entries restores the existing ordering convention.

 jumpstarter-driver-tasmota = { workspace = true }
-jumpstarter-driver-tmt = { workspace = true }
 jumpstarter-driver-tftp = { workspace = true }
+jumpstarter-driver-tmt = { workspace = true }
packages/jumpstarter-cli-common/jumpstarter_cli_common/exceptions.py (4)

22-30: Make leaf traversal non‑mutating and explicitly re‑raise the group.

You re‑raise the original group unchanged, so don’t mutate leaf tracebacks here; also be explicit with raise eg for clarity.

Apply this diff:

-        except BaseExceptionGroup as eg:
-            # Handle exceptions wrapped in ExceptionGroup (e.g., from task groups)
-            for exc in leaf_exceptions(eg):
+        except BaseExceptionGroup as eg:
+            # Handle exceptions wrapped in ExceptionGroup (e.g., from task groups)
+            for exc in leaf_exceptions(eg, fix_tracebacks=False):
                 if isinstance(exc, JumpstarterException):
                     raise ClickExceptionRed(str(exc)) from None
                 elif isinstance(exc, click.ClickException):
                     raise exc from None
             # If no handled exceptions, re-raise the original group
-            raise
+            raise eg

69-79: Drop boolean return contract; this function either raises or does nothing.

return True after _handle_connection_error_with_reauth() is unreachable (it always raises). Returning False elsewhere isn’t used reliably by the caller. Simplify to “raise or no‑op”.

-def _handle_single_exception_with_reauth(exc, login_func):
-    """Handle a single exception and return True if handled."""
+def _handle_single_exception_with_reauth(exc, login_func):
+    """Handle a single exception (may raise)."""
     if isinstance(exc, ConnectionError):
-        _handle_connection_error_with_reauth(exc, login_func)
-        return True
+        _handle_connection_error_with_reauth(exc, login_func)
     elif isinstance(exc, JumpstarterException):
         raise ClickExceptionRed(str(exc)) from None
     elif isinstance(exc, click.ClickException):
         raise exc from None
-    return False  # Not handled
+    # Not handled: fall through

81-88: ExceptionGroup handler: iterate non‑mutating leaves; remove unreachable return.

The current return is dead code, and mutating leaf tracebacks isn’t needed since you re‑raise the original group when nothing matches.

-def _handle_exception_group_with_reauth(eg, login_func):
+def _handle_exception_group_with_reauth(eg, login_func):
     """Handle exceptions wrapped in BaseExceptionGroup."""
-    for exc in leaf_exceptions(eg):
-        if _handle_single_exception_with_reauth(exc, login_func):
-            return  # Exception was handled and re-raised
+    for exc in leaf_exceptions(eg, fix_tracebacks=False):
+        _handle_single_exception_with_reauth(exc, login_func)
     # If no handled exceptions, re-raise the original group
     raise eg

Optionally annotate this as NoReturn to aid type checkers (add import at top):

from typing import NoReturn
def _handle_exception_group_with_reauth(eg, login_func) -> "NoReturn": ...

110-110: Update PEP link to PEP 654 Replace the pep-0785 URL with https://peps.python.org/pep-0654/ to reference the Exception Groups PEP.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 86d8885 and 5e50735.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (11)
  • packages/jumpstarter-all/pyproject.toml (1 hunks)
  • packages/jumpstarter-cli-common/jumpstarter_cli_common/exceptions.py (2 hunks)
  • packages/jumpstarter-cli-common/jumpstarter_cli_common/opt.py (2 hunks)
  • packages/jumpstarter-cli/jumpstarter_cli/shell.py (3 hunks)
  • packages/jumpstarter-driver-tmt/jumpstarter_driver_tmt/client.py (1 hunks)
  • packages/jumpstarter-driver-tmt/jumpstarter_driver_tmt/driver_test.py (6 hunks)
  • packages/jumpstarter/jumpstarter/client/lease.py (4 hunks)
  • packages/jumpstarter/jumpstarter/common/utils.py (1 hunks)
  • packages/jumpstarter/jumpstarter/config/client.py (3 hunks)
  • packages/jumpstarter/jumpstarter/config/client_config_test.py (3 hunks)
  • pyproject.toml (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • packages/jumpstarter/jumpstarter/config/client.py
  • packages/jumpstarter/jumpstarter/common/utils.py
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-01-29T11:52:43.554Z
Learnt from: bennyz
PR: jumpstarter-dev/jumpstarter#241
File: packages/jumpstarter-driver-tftp/jumpstarter_driver_tftp/client.py:52-60
Timestamp: 2025-01-29T11:52:43.554Z
Learning: The TFTP driver (packages/jumpstarter-driver-tftp/jumpstarter_driver_tftp/driver.py) handles all low-level concerns like path validation, error handling, and checksum computation. The client (packages/jumpstarter-driver-tftp/jumpstarter_driver_tftp/client.py) should remain simple as it delegates these responsibilities to the driver.

Applied to files:

  • packages/jumpstarter-all/pyproject.toml
🧬 Code graph analysis (3)
packages/jumpstarter-cli/jumpstarter_cli/shell.py (4)
packages/jumpstarter-cli-common/jumpstarter_cli_common/signal.py (1)
  • signal_handler (9-22)
packages/jumpstarter/jumpstarter/config/client.py (2)
  • lease (135-143)
  • lease_async (247-284)
packages/jumpstarter/jumpstarter/common/utils.py (1)
  • launch_shell (49-127)
packages/jumpstarter/jumpstarter/client/lease.py (3)
  • serve_unix (237-239)
  • monitor (242-244)
  • connect (231-234)
packages/jumpstarter/jumpstarter/client/lease.py (3)
packages/jumpstarter/jumpstarter/common/condition.py (4)
  • condition_true (26-27)
  • condition_message (16-23)
  • condition_false (30-31)
  • condition_present_and_equal (6-13)
packages/jumpstarter/jumpstarter/client/exceptions.py (1)
  • LeaseError (4-5)
packages/jumpstarter/jumpstarter/client/grpc.py (1)
  • DeleteLease (383-389)
packages/jumpstarter-cli-common/jumpstarter_cli_common/exceptions.py (1)
packages/jumpstarter/jumpstarter/common/exceptions.py (3)
  • JumpstarterException (4-39)
  • get_config (32-33)
  • ConnectionError (42-45)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Redirect rules - jumpstarter-docs
  • GitHub Check: Header rules - jumpstarter-docs
  • GitHub Check: pytest-matrix (macos-15, 3.12)
  • GitHub Check: pytest-matrix (ubuntu-24.04, 3.13)
  • GitHub Check: pytest-matrix (ubuntu-24.04, 3.12)
  • GitHub Check: pytest-matrix (macos-15, 3.11)
  • GitHub Check: pytest-matrix (ubuntu-24.04, 3.11)
  • GitHub Check: build
  • GitHub Check: e2e
  • GitHub Check: Pages changed - jumpstarter-docs
🔇 Additional comments (11)
packages/jumpstarter-all/pyproject.toml (1)

33-34: Thanks for wiring the new driver into the bundle

The meta package now pulls in jumpstarter-driver-tmt, so consumers of jumpstarter-all automatically get the new functionality.

packages/jumpstarter-driver-tmt/jumpstarter_driver_tmt/client.py (1)

129-132: LGTM!

The addition of the --feeling-safe flag before --hard-reboot is correctly conditional on hard_reboot_cmd being truthy. This aligns with the updated test expectations.

packages/jumpstarter-driver-tmt/jumpstarter_driver_tmt/driver_test.py (1)

275-275: LGTM!

Test expectations correctly updated to include the new --feeling-safe flag before --hard-reboot across all provision and run command test cases. The updates are consistent and align with the implementation changes.

Also applies to: 288-288, 304-305, 318-318, 331-331, 356-356

packages/jumpstarter/jumpstarter/config/client_config_test.py (1)

217-218: LGTM!

Test expectations correctly updated to include the new leases.acquisition_timeout field with default value 7200 in the serialized YAML output. The updates are consistent across all three test cases.

Also applies to: 258-259, 295-296

packages/jumpstarter-cli-common/jumpstarter_cli_common/opt.py (1)

2-2: Pragmatic argv-based command detection for logging configuration.

The sys.argv-based detection to conditionally disable RichHandler for the "run" command is a reasonable workaround given that ctx/params are not available at callback time. The comment clearly documents the constraint.

Also applies to: 13-19

packages/jumpstarter/jumpstarter/client/lease.py (4)

12-19: LGTM!

The updated anyio imports include the necessary utilities (AsyncContextManagerMixin, CancelScope, fail_after, etc.) for the enhanced lease orchestration and timeout handling.


50-51: LGTM!

The new fields are well-designed:

  • acquisition_timeout makes the timeout configurable with a reasonable 300s default
  • exporter_name is correctly marked with init=False since it's populated during acquisition

69-69: LGTM!

The updated log message "Acquiring lease" is more accurate than "Created lease request" and better reflects the actual operation.


119-156: LGTM!

The reworked acquisition loop introduces several improvements:

  • Configurable timeout via self.acquisition_timeout with clear error messaging
  • Increased polling interval to 5 seconds reduces server load
  • Sets exporter_name on successful acquisition
  • Handles the "Invalid" condition explicitly
  • Clean conversion of TimeoutError to LeaseError
packages/jumpstarter-cli-common/jumpstarter_cli_common/exceptions.py (2)

16-16: Docstring update looks good.

Accurately reflects ExceptionGroup support in the async decorator.


58-66: Narrow expired token detection and emit to stderr
Bare matching on “expired” risks misclassifying certificate-or other expiry errors and prints to stdout. Restrict to both “token” and “expired” in the message and use stderr:

--- a/packages/jumpstarter-cli-common/jumpstarter_cli_common/exceptions.py:58
+++ b/packages/jumpstarter-cli-common/jumpstarter_cli_common/exceptions.py:58
 def _handle_connection_error_with_reauth(exc, login_func):
     """Handle ConnectionError with reauthentication logic."""
-    if "expired" in str(exc).lower():
-        click.echo(click.style("Token is expired, triggering re-authentication", fg="red"))
+    msg = str(exc).lower()
+    if "token" in msg and "expired" in msg:
+        click.secho("Token is expired, triggering re-authentication", fg="red", err=True)
     config = exc.get_config()
     login_func(config)
     raise ClickExceptionRed("Please try again now") from None

Please confirm that all token-expiry errors include both “token” and “expired”; if not, add a dedicated TokenExpired exception and key off its type.

When lease is not available we try for 300s to acquire it. Users can also interrupt this with Ctrl+C while waiting. Let's remove such lease requests since after termination the shell is gone and it leaves behind a mostly useless lease.
should also result in immediate error, there's no chance the lease will ever be satisfied
when trying to acquire a lease it'suseful to know the lease id.
Lease changes are not so frequent (and not that fast either), let's poll only once in 5s to save bandwidth a bit
it's useful to know where the user is connecting to instead of just "remote". Especially for lease selectors that can be satisfied by multiple exporters.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/jumpstarter-cli-common/jumpstarter_cli_common/exceptions.py (1)

162-165: Bug: types.TracebackType(...) called with keyword args — raises TypeError at runtime

types.TracebackType is positional‑only; using keyword arguments will fail (“takes no keyword arguments”). Switch to positional arguments.

-    for frame, lasti, lineno in reversed(frames):
-        new_tb = types.TracebackType(tb_next=new_tb, tb_frame=frame, tb_lasti=lasti, tb_lineno=lineno)
+    for frame, lasti, lineno in reversed(frames):
+        new_tb = types.TracebackType(new_tb, frame, lasti, lineno)
♻️ Duplicate comments (1)
packages/jumpstarter-cli-common/jumpstarter_cli_common/exceptions.py (1)

58-67: Make re‑auth robust if login_func is async; also guard config availability

If callers pass an async relogin function, login_func(config) won’t run. Detect and run coroutine functions/objects (e.g., via asyncio.run(...)), and handle config=None gracefully.

Proposed minimal patch:

+import inspect
+import asyncio
@@
 def _handle_connection_error_with_reauth(exc, login_func):
@@
-        config = exc.get_config()
-        login_func(config)
+        config = getattr(exc, "get_config", lambda: None)()  # may be None
+        result = login_func(config) if config is not None else login_func()
+        # If the login func is async or returned a coroutine, run it
+        if inspect.iscoroutinefunction(login_func):
+            asyncio.run(login_func(config) if config is not None else login_func())
+        elif inspect.iscoroutine(result):
+            asyncio.run(result)
         raise ClickExceptionRed("Please try again now") from None

To confirm whether any async login_func is passed today, please run:

#!/bin/bash
rg -nP -C2 'handle_exceptions_with_reauthentication\s*\('
rg -nP -C2 'async\s+def\s+(relogin|login)\w*\(|def\s+(relogin|login)\w*\('
🧹 Nitpick comments (7)
packages/jumpstarter/jumpstarter/client/lease.py (1)

119-156: Well-structured acquisition with comprehensive condition handling.

The timeout enforcement via fail_after and the polling loop correctly handle all lease states (Ready, Unsatisfiable, Invalid, Released). The 5-second polling interval is appropriate for reducing server load. Converting TimeoutError to LeaseError with a clear message aligns with user expectations.

Minor spacing nit: Line 138 has an extra space after the opening parenthesis in logger.debug(.

Apply this diff to fix the spacing:

-                        logger.debug( "Lease %s is invalid: %s", self.name, message)
+                        logger.debug("Lease %s is invalid: %s", self.name, message)
packages/jumpstarter-cli-common/jumpstarter_cli_common/exceptions.py (4)

22-30: Consider using except* or aggregating multiple leaves

Looping leaves raises on the first handled leaf, potentially masking other errors in the group. Two options:

  • Use except* JumpstarterException / except* click.ClickException to let Python split groups automatically.
  • Or aggregate handled leaf messages and present a concise summary before re‑raising the group when unhandled.

69-79: Unreachable return True; clarify contract

For handled cases you always raise; the “return True if handled” path is unreachable. Either (a) remove boolean returns and document that the function raises on handled exceptions, or (b) change the control flow to actually return without raising.

Minimal clean‑up:

 def _handle_single_exception_with_reauth(exc, login_func):
-    """Handle a single exception and return True if handled."""
+    """Handle a single exception; raises on handled cases, returns False otherwise."""
     if isinstance(exc, ConnectionError):
         _handle_connection_error_with_reauth(exc, login_func)
-        return True
     elif isinstance(exc, JumpstarterException):
         raise ClickExceptionRed(str(exc)) from None
     elif isinstance(exc, click.ClickException):
         raise exc from None
     return False  # Not handled

81-88: This handler never returns; type and comment it as such and drop dead return

Make non‑returning intent explicit and remove the dead path.

+from typing import NoReturn
@@
-def _handle_exception_group_with_reauth(eg, login_func):
-    """Handle exceptions wrapped in BaseExceptionGroup."""
-    for exc in leaf_exceptions(eg):
-        if _handle_single_exception_with_reauth(exc, login_func):
-            return  # Exception was handled and re-raised
-    # If no handled exceptions, re-raise the original group
-    raise eg
+def _handle_exception_group_with_reauth(eg, login_func) -> NoReturn:
+    """Handle exceptions wrapped in BaseExceptionGroup.
+    Note: this function never returns; it always raises."""
+    for exc in leaf_exceptions(eg):
+        # Will raise if handled; returns False otherwise
+        _handle_single_exception_with_reauth(exc, login_func)
+    raise eg  # nothing was handled

98-101: Non‑returning helpers: add a brief note so readers know control flow stops here

Since both helpers raise on handled cases, add a short inline comment or typing.NoReturn annotation to avoid confusion for future readers and type checkers.

packages/jumpstarter-cli/jumpstarter_cli/shell.py (2)

50-54: Drop the ExceptionGroup special-casing (now likely dead code).

With timeout handling centralized in lease acquisition (raising LeaseError), peeling out TimeoutError from BaseExceptionGroup is no longer needed and slightly obscures error flow. Let exceptions bubble to the decorator uniformly.

Apply this diff:

-            except BaseExceptionGroup as eg:
-                for exc in eg.exceptions:
-                    if isinstance(exc, TimeoutError):
-                        raise exc from None
-                raise

18-34: Minor: reduce duplication around launching the shell.

Small readability tweak to avoid duplicate returns.

 def _run_shell_with_lease(lease, exporter_logs, config, command):
     """Run shell with lease context managers."""
     def launch_remote_shell(path: str) -> int:
         return launch_shell(
             path, lease.exporter_name, config.drivers.allow, config.drivers.unsafe,
             config.shell.use_profiles, command=command
         )
 
     with lease.serve_unix() as path:
         with lease.monitor():
-            if exporter_logs:
-                with lease.connect() as client:
-                    with client.log_stream():
-                        return launch_remote_shell(path)
-            else:
-                return launch_remote_shell(path)
+            if exporter_logs:
+                with lease.connect() as client, client.log_stream():
+                    exit_code = launch_remote_shell(path)
+            else:
+                exit_code = launch_remote_shell(path)
+            return exit_code
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5e50735 and 93174e1.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (6)
  • packages/jumpstarter-cli-common/jumpstarter_cli_common/exceptions.py (2 hunks)
  • packages/jumpstarter-cli/jumpstarter_cli/shell.py (3 hunks)
  • packages/jumpstarter/jumpstarter/client/lease.py (4 hunks)
  • packages/jumpstarter/jumpstarter/common/utils.py (1 hunks)
  • packages/jumpstarter/jumpstarter/config/client.py (3 hunks)
  • packages/jumpstarter/jumpstarter/config/client_config_test.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • packages/jumpstarter/jumpstarter/config/client.py
  • packages/jumpstarter/jumpstarter/config/client_config_test.py
  • packages/jumpstarter/jumpstarter/common/utils.py
🧰 Additional context used
🧬 Code graph analysis (3)
packages/jumpstarter/jumpstarter/client/lease.py (3)
packages/jumpstarter/jumpstarter/common/condition.py (4)
  • condition_true (26-27)
  • condition_message (16-23)
  • condition_false (30-31)
  • condition_present_and_equal (6-13)
packages/jumpstarter/jumpstarter/client/exceptions.py (1)
  • LeaseError (4-5)
packages/jumpstarter/jumpstarter/client/grpc.py (1)
  • DeleteLease (383-389)
packages/jumpstarter-cli-common/jumpstarter_cli_common/exceptions.py (1)
packages/jumpstarter/jumpstarter/common/exceptions.py (3)
  • JumpstarterException (4-39)
  • get_config (32-33)
  • ConnectionError (42-45)
packages/jumpstarter-cli/jumpstarter_cli/shell.py (4)
packages/jumpstarter-cli-common/jumpstarter_cli_common/signal.py (1)
  • signal_handler (9-22)
packages/jumpstarter/jumpstarter/config/client.py (2)
  • lease (135-143)
  • lease_async (247-284)
packages/jumpstarter/jumpstarter/common/utils.py (1)
  • launch_shell (49-127)
packages/jumpstarter/jumpstarter/client/lease.py (3)
  • serve_unix (237-239)
  • monitor (242-244)
  • connect (231-234)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Redirect rules - jumpstarter-docs
  • GitHub Check: Header rules - jumpstarter-docs
  • GitHub Check: Pages changed - jumpstarter-docs
  • GitHub Check: pytest-matrix (macos-15, 3.11)
  • GitHub Check: pytest-matrix (ubuntu-24.04, 3.11)
  • GitHub Check: pytest-matrix (ubuntu-24.04, 3.12)
  • GitHub Check: pytest-matrix (ubuntu-24.04, 3.13)
  • GitHub Check: pytest-matrix (macos-15, 3.12)
  • GitHub Check: pytest-matrix (macos-15, 3.13)
  • GitHub Check: e2e
🔇 Additional comments (7)
packages/jumpstarter/jumpstarter/client/lease.py (4)

12-19: LGTM! Clean anyio imports for structured concurrency.

The imported primitives (CancelScope, fail_after, sleep) are used correctly throughout the file for timeout handling and cleanup shielding.


50-51: LGTM! Well-documented fields with sensible defaults.

The acquisition_timeout default (300s) aligns with the PR objectives, and exporter_name correctly uses init=False since it's populated during acquisition at line 127.


69-69: LGTM! More accurate log message.

"Acquiring lease" better reflects the state—the lease request is created but not yet ready. This aligns well with the subsequent polling loop.


164-174: Excellent fix! Cleanup now guards against missing lease name and shields from cancellation.

The addition of self.name check at line 164 directly addresses the past review concern—preventing DeleteLease(name=None) calls and misleading logs. The CancelScope(shield=True) pattern is critical for the PR objective, ensuring lease cleanup completes even when interrupted (e.g., Ctrl+C), and the 30-second timeout with warning provides robustness against hung cleanup.

packages/jumpstarter-cli-common/jumpstarter_cli_common/exceptions.py (1)

16-16: Docstring update reads well

Clearer scope for async ExceptionGroup handling. LGTM.

packages/jumpstarter-cli/jumpstarter_cli/shell.py (2)

36-61: Cancellation flow and lease cleanup look solid.

Signal-driven cancellation via task group scope, catching the AnyIO cancellation class and mapping to exit code 2, should release the lease cleanly on SIGINT/SIGTERM. LGTM.


36-61: No plain TimeoutError catches in CLI code
Search across packages/jumpstarter-cli found no direct TimeoutError handlers; remaining catches in lease.py, grpc.py, and driver/base.py reside in their respective layers and wrap timeouts as intended.

@michalskrivanek
Copy link
Contributor Author

addressed most nitpicks, let's see now....

@mangelajo mangelajo self-requested a review October 1, 2025 12:06
@mangelajo
Copy link
Member

I will test the new changes asap

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/jumpstarter/jumpstarter/config/client.py (1)

56-61: Attach config on expired-token errors using a case‑insensitive check.

This decorator sets the config only if the exact phrase “token is expired” matches. Make it case‑insensitive and robust so reauth consistently works downstream.

-        except ConnectionError as e:
-            if "token is expired" in str(e):
+        except ConnectionError as e:
+            if "expired" in str(e).lower():
                 # args[0] should be self for instance methods
                 e.set_config(args[0])
             raise e
🧹 Nitpick comments (4)
packages/jumpstarter-cli-common/jumpstarter_cli_common/exceptions.py (1)

151-165: Building TracebackType objects is implementation‑specific; add a safe fallback.

Directly instantiating types.TracebackType can be fragile across Python implementations. Recommend guarding construction and falling back to the original tb chain if creation fails.

-    # Add frames from tb1 to the beginning (in reverse order)
-    for frame, lasti, lineno in reversed(frames):
-        new_tb = types.TracebackType(tb_next=new_tb, tb_frame=frame, tb_lasti=lasti, tb_lineno=lineno)
+    # Add frames from tb1 to the beginning (in reverse order)
+    for frame, lasti, lineno in reversed(frames):
+        try:
+            new_tb = types.TracebackType(tb_next=new_tb, tb_frame=frame, tb_lasti=lasti, tb_lineno=lineno)
+        except Exception:
+            # Fallback: if constructing a synthetic traceback is not supported, return tb2 as-is
+            return tb2
packages/jumpstarter/jumpstarter/config/client.py (1)

89-97: Enable env overrides for leases config (consistent with drivers.*).

Drivers has JMP_DRIVERS_ env prefix; consider the same for leases so users can do JMP_LEASES_ACQUISITION_TIMEOUT=… without editing YAML.

-class ClientConfigV1Alpha1Lease(BaseSettings):
+class ClientConfigV1Alpha1Lease(BaseSettings):
+    model_config = SettingsConfigDict(env_prefix="JMP_LEASES_")
packages/jumpstarter-cli/jumpstarter_cli/shell.py (1)

47-50: Make the worker thread cancellable.

Ensure Ctrl+C (task-group cancel) promptly aborts the blocking shell runner.

-                        exit_code = await anyio.to_thread.run_sync(
-                            _run_shell_with_lease, lease, exporter_logs, config, command
-                        )
+                        exit_code = await anyio.to_thread.run_sync(
+                            _run_shell_with_lease, lease, exporter_logs, config, command,
+                            cancellable=True
+                        )
packages/jumpstarter/jumpstarter/client/lease.py (1)

138-138: Minor formatting: extra space after logger.debug(.

There's an extra space between logger.debug( and the opening quote.

Apply this diff to fix the formatting:

-                        logger.debug( "Lease %s is invalid: %s", self.name, message)
+                        logger.debug("Lease %s is invalid: %s", self.name, message)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 93174e1 and 32d728b.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (6)
  • packages/jumpstarter-cli-common/jumpstarter_cli_common/exceptions.py (4 hunks)
  • packages/jumpstarter-cli/jumpstarter_cli/shell.py (3 hunks)
  • packages/jumpstarter/jumpstarter/client/lease.py (4 hunks)
  • packages/jumpstarter/jumpstarter/common/utils.py (1 hunks)
  • packages/jumpstarter/jumpstarter/config/client.py (3 hunks)
  • packages/jumpstarter/jumpstarter/config/client_config_test.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • packages/jumpstarter/jumpstarter/common/utils.py
  • packages/jumpstarter/jumpstarter/config/client_config_test.py
🧰 Additional context used
🧬 Code graph analysis (2)
packages/jumpstarter-cli-common/jumpstarter_cli_common/exceptions.py (1)
packages/jumpstarter/jumpstarter/common/exceptions.py (3)
  • JumpstarterException (4-39)
  • get_config (32-33)
  • ConnectionError (42-45)
packages/jumpstarter/jumpstarter/client/lease.py (3)
packages/jumpstarter/jumpstarter/common/condition.py (4)
  • condition_true (26-27)
  • condition_message (16-23)
  • condition_false (30-31)
  • condition_present_and_equal (6-13)
packages/jumpstarter/jumpstarter/client/exceptions.py (1)
  • LeaseError (4-5)
packages/jumpstarter/jumpstarter/client/grpc.py (1)
  • DeleteLease (383-389)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: build
  • GitHub Check: pytest-matrix (macos-15, 3.11)
  • GitHub Check: pytest-matrix (ubuntu-24.04, 3.13)
  • GitHub Check: pytest-matrix (ubuntu-24.04, 3.12)
  • GitHub Check: pytest-matrix (macos-15, 3.13)
  • GitHub Check: pytest-matrix (ubuntu-24.04, 3.11)
  • GitHub Check: e2e
  • GitHub Check: Redirect rules - jumpstarter-docs
  • GitHub Check: Header rules - jumpstarter-docs
  • GitHub Check: Pages changed - jumpstarter-docs
🔇 Additional comments (10)
packages/jumpstarter-cli-common/jumpstarter_cli_common/exceptions.py (1)

23-31: Async EG handling looks solid.

Flattening EG leaves and mapping Jumpstarter/Click exceptions to CLI errors is correct; re-raising the original group preserves non-handled failures.

packages/jumpstarter-cli/jumpstarter_cli/shell.py (4)

18-34: Lease-run orchestration is clean.

The serve_unix/monitor/log_stream layering is tidy and keeps the lease context scoped around the shell execution.


41-60: Signal-driven cancellation path LGTM.

Using get_cancelled_exc_class with a task group + signal handler is correct; the final cancel ensures teardown.


90-92: Entrypoint integration looks good.

anyio.run wrapper cleanly isolates async flow; exit_code propagation is clear.


50-55: TimeoutError propagation: no plain TimeoutError escapes
Search confirms the only plain TimeoutError in the CLI package is in this snippet and all upstream timeouts are converted to custom exceptions.

packages/jumpstarter/jumpstarter/client/lease.py (5)

12-19: LGTM! Clean anyio imports for structured concurrency.

The imports correctly bring in the necessary anyio primitives for timeout enforcement (fail_after), cancellation shielding (CancelScope), and async sleep. This aligns with anyio's structured concurrency model.


50-51: LGTM! Well-structured field additions.

The acquisition_timeout field with a 5-minute default provides sensible behavior for lease polling, and exporter_name correctly uses init=False since it's populated during the acquisition phase.


69-69: LGTM! Clearer progress messaging.

The updated log message "Acquiring lease" better reflects the active acquisition process compared to the previous "Created lease request" wording.


119-156: LGTM! Robust acquisition logic with proper timeout handling.

The refactored _acquire() method correctly:

  • Enforces the configured timeout using fail_after()
  • Polls at a reasonable 5-second interval to reduce server load
  • Handles all lease condition states (Ready, Unsatisfiable, Invalid, Pending, Released)
  • Converts timeout to a user-friendly LeaseError with clear messaging
  • Populates exporter_name on successful acquisition

The use of from None to suppress exception chaining is appropriate here since the timeout is being translated into a domain-specific exception.


164-174: LGTM! Robust cleanup with proper cancellation shielding.

The cleanup logic correctly:

  • Guards deletion by checking self.name exists (addressing the previous review concern)
  • Uses CancelScope(shield=True) to ensure cleanup completes even if the parent scope is cancelled
  • Enforces a 30-second timeout on cleanup to prevent hanging
  • Logs a warning (rather than raising) if cleanup times out, which is appropriate for best-effort cleanup

This follows anyio structured concurrency best practices for resource cleanup.

Comment on lines +59 to +67
def _handle_connection_error_with_reauth(exc, login_func):
"""Handle ConnectionError with reauthentication logic."""
if "expired" in str(exc).lower():
click.echo(click.style("Token is expired, triggering re-authentication", fg="red"))
config = exc.get_config()
login_func(config)
raise ClickExceptionRed("Please try again now") from None
else:
raise ClickExceptionRed(str(exc)) from None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Reauth can fail if config wasn’t attached; harden and align token check.

If get_config() returns None (e.g., upstream didn’t tag the exception), login_func(config) may fail. Also, this block checks "expired" case-insensitively, while the upstream decorator checks "token is expired" case-sensitively, risking missed tagging.

Suggested changes:

  • Gracefully handle missing config (fallback message).
  • Keep the case-insensitive check consistent across layers.
 def _handle_connection_error_with_reauth(exc, login_func):
-    if "expired" in str(exc).lower():
+    if "expired" in str(exc).lower():
         click.echo(click.style("Token is expired, triggering re-authentication", fg="red"))
-        config = exc.get_config()
-        login_func(config)
+        config = exc.get_config()
+        if config is None:
+            # No config attached; instruct user to login explicitly.
+            raise ClickExceptionRed("Authentication expired. Please run 'jmp login' and retry.") from None
+        login_func(config)
         raise ClickExceptionRed("Please try again now") from None
     else:
         raise ClickExceptionRed(str(exc)) from None

Pair with the client-side change below to ensure config is always attached.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def _handle_connection_error_with_reauth(exc, login_func):
"""Handle ConnectionError with reauthentication logic."""
if "expired" in str(exc).lower():
click.echo(click.style("Token is expired, triggering re-authentication", fg="red"))
config = exc.get_config()
login_func(config)
raise ClickExceptionRed("Please try again now") from None
else:
raise ClickExceptionRed(str(exc)) from None
def _handle_connection_error_with_reauth(exc, login_func):
"""Handle ConnectionError with reauthentication logic."""
if "expired" in str(exc).lower():
click.echo(click.style("Token is expired, triggering re-authentication", fg="red"))
config = exc.get_config()
if config is None:
# No config attached; instruct user to login explicitly.
raise ClickExceptionRed("Authentication expired. Please run 'jmp login' and retry.") from None
login_func(config)
raise ClickExceptionRed("Please try again now") from None
else:
raise ClickExceptionRed(str(exc)) from None
🤖 Prompt for AI Agents
In packages/jumpstarter-cli-common/jumpstarter_cli_common/exceptions.py around
lines 59 to 67, the reauth path assumes exc.get_config() always returns a config
and uses a loose "expired" check; update it to check for the upstream message
case-insensitively by testing if "token is expired" appears in str(exc).lower(),
then call exc.get_config() and if config is None avoid calling
login_func(config) directly — instead print a clear fallback error message and
raise ClickExceptionRed asking the user to reauth manually; if config exists,
invoke login_func(config) inside a try/except and on any failure raise
ClickExceptionRed with the caught error message so reauth failures are surfaced
cleanly.

Comment on lines 271 to 274
tls_config=self.tls,
grpc_options=self.grpcOptions,
acquisition_timeout=self.leases.acquisition_timeout,
) as lease:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Confirm Lease(...) supports acquisition_timeout.

Wiring looks right; verify the Lease constructor accepts acquisition_timeout and that units (seconds) match its API.


🏁 Script executed:

#!/bin/bash
# Verify Lease.__init__ signature and usages for acquisition_timeout (seconds)
rg -nC2 -P 'class\s+Lease\b' packages
rg -nC3 -P '__init__\s*\(.*acquisition_timeout' packages
rg -nC2 -P 'acquisition_timeout\s*=' packages

Length of output: 1187


Fix parameter mismatch: Lease.__init__ only accepts channel and duration, not acquisition_timeout. Either pass the timeout as duration=self.leases.acquisition_timeout (wrapped in a timedelta) or extend the Lease constructor to accept an acquisition_timeout parameter.

🤖 Prompt for AI Agents
In packages/jumpstarter/jumpstarter/config/client.py around lines 271 to 274,
the code passes acquisition_timeout to Lease.__init__ but Lease only accepts
channel and duration; replace the acquisition_timeout kwarg by passing
duration=timedelta(seconds=self.leases.acquisition_timeout) (or appropriate
units) when constructing Lease, or alternatively modify the Lease class
signature to accept acquisition_timeout and map it internally to duration;
ensure imports include datetime.timedelta if using the timedelta approach and
remove the unsupported acquisition_timeout argument from the call.

increase the default time we want for lease to become availble to 2h.
Add new client config for timeout on lease acquisition:
leases:
  acquisition_timeout: 7200

Also print a clear error message when timeout happens.
@mangelajo
Copy link
Member

Verified including config changes.

@mangelajo mangelajo merged commit 9155fe6 into jumpstarter-dev:main Oct 1, 2025
18 checks passed
@jumpstarter-backport-bot
Copy link

Successfully created backport PR for release-0.7:

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

When a shell is killed while waiting for a lease (CTRL+C) the lease remains jmp shell / pytest lease handling

2 participants