feat: Implement graceful shutdown for tool and sidecar containers #1729
base: main
Conversation
- Add cleanup() method to ToolExecutionTracker to close Redis connection
- Add cleanup() class method to LogPublisher to close Kombu and Redis connections
- Update LogProcessor signal handler to set shutdown flag
- Add shutdown flag check in monitor_logs() loop to exit gracefully
- Add cleanup() method to LogProcessor and call it in finally block
- Update ToolEntrypoint signal handler to raise SystemExit with proper exit code

Both containers now properly handle SIGTERM/SIGINT signals from Docker/Kubernetes, update the tool execution status, and clean up Redis connections before exiting.
Summary by CodeRabbit
Walkthrough
These changes implement graceful shutdown handling across multiple components. A global shutdown signal mechanism is introduced in LogProcessor that monitors for termination requests and triggers cleanup in LogPublisher and ToolExecutionTracker before exit. The signal handler now exits with a signal-based status code.
Sequence Diagram

```mermaid
sequenceDiagram
participant OS as Operating System
participant Handler as Signal Handler
participant LP as LogProcessor
participant Tracker as ToolExecutionTracker
participant Pub as LogPublisher
participant Redis as Redis/Kombu
OS->>Handler: Signal received (SIGTERM/SIGINT)
Handler->>LP: Set _shutdown_requested = True
Handler->>Handler: raise SystemExit(128 + signum)
rect rgba(200, 150, 255, 0.2)
Note over LP: monitor_logs() detects shutdown
LP->>LP: Check _shutdown_requested
LP->>Tracker: Update status to FAILED<br/>Error: "Process terminated by signal"
end
rect rgba(100, 200, 255, 0.2)
Note over LP: cleanup() invoked (try/finally)
LP->>Tracker: cleanup()
Tracker->>Redis: Close connection
Redis-->>Tracker: Closed
LP->>Pub: cleanup()
Pub->>Redis: Release Kombu connection
Pub->>Redis: Close Redis client
Redis-->>Pub: Closed
end
LP->>LP: Exit monitor loop
Handler->>OS: Exit with signal code
```
Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks and finishing touches
✅ Passed checks (3 passed)
✨ Finishing touches
Test Results
Summary
Runner Tests - Full Report
SDK1 Tests - Full Report
Actionable comments posted: 0
🧹 Nitpick comments (3)
unstract/core/src/unstract/core/pubsub_helper.py (1)
193-207: Cleanup implementation looks good, but the checks are always truthy; consider `logging.exception`.

The `if cls.kombu_conn` and `if cls.r` checks on lines 197 and 203 are always truthy, since these are class-level attributes initialized unconditionally at class definition (lines 23-29). The checks won't protect against uninitialized connections.

Also, per the static analysis hints, consider using `logging.exception` instead of `logging.error` to include the stack trace for easier debugging.

🔎 Suggested improvement
```diff
     @classmethod
     def cleanup(cls) -> None:
         """Close publisher connections for graceful shutdown."""
         try:
-            if cls.kombu_conn:
-                cls.kombu_conn.release()
-                logging.debug("LogPublisher Kombu connection released")
+            cls.kombu_conn.release()
+            logging.debug("LogPublisher Kombu connection released")
         except Exception as e:
-            logging.error(f"Failed to close Kombu connection: {e}")
+            logging.exception(f"Failed to close Kombu connection: {e}")
         try:
-            if cls.r:
-                cls.r.close()
-                logging.debug("LogPublisher Redis connection closed")
+            cls.r.close()
+            logging.debug("LogPublisher Redis connection closed")
         except Exception as e:
-            logging.error(f"Failed to close Redis connection: {e}")
+            logging.exception(f"Failed to close Redis connection: {e}")
```

unstract/core/src/unstract/core/tool_execution_status.py (1)
218-225: Cleanup logic is correct; minor suggestions for consistency.

The `if self.redis_client` check on line 221 is always truthy, since `redis_client` is initialized unconditionally in `__init__` (line 66). Consider using `logger.exception` instead of `logger.error` to include the stack trace for debugging, consistent with the static analysis hint.

🔎 Suggested improvement
```diff
     def cleanup(self) -> None:
         """Close Redis connection for graceful shutdown."""
         try:
-            if self.redis_client:
-                self.redis_client.close()
-                logger.debug("ToolExecutionTracker Redis connection closed")
+            self.redis_client.close()
+            logger.debug("ToolExecutionTracker Redis connection closed")
         except Exception as e:
-            logger.error(f"Failed to close Redis connection: {e}")
+            logger.exception(f"Failed to close Redis connection: {e}")
```

tool-sidecar/src/unstract/tool_sidecar/log_processor.py (1)
267-277: Cleanup method structure is solid; consider `logger.exception` for stack traces.

Wrapping each cleanup call in its own try/except is good design, ensuring one failure doesn't prevent the other from executing. Per the static analysis hints, consider using `logger.exception` to include stack traces for debugging.

🔎 Suggested improvement
```diff
     def cleanup(self) -> None:
         """Clean up resources on shutdown."""
         logger.info("Cleaning up resources...")
         try:
             self.tool_execution_tracker.cleanup()
         except Exception as e:
-            logger.error(f"Failed to cleanup ToolExecutionTracker: {e}")
+            logger.exception(f"Failed to cleanup ToolExecutionTracker: {e}")
         try:
             LogPublisher.cleanup()
         except Exception as e:
-            logger.error(f"Failed to cleanup LogPublisher: {e}")
+            logger.exception(f"Failed to cleanup LogPublisher: {e}")
```
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to Reviews > Disable Cache setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (4)
- tool-sidecar/src/unstract/tool_sidecar/log_processor.py
- unstract/core/src/unstract/core/pubsub_helper.py
- unstract/core/src/unstract/core/tool_execution_status.py
- unstract/sdk1/src/unstract/sdk1/tool/entrypoint.py
🧰 Additional context used
🧬 Code graph analysis (1)
unstract/core/src/unstract/core/pubsub_helper.py (2)
tool-sidecar/src/unstract/tool_sidecar/log_processor.py (1)
- cleanup (267-277)
unstract/core/src/unstract/core/tool_execution_status.py (1)
- cleanup (218-225)
🪛 Ruff (0.14.10)
unstract/core/src/unstract/core/pubsub_helper.py
200-200: Do not catch blind exception: Exception
(BLE001)
201-201: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
206-206: Do not catch blind exception: Exception
(BLE001)
207-207: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
tool-sidecar/src/unstract/tool_sidecar/log_processor.py
272-272: Do not catch blind exception: Exception
(BLE001)
273-273: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
276-276: Do not catch blind exception: Exception
(BLE001)
277-277: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
unstract/core/src/unstract/core/tool_execution_status.py
224-224: Do not catch blind exception: Exception
(BLE001)
225-225: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (4)
unstract/sdk1/src/unstract/sdk1/tool/entrypoint.py (1)
15-23: LGTM!

The signal handler correctly implements the Unix convention for signal-based exit codes (128 + signal number). Raising `SystemExit` allows any `finally` blocks in the call stack to execute, enabling proper cleanup.
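For readers unfamiliar with the convention, here is a minimal, self-contained sketch of the pattern described above, not the actual entrypoint.py code; the handler name and the placeholder tool body are illustrative only.

```python
import signal


def _handle_signal(signum, frame):
    """Translate a termination signal into SystemExit so cleanup code still runs."""
    # Unix convention: a process killed by signal N reports exit code 128 + N.
    raise SystemExit(128 + signum)


# Register the handler for the signals Docker/Kubernetes send on shutdown.
signal.signal(signal.SIGTERM, _handle_signal)
signal.signal(signal.SIGINT, _handle_signal)

try:
    pass  # ... run the tool here ...
finally:
    # Still executes: SystemExit unwinds the stack like any other exception.
    print("cleaning up before exit")
```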
tool-sidecar/src/unstract/tool_sidecar/log_processor.py (3)

31-42: Signal handler and class-level flag look good.

The pattern of using a class-level flag for signal communication is appropriate for this single-process sidecar. Since Python's signal handlers run in the main thread, the flag access is safe without additional synchronization.
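A minimal sketch of that class-level-flag pattern, with simplified names rather than the project's actual LogProcessor implementation:

```python
import signal


class Processor:
    """Sketch: a class-level flag flipped from the signal handler."""

    _shutdown_requested = False  # shared by all instances; read by the monitor loop

    @classmethod
    def _signal_handler(cls, signum, frame):
        # Python runs signal handlers in the main thread, so a plain
        # boolean assignment needs no extra locking here.
        cls._shutdown_requested = True


# Typically registered once at process startup.
signal.signal(signal.SIGTERM, Processor._signal_handler)
signal.signal(signal.SIGINT, Processor._signal_handler)
```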
235-242: Graceful shutdown handling in the monitor loop is well implemented.

The shutdown check placed at the start of each loop iteration ensures a timely response to termination signals. Updating the execution status to `FAILED` with a descriptive error message provides good observability for operators investigating terminated jobs.
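For illustration, a simplified loop with that check-first structure; the tracker class and its update call are placeholders, and the real status API may differ.

```python
import time

shutdown_requested = False  # would be set to True by a SIGTERM/SIGINT handler


class FakeTracker:
    """Placeholder for the execution tracker used in this sketch."""

    def update_status(self, status, error):
        print(f"status={status}, error={error}")


def monitor_logs(tracker):
    while True:
        # Check the flag before doing any work so a pending termination
        # request is honored without waiting on the next batch of logs.
        if shutdown_requested:
            tracker.update_status("FAILED", "Process terminated by signal")
            break
        # ... read and publish the next chunk of tool output ...
        time.sleep(0.5)
```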
344-347: LGTM!

The `try/finally` pattern ensures `cleanup()` is called regardless of how `monitor_logs()` exits, whether through normal completion, an exception, or a signal-triggered break. This is the correct approach for resource cleanup.
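The shape of that guarantee, as a sketch with illustrative names only:

```python
class SidecarProcessor:
    """Stand-in for the sidecar processor used in this sketch."""

    def cleanup(self):
        print("closing Redis/Kombu connections")


def monitor(processor):
    pass  # placeholder for the monitoring loop


def main():
    processor = SidecarProcessor()
    try:
        monitor(processor)  # may return normally, raise, or be cut short by SystemExit
    finally:
        # Runs on every exit path, including the signal-triggered one.
        processor.cleanup()


if __name__ == "__main__":
    main()
```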



Summary
Implements proper graceful shutdown handling for both the tool container and sidecar container when running in Docker/Kubernetes.
Problem
Both signal handlers in `entrypoint.py` and `log_processor.py` only logged signals but didn't actually stop execution or clean up resources. When Docker/Kubernetes sends SIGTERM, the containers continue running until they are forcibly killed.
Changes
1. ToolExecutionTracker (`unstract/core/src/unstract/core/tool_execution_status.py`)
   - Added a `cleanup()` method to close the Redis connection
2. LogPublisher (`unstract/core/src/unstract/core/pubsub_helper.py`)
   - Added a `cleanup()` class method to close the Kombu and Redis connections
3. LogProcessor (`tool-sidecar/src/unstract/tool_sidecar/log_processor.py`)
   - Added a `_shutdown_requested` class-level flag set by the signal handler
   - `monitor_logs()` checks the flag and exits gracefully, updating the tool execution status to FAILED
   - Added a `cleanup()` method to close all connections
   - `main()` now calls `processor.cleanup()` in a `finally` block
4. ToolEntrypoint (`unstract/sdk1/src/unstract/sdk1/tool/entrypoint.py`)
   - Signal handler now raises `SystemExit(128 + signum)` for a proper exit with the standard signal exit code
Why Both Handlers Are Needed
Both signal handlers are required because they run in separate containers within the same Kubernetes pod:
- Tool container (`entrypoint.py`): runs the actual tool code
- Sidecar container (`log_processor.py`): monitors tool logs and publishes to Redis

When Kubernetes terminates a pod, both containers receive SIGTERM independently.
Testing
- Send `docker kill --signal=SIGTERM` to the containers and verify graceful shutdown
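One possible way to script that check from Python, using only the standard library; the entrypoint path is hypothetical, and the expected exit code of 143 assumes the 128 + signum convention adopted in this PR.

```python
import signal
import subprocess
import time

# Hypothetical script that installs the SIGTERM handler; replace with the
# actual tool or sidecar invocation.
proc = subprocess.Popen(["python", "my_tool_entrypoint.py"])

time.sleep(2)                  # give the process time to register its handlers
proc.send_signal(signal.SIGTERM)
code = proc.wait(timeout=30)   # graceful shutdown should finish well before a SIGKILL

# SystemExit(128 + SIGTERM) -> exit code 143 instead of a hard kill.
assert code == 143, f"expected 143, got {code}"
print("graceful shutdown verified")
```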