Conversation

@VanderChen VanderChen commented Nov 17, 2025

What type of PR is this?
/kind documentation
/kind feature

What this PR does / why we need it:
This PR proposes PicoD (Pico Daemon) - a lightweight, HTTP-based service daemon that provides essential sandbox capabilities with minimal overhead while maintaining security through token-based authentication.

Which issue(s) this PR fixes:

Fix #41

@volcano-sh-bot volcano-sh-bot added kind/documentation Improvements or additions to documentation kind/feature labels Nov 17, 2025
@volcano-sh-bot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@volcano-sh-bot
Contributor

Welcome @VanderChen! It looks like this is your first PR to volcano-sh/agentcube 🎉

@gemini-code-assist
Contributor

Summary of Changes

Hello @VanderChen, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces the comprehensive design for PicoD (Pico Daemon), a new lightweight, HTTP-based service daemon intended to replace the existing SSH-based sandbox management in AgentCube. The design focuses on providing core sandbox functionalities like code execution and file transfer through a gRPC interface, emphasizing minimal overhead, stateless operation, and robust token-based authentication. It details the architecture, security considerations, and integration with the Python SDK, aiming to improve efficiency and customization for AI agent interactions within sandboxed environments.

Highlights

  • Introduction of PicoD: This pull request introduces the design for PicoD (Pico Daemon), a new lightweight, HTTP-based (gRPC) service daemon intended to replace the existing SSH-based sandbox management in AgentCube.
  • Core Functionality: PicoD provides essential sandbox capabilities, including remote code execution via an 'Execute' RPC, and efficient file transfer mechanisms through 'WriteFile' (upload) and 'ReadFile' (download) gRPC streaming RPCs.
  • Stateless Token-Based Authentication: The design emphasizes a stateless, token-based authentication mechanism, with detailed options for securely provisioning tokens in various environments such as Kubernetes Secrets, Cloud-Init, environment variables, and instance metadata services.
  • Detailed Architecture and Design: The PR includes a comprehensive design document outlining PicoD's system architecture, gRPC server layer (using ConnectRPC), service breakdown, and security considerations, drawing inspiration from E2B's 'envd'.
  • Python SDK Integration: A PicoDClient class is defined within the Python SDK interface, providing a straightforward way for AI agents to interact with PicoD for command execution and file operations within sandboxed environments.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a comprehensive design document for PicoD, a new gRPC-based daemon for sandbox management. The design is well-structured, detailed, and covers key aspects like architecture, authentication, and API definitions. I've provided a few comments to address minor inconsistencies, clarify some implementation details in the SDK, and define a missing data type in the protobuf specification. Overall, this is an excellent design proposal.


**FilesystemService**
- `ReadFile(path) → stream bytes`: Download file content (replaces `download_file()`)
- `WriteFile(path, stream bytes) → void`: Upload file content (replaces `write_file()` and `upload_file()`)
Contributor

medium

There's an inconsistency in the documented return type for the WriteFile RPC. Here it is listed as returning void, but the protobuf definition on line 349 specifies it returns WriteFileResponse. For clarity and consistency, please update this line to match the protobuf definition.

Suggested change
- `WriteFile(path, stream bytes) → void`: Upload file content (replaces `write_file()` and `upload_file()`)
- `WriteFile(path, stream bytes) → WriteFileResponse`: Upload file content (replaces `write_file()` and `upload_file()`)

}

message WriteFileResponse {
EntryInfo entry = 1;
Contributor

medium

The WriteFileResponse message references a type EntryInfo, which is also used in ReadFileMetadata on line 385. However, the EntryInfo message is not defined in the protobuf specifications. Please add the definition for EntryInfo to clarify what file details (e.g., path, size, permissions, timestamps) are returned.
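
To make this request concrete, here is a minimal sketch of the kind of fields an `EntryInfo` could carry, written as a Python dataclass populated from `os.stat`. The field names (`path`, `size`, `mode`, `modified_time`, `is_dir`) are assumptions chosen to match the examples listed in this comment; they are not taken from the design document or its protobuf files.

```python
import os
import stat
from dataclasses import dataclass


@dataclass
class EntryInfo:
    """Hypothetical file metadata record; field names are illustrative only."""

    path: str             # absolute path of the entry
    size: int             # size in bytes
    mode: str             # permission string, e.g. "-rw-r--r--"
    modified_time: float  # last modification time, seconds since the epoch
    is_dir: bool          # True when the entry is a directory


def entry_info_for(path: str) -> EntryInfo:
    """Build an EntryInfo for a path on the local filesystem."""
    st = os.stat(path)
    return EntryInfo(
        path=os.path.abspath(path),
        size=st.st_size,
        mode=stat.filemode(st.st_mode),
        modified_time=st.st_mtime,
        is_dir=stat.S_ISDIR(st.st_mode),
    )
```

Returning a record like this from `WriteFile` would let a client confirm the written size without issuing a second request, which may be why the response references it.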

The Python SDK provides a simple interface for interacting with PicoD:

```python
class PicoDClient:
Contributor

medium

The PicoDClient code snippet uses type hints like List, Dict, and Optional (in methods on lines 406, 412, 415, etc.), but the corresponding imports from the typing module are missing. For correctness and to provide a complete example for implementers, please add from typing import Dict, List, Optional at the beginning of this code block.

Comment on lines 406 to 407
def execute_commands(self, commands: List[str]) -> Dict[str, str]:
"""Execute multiple commands"""
Contributor

medium

The Python SDK proposes an execute_commands method for batch execution. However, the gRPC service definition only includes a unary Execute RPC for a single command. To avoid ambiguity, please clarify if execute_commands is a client-side convenience function that calls Execute in a loop, or if a more efficient batch/streaming RPC is planned for this functionality. If it's a client-side loop, it might be worth noting the potential performance implications for a large number of commands.
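
If `execute_commands` turns out to be a client-side convenience, it could look roughly like the sketch below. The `execute_one` callable is a hypothetical stand-in for whatever issues the single-command `Execute` call; only the `execute_commands` signature comes from the quoted snippet.

```python
from typing import Callable, Dict, List


def execute_commands(
    commands: List[str],
    execute_one: Callable[[str], str],
) -> Dict[str, str]:
    """Run each command through a single-command call, one round trip each.

    `execute_one` stands in for the unary Execute RPC (hypothetical here).
    """
    results: Dict[str, str] = {}
    for command in commands:
        # N commands cost N sequential round trips. Because the result is
        # keyed by the command string, a repeated command overwrites the
        # earlier entry.
        results[command] = execute_one(command)
    return results


if __name__ == "__main__":
    def fake_execute(cmd: str) -> str:
        return f"ran: {cmd}"

    print(execute_commands(["pip install numpy", "python main.py"], fake_execute))
```

The loop makes the cost visible: latency grows linearly with the number of commands, and the `Dict[str, str]` return type cannot represent the same command run twice.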


### Design Goals

PicoD is designed as a **stateless daemon** that processes individual gRPC requests independently:
Member

In L12, you said it is HTTP-based, but here it refers to gRPC requests.

from agentcube import Sandbox

# Create a sandbox instance
sandbox = Sandbox(ttl=3600, image="python:3.11-slim")
Member

This PR, https://github.com/volcano-sh/agentcube/pull/19/files, abstracts a codeInterpreterClient class.


PicoD checks token sources in the following order:
1. `--access-token` command-line flag (highest priority)
2. `PICOD_ACCESS_TOKEN` environment variable
Member

We cannot use an environment variable to inject the token.


message ExecuteRequest {
string command = 1; // Full command string to execute
optional float timeout = 2; // Execution timeout in seconds (default: 30)
Member

We need some other attributes, such as setting environment variables.

}

message WriteFileResponse {
EntryInfo entry = 1;
Member

What does EntryInfo look like?

@hzxuzhonghu
Member

cc @YaoZengzeng

#### System Architecture

```mermaid
graph TB
Member

Should the agentcube-apiserver be shown in the architecture, even though it only performs transport forwarding?

#### Authentication Flow

```mermaid
sequenceDiagram
Member

ditto


## Motivation

The current AgentCube sandbox implementation relies on SSH (via `ssh_client.py`) for remote code execution, file transfer, and sandbox management. While SSH provides robust authentication and encryption, it introduces several challenges:
Member

No sandbox management, I think.


When PicoD runs inside a sandbox container, the access token must be securely provided at startup. Several options are available depending on the deployment environment:

##### Option 1: Kubernetes Secret Mount (Recommended for K8s)
Member

Currently, the SDK generates a private/public key pair and forwards the public key to the sandbox pod, which then uses the public key to authenticate SSH connections. Is this an option?

2. `PICOD_ACCESS_TOKEN` environment variable
3. `PICOD_ACCESS_TOKEN_FILE` environment variable (reads from file)
4. `/etc/picod/token` default file location
5. Instance metadata service (if configured)
Member

So we will support all four authentication methods mentioned above?
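
For reference while discussing which sources to keep, the lookup order quoted above can be sketched as follows. The flag name, environment variable names, and default file path come from the quoted list; the function names are hypothetical, and falling through to the next source when a token file is missing or empty is an assumption. PicoD is described as a Go service, so this Python version only illustrates the precedence, and it leaves out the instance metadata service.

```python
import os
from typing import Mapping, Optional

DEFAULT_TOKEN_FILE = "/etc/picod/token"


def _read_token_file(path: str) -> Optional[str]:
    """Return the stripped file content, or None if unreadable or empty."""
    try:
        with open(path, "r", encoding="utf-8") as fh:
            token = fh.read().strip()
    except OSError:
        return None
    return token or None


def resolve_token(
    flag_value: Optional[str],
    env: Mapping[str, str],
    default_file: str = DEFAULT_TOKEN_FILE,
) -> Optional[str]:
    """Pick the access token using the documented precedence."""
    # 1. --access-token command-line flag (highest priority)
    if flag_value:
        return flag_value
    # 2. PICOD_ACCESS_TOKEN environment variable
    if env.get("PICOD_ACCESS_TOKEN"):
        return env["PICOD_ACCESS_TOKEN"]
    # 3. File named by PICOD_ACCESS_TOKEN_FILE (assumed to fall through on failure)
    token_file = env.get("PICOD_ACCESS_TOKEN_FILE")
    if token_file:
        token = _read_token_file(token_file)
        if token:
            return token
    # 4. Default file location
    return _read_token_file(default_file)


if __name__ == "__main__":
    print(resolve_token(None, os.environ) is not None)
```

Dropping a source, such as the plain environment variable, would amount to deleting one branch, so the order of the remaining sources would not need to change.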


## Conclusion

PicoD provides a lightweight, efficient alternative to SSH for sandbox management in AgentCube. By leveraging modern HTTP/gRPC protocols and token-based authentication, it reduces resource overhead while maintaining security and functionality. The design ensures easy integration with existing AgentCube infrastructure and provides a clear migration path from the current SSH-based implementation.
Member

Suggested change
PicoD provides a lightweight, efficient alternative to SSH for sandbox management in AgentCube. By leveraging modern HTTP/gRPC protocols and token-based authentication, it reduces resource overhead while maintaining security and functionality. The design ensures easy integration with existing AgentCube infrastructure and provides a clear migration path from the current SSH-based implementation.
PicoD provides a lightweight, efficient alternative to SSH for sandbox access in AgentCube. By leveraging modern HTTP/gRPC protocols and token-based authentication, it reduces resource overhead while maintaining security and functionality. The design ensures easy integration with existing AgentCube infrastructure and provides a clear migration path from the current SSH-based implementation.

- **Secure**: Token-based authentication, eliminating the need for preconfigured users or SSH keys.
- **No Lifecycle Management**: Sandbox lifecycle (creation, deletion, monitoring) remains the responsibility of the AgentCube control plane. PicoD focuses solely on request handling.
- **Single-Request Processing**: Each API call (Execute, ReadFile, WriteFile) is handled independently, without shared state.
- **No Session Management**: No persistent connections or session tracking; every request is authenticated via metadata.

What exactly is this metadata?

**File Access Control**

- Path sanitization prevents directory traversal
- Restricted to sandbox workspace only

What specific technique does PicoD use to restrict the workspace?
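
The design document does not say which technique it uses. One common approach, shown here only as an illustration, is to resolve the requested path (following `..` and symlinks) and reject anything that lands outside the workspace root. The function name and the choice to treat absolute paths as workspace-relative are assumptions.

```python
from pathlib import Path


def resolve_in_workspace(workspace: str, requested: str) -> Path:
    """Resolve `requested` under `workspace`, rejecting escapes.

    Illustrative only; not taken from the PicoD design.
    """
    root = Path(workspace).resolve()
    # Strip leading separators so an absolute path such as "/etc/passwd"
    # is interpreted relative to the workspace instead of replacing it.
    candidate = (root / requested.lstrip("/\\")).resolve()
    # resolve() collapses ".." and follows symlinks, so a simple ancestry
    # check is enough to detect an escape.
    if candidate != root and root not in candidate.parents:
        raise PermissionError(f"path escapes workspace: {requested!r}")
    return candidate


if __name__ == "__main__":
    print(resolve_in_workspace("/tmp", "data/input.csv"))
```

A check like this is still subject to races if a symlink is swapped between the check and the file access, so it is usually combined with isolation at the container or mount level rather than relied on alone.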

**Upload File**:

- **Endpoint**: `POST /api/files`
- **Option 1: Multipart Form Data** (recommended for binary files)

Why is the multipart form data recommended?
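
One likely reason, which the design document would need to confirm, is encoding overhead: Base64 turns every 3 bytes into 4 characters, so a binary file embedded in JSON grows by about a third, while multipart form data carries the raw bytes. The sketch below measures that growth; the JSON field names `path` and `content` are assumptions.

```python
import base64
import json


def json_base64_body_size(raw: bytes, path: str = "/workspace/data.bin") -> int:
    """Size in bytes of a JSON upload body that embeds `raw` as Base64.

    The field names are hypothetical.
    """
    body = json.dumps(
        {"path": path, "content": base64.b64encode(raw).decode("ascii")}
    )
    return len(body.encode("utf-8"))


def base64_growth(raw: bytes) -> float:
    """Ratio of Base64-encoded length to raw length."""
    return len(base64.b64encode(raw)) / len(raw)


if __name__ == "__main__":
    payload = bytes(3_000_000)
    print(f"raw: {len(payload)} bytes")
    print(f"JSON + Base64 body: {json_base64_body_size(payload)} bytes")
    print(f"growth factor: {base64_growth(payload):.3f}")
```

Beyond size, the JSON variant typically requires the whole encoded string to be held in memory before decoding, whereas a multipart body can be streamed to disk.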


```

- **Option 2: JSON with Base64** (for text files or API convenience)

Does it handle the creation of parent directories automatically during request if not exist?
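
The two behaviors this question distinguishes can be sketched as below. Whether PicoD creates parent directories is not stated in the quoted text, so the `create_parents` switch and the function name are hypothetical.

```python
from pathlib import Path


def write_file(path: str, content: bytes, create_parents: bool = True) -> int:
    """Write `content` to `path` and return the number of bytes written.

    With create_parents=True, missing parent directories are created first;
    with create_parents=False, a missing parent raises FileNotFoundError.
    """
    target = Path(path)
    if create_parents:
        # exist_ok=True makes repeated uploads into the same directory safe.
        target.parent.mkdir(parents=True, exist_ok=True)
    return target.write_bytes(content)
```

Whichever behavior is chosen, documenting it in the upload endpoint's contract would let SDK authors decide whether they need a separate directory-creation call.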


#### 1. HTTP Server Layer (Go Implementation)

- **Framework**: Gin (lightweight HTTP web framework)

What were the specific factors to choose Go and the Gin over Python(Flask) or other frameworks?

@@ -0,0 +1,652 @@
# PicoD Design Document
Member

Since we have come to an agreement, is this evald now?

- **No Lifecycle Management**: Sandbox lifecycle (creation, deletion, monitoring) remains the responsibility of the AgentCube control plane. PicoD focuses solely on request handling.
- **Single-Request Processing**: Each API call (Execute, ReadFile, WriteFile) is handled independently, without shared state.
- **No Session Management**: No persistent connections or session tracking; every request is authenticated via metadata.
- **Ephemeral Operation**: PicoD runs only for the lifetime of the sandbox container and does not track lifecycle events.
Member

Suggested change
- **Ephemeral Operation**: PicoD runs only for the lifetime of the sandbox container and does not track lifecycle events.
- **Ephemeral Operation**: PicoD runs only during the lifetime of the sandbox container and does not track sandbox lifecycle events.


### Machine Learning Workflow

An AI agent performs a complete machine learning workflow - uploading data, installing dependencies, training a model, and downloading results:
Member

This use case does not seem very appropriate; training is not our primary use case.


### Core API Endpoints

1. **POST /api/execute** - Execute commands
Member

So does this mean you won't provide a separate run-code interface, and that it works the way we currently implement run_code, by calling execute_command in the SDK?


1. **POST /api/execute** - Execute commands
2. **POST /api/files** - Upload files
3. **GET /api/files/{path}** - Download files
Member

Should we use a query parameter to specify complex file paths? For example: api/files?file={path}
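
A small sketch of the query-parameter variant suggested here, using only the standard library. The `/api/files` route and the `file` parameter name follow this comment; the helper names are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def file_url_with_query(path: str) -> str:
    """Build a download URL that carries the file path as a query parameter."""
    # urlencode percent-encodes "/", "#", "?" and non-ASCII characters,
    # so the path cannot be confused with URL structure.
    return "/api/files?" + urlencode({"file": path})


def path_from_url(url: str) -> str:
    """Recover the original file path from a URL built above."""
    return parse_qs(urlsplit(url).query)["file"][0]


if __name__ == "__main__":
    tricky = "/workspace/my data/résumé #1.txt"
    url = file_url_with_query(tricky)
    print(url)
    print(path_from_url(url) == tricky)
```

With the `{path}` form, the slashes in the file path either become extra URL segments or must be sent as `%2F`, which some proxies and servers reject or normalize; the query form avoids that ambiguity.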

### Core API Endpoints

1. **POST /api/execute** - Execute commands
2. **POST /api/files** - Upload files
Member

Are there any details on how to set the destination file path and how the file content is sent?

I think this should be HTTP multipart.

HTTPServer[HTTP Server<br/>Port: 9527]
AuthMiddleware[Auth Middleware]
LogMiddleware[Logging Middleware]
ErrorMiddleware[Error Handler]
Member

Error handler?

@VanderChen VanderChen force-pushed the picod-design branch 2 times, most recently from 088725e to 18cdd20 on November 25, 2025, 07:35
Signed-off-by: VanderChen <vanderchen@outlook.com>
Comment on lines +113 to +116
1. **POST /tools/code-interpreter/execute** - Execute commands
2. **POST /tools/code-interpreter/files** - Upload files
3. **GET /tools/code-interpreter/files/{path}** - Download files
4. **GET /tools/code-interpreter/health** - Health check endpoint
Member

We should simplify by removing code-interpreter, maybe:
/v1/execute
/v1/files

Comment on lines +201 to +202
- Request: JSON with command, timeout, env vars
- Response: JSON with stdout, stderr, exit_code
Member

Give a JSON example.
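
As a starting point for such an example, the sketch below builds a request and response pair from the field names in the quoted lines (`command`, `timeout`, `env` and `stdout`, `stderr`, `exit_code`). The exact JSON shape, the handler name, and running the command through a local shell are assumptions; the 30-second default mirrors the comment on `ExecuteRequest.timeout` quoted earlier in this thread.

```python
import json
import os
import subprocess
from typing import Any, Dict


def handle_execute(request: Dict[str, Any]) -> Dict[str, Any]:
    """Run the requested command locally and shape the result as a response.

    Illustrative only: timeouts surface as subprocess.TimeoutExpired here
    rather than as a structured error.
    """
    # Request-supplied variables are layered over the current environment.
    env = {**os.environ, **request.get("env", {})}
    completed = subprocess.run(
        request["command"],
        shell=True,  # the command is a full command string
        capture_output=True,
        text=True,
        timeout=request.get("timeout", 30),
        env=env,
    )
    return {
        "stdout": completed.stdout,
        "stderr": completed.stderr,
        "exit_code": completed.returncode,
    }


if __name__ == "__main__":
    request = {"command": "echo hello", "timeout": 5, "env": {"GREETING": "hi"}}
    print(json.dumps(request, indent=2))
    print(json.dumps(handle_execute(request), indent=2))
```

Printing both dictionaries with `json.dumps` gives concrete request and response bodies that could be pasted into the document once the field names are settled.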

```json
{
"auth_type": "token|keypair",
"token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
Member

What is this token? If this is the default token used by PicoD to authenticate agent-gateway, we should pass it in a header.
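
If the token does move into a header, as suggested here, the server-side check could look roughly like this sketch. The `Authorization: Bearer` scheme and both function names are assumptions; the thread does not specify the header name or format.

```python
import hmac
from typing import Mapping, Optional


def extract_bearer_token(headers: Mapping[str, str]) -> Optional[str]:
    """Return the token from an 'Authorization: Bearer <token>' header.

    Assumes the mapping uses the exact key "Authorization"; real HTTP
    frameworks usually provide case-insensitive header lookup.
    """
    value = headers.get("Authorization", "")
    scheme, _, token = value.partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        return None
    return token


def is_authorized(headers: Mapping[str, str], expected_token: str) -> bool:
    """Check the presented token against the expected one."""
    token = extract_bearer_token(headers)
    if token is None:
        return False
    # compare_digest avoids leaking how many leading bytes matched.
    return hmac.compare_digest(token.encode("utf-8"), expected_token.encode("utf-8"))
```

Keeping the credential in a header rather than in the JSON body also keeps it out of request payload logs and lets the auth middleware run before the body is parsed.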

{
"auth_type": "token|keypair",
"token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
"public_key": "-----BEGIN PUBLIC KEY-----\n...",
Member

Is this used by PicoD to authenticate the client?

Labels

kind/documentation Improvements or additions to documentation kind/feature size/XL

Development

Successfully merging this pull request may close these issues.

Enhance code intepreter

5 participants