implement code interpreter sandbox warmpool #45

YaoZengzeng · 2025-11-28T08:49:55Z

fixes: #30

Signed-off-by: YaoZengzeng <yaozengzeng@huawei.com>

volcano-sh-bot · 2025-11-28T08:50:02Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

gemini-code-assist · 2025-11-28T08:50:31Z

Summary of Changes

Hello @YaoZengzeng, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly extends the system's capabilities by introducing a dedicated CodeInterpreter custom resource. This resource provides a higher-level abstraction for defining and managing secure, isolated code execution environments, specifically tailored for LLM-generated code. It integrates with the existing agent-sandbox framework to provision and maintain sandboxes, including the ability to pre-warm instances for improved performance. The changes also include the necessary infrastructure for API definition, code generation, and controller logic to make this new resource fully functional.

Highlights

New CodeInterpreter CRD: Introduced a new Custom Resource Definition (CRD) called CodeInterpreter under the runtime.agentcube.io/v1alpha1 API group. This CRD allows defining and managing isolated code execution environments.
CodeInterpreter Controller: Implemented a new controller, CodeInterpreterReconciler, responsible for reconciling CodeInterpreter resources. This controller creates and manages SandboxTemplate and SandboxWarmPool resources from the agent-sandbox project based on the CodeInterpreter's specification.
Warm Pool Management: The CodeInterpreter now supports specifying a warmPoolSize to maintain a pool of pre-warmed sandboxes, which helps reduce startup latency for new code interpretation sessions. The controller actively manages the corresponding SandboxWarmPool.
Automated Code Generation: Added a new hack/update-codegen.sh script and updated the Makefile to automate the generation of client-go code, deepcopy methods, and CRD manifests for the new CodeInterpreter API.
RBAC and API Server Updates: Updated the API server's main function to register the new runtimev1alpha1 scheme and integrated the CodeInterpreterReconciler. Corresponding RBAC rules and API server permissions were added to support the new resources.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request introduces a new CodeInterpreter CRD and its associated controller. This is a significant feature addition that allows defining and managing code interpreter runtimes, including warm pools for faster startup. The changes include new API types, a controller, code generation scripts, and Kubernetes manifests.

Overall, the implementation follows standard practices for Kubernetes controllers. The use of controller-runtime and code generation is appropriate. The new CodeInterpreter CRD is well-defined.

I've identified a few areas for improvement:

There are some redundant or unnecessary files (crds/_.yaml, crds/codeinterpreter.yaml, k8s/codeinterpreter-rbac.yaml) that should be cleaned up.
The CRD definition has a minor issue with Required and default markers that could cause confusion.
The controller logic has opportunities for improvement in terms of correctness (status reporting, template comparison) and efficiency (listing resources).

My detailed comments provide specific suggestions for these points. Addressing them will make the implementation more robust and maintainable.

gemini-code-assist · 2025-11-28T08:53:10Z

pkg/apis/runtime/v1alpha1/codeinterpreter_types.go

+	// +kubebuilder:validation:Required
+	// +kubebuilder:default="15m"


The SessionTimeout field is marked with +kubebuilder:validation:Required but also has a +kubebuilder:default value. This combination can lead to confusing behavior, as client-side validation (like kubectl apply) may fail if this field is omitted, even though the API server would apply a default. To make this field truly optional with a default value, the Required validation marker should be removed.

Suggested change

// +kubebuilder:validation:Required

// +kubebuilder:default="15m"

// +kubebuilder:default="15m"

gemini-code-assist · 2025-11-28T08:53:11Z

pkg/apis/runtime/v1alpha1/codeinterpreter_types.go

+	// +kubebuilder:validation:Required
+	// +kubebuilder:default="8h"


Similar to SessionTimeout, the MaxSessionDuration field is marked as Required but also has a default. This can cause validation issues. The +kubebuilder:validation:Required marker should be removed to make it an optional field with a default value.

Suggested change

// +kubebuilder:validation:Required

// +kubebuilder:default="8h"

// +kubebuilder:default="8h"

gemini-code-assist · 2025-11-28T08:53:11Z

pkg/apiserver/codeinterpreter_controller.go

+func (r *CodeInterpreterReconciler) podTemplateEqual(a, b sandboxv1alpha1.PodTemplate) bool {
+	// Simple comparison - in production, you might want a more sophisticated comparison
+	if len(a.Spec.Containers) == 0 || len(b.Spec.Containers) == 0 {
+		return len(a.Spec.Containers) == len(b.Spec.Containers)
+	}
+	if a.Spec.Containers[0].Image != b.Spec.Containers[0].Image {
+		return false
+	}
+	if (a.Spec.RuntimeClassName == nil) != (b.Spec.RuntimeClassName == nil) {
+		return false
+	}
+	if a.Spec.RuntimeClassName != nil && b.Spec.RuntimeClassName != nil {
+		return *a.Spec.RuntimeClassName == *b.Spec.RuntimeClassName
+	}
+	return true
+}


The podTemplateEqual function performs a very basic comparison and, as the comment notes, is not robust. It only checks the container image and runtime class name. This will fail to detect changes to other important fields like command, args, environment variables, or resource requests/limits. This could lead to the SandboxTemplate not being updated when the CodeInterpreter spec changes. You should use a more comprehensive comparison, such as reflect.DeepEqual, to ensure all changes are detected. You will need to import the reflect package.

func (r *CodeInterpreterReconciler) podTemplateEqual(a, b sandboxv1alpha1.PodTemplate) bool { // Use reflect.DeepEqual for a comprehensive comparison. return reflect.DeepEqual(a.Spec, b.Spec) }

gemini-code-assist · 2025-11-28T08:53:11Z

crds/_.yaml

+---
+apiVersion: apiextensions.k8s.io/v1
+kind: CustomResourceDefinition
+metadata:
+  annotations:
+    controller-gen.kubebuilder.io/version: v0.17.2
+spec:
+  group: ""
+  names:
+    kind: ""
+    plural: ""
+  scope: ""
+  versions: null


This file appears to be an empty, placeholder Custom Resource Definition, likely generated by controller-gen by mistake. It should be removed to avoid clutter and confusion in the project's CRD directory.

gemini-code-assist · 2025-11-28T08:53:11Z

crds/codeinterpreter.yaml

+apiVersion: apiextensions.k8s.io/v1
+kind: CustomResourceDefinition
+metadata:
+  name: codeinterpreters.runtime.agentcube.io
+  labels:
+    app: agentcube
+spec:
+  group: runtime.agentcube.io
+  versions:
+  - name: v1alpha1
+    served: true
+    storage: true
+    schema:
+      openAPIV3Schema:
+        type: object
+        properties:
+          spec:
+            type: object
+            properties:
+              ports:
+                type: array
+                items:
+                  type: object
+                  properties:
+                    pathPrefix:
+                      type: string
+                    name:
+                      type: string
+                    port:
+                      type: integer
+                      format: int32
+                    protocol:
+                      type: string
+                      enum:
+                      - HTTP
+                      - HTTPS
+                      default: HTTP
+                  required:
+                  - port
+                  - protocol
+              template:
+                type: object
+                required:
+                - image
+                properties:
+                  labels:
+                    type: object
+                    additionalProperties:
+                      type: string
+                  annotations:
+                    type: object
+                    additionalProperties:
+                      type: string
+                  runtimeClassName:
+                    type: string
+                  image:
+                    type: string
+                  environment:
+                    type: array
+                    items:
+                      type: object
+                      properties:
+                        name:
+                          type: string
+                        value:
+                          type: string
+                        valueFrom:
+                          type: object
+                      required:
+                      - name
+                  command:
+                    type: array
+                    items:
+                      type: string
+                  args:
+                    type: array
+                    items:
+                      type: string
+                  resources:
+                    type: object
+                    properties:
+                      limits:
+                        type: object
+                        additionalProperties:
+                          type: string
+                      requests:
+                        type: object
+                        additionalProperties:
+                          type: string
+              sessionTimeout:
+                type: string
+                default: "15m"
+              maxSessionDuration:
+                type: string
+                default: "8h"
+              languages:
+                type: array
+                items:
+                  type: string
+              warmPoolSize:
+                type: integer
+                format: int32
+            required:
+            - template
+          status:
+            type: object
+            properties:
+              conditions:
+                type: array
+                items:
+                  type: object
+                  properties:
+                    type:
+                      type: string
+                    status:
+                      type: string
+                    reason:
+                      type: string
+                    message:
+                      type: string
+                    lastTransitionTime:
+                      type: string
+                      format: date-time
+                    observedGeneration:
+                      type: integer
+                      format: int64
+              ready:
+                type: boolean
+              activeSessions:
+                type: integer
+                format: int32
+              warmPoolReady:
+                type: integer
+                format: int32
+    subresources:
+      status: {}
+  scope: Namespaced
+  names:
+    plural: codeinterpreters
+    singular: codeinterpreter
+    kind: CodeInterpreter
+    shortNames:
+    - ci
+


This CRD definition for CodeInterpreter seems to be a duplicate of crds/runtime.agentcube.io_codeinterpreters.yaml. The other file is more detailed, includes descriptions, and seems to be the one generated by controller-gen. To avoid confusion and maintain a single source of truth for the CRD, this file should be removed in favor of the generated one.

gemini-code-assist · 2025-11-28T08:53:11Z

k8s/codeinterpreter-rbac.yaml

+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRole
+metadata:
+  name: codeinterpreter-controller
+rules:
+  - apiGroups: ["runtime.agentcube.io"]
+    resources: ["codeinterpreters"]
+    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
+  - apiGroups: ["runtime.agentcube.io"]
+    resources: ["codeinterpreters/status"]
+    verbs: ["get", "update", "patch"]
+  - apiGroups: ["runtime.agentcube.io"]
+    resources: ["codeinterpreters/finalizers"]
+    verbs: ["update"]
+  - apiGroups: ["agents.x-k8s.io"]
+    resources: ["sandboxes"]
+    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRoleBinding
+metadata:
+  name: codeinterpreter-controller
+roleRef:
+  apiGroup: rbac.authorization.k8s.io
+  kind: ClusterRole
+  name: codeinterpreter-controller
+subjects:
+  - kind: ServiceAccount
+    name: agentcube-apiserver
+    namespace: agentcube
+


The ClusterRole and ClusterRoleBinding defined in this file appear to be redundant. The permissions required by the code interpreter controller are already included in the agentcube-apiserver ClusterRole within k8s/agentcube-apiserver.yaml. To simplify RBAC management and avoid duplication, this file should be removed.

gemini-code-assist · 2025-11-28T08:53:11Z

pkg/apiserver/codeinterpreter_controller.go

+	// Update status
+	ci.Status.ActiveSessions = activeSessions
+	ci.Status.WarmPoolReady = warmPoolReady
+	ci.Status.Ready = true


The Ready status is unconditionally set to true. This might not accurately reflect the state of the CodeInterpreter. For instance, if a warm pool is configured (WarmPoolSize > 0), the CodeInterpreter should probably only be considered ready when the warm pool has the desired number of ready sandboxes. Consider making this logic more robust by checking the status of dependent resources.

gemini-code-assist · 2025-11-28T08:53:11Z

pkg/apiserver/codeinterpreter_controller.go

+func (r *CodeInterpreterReconciler) countActiveSandboxes(ctx context.Context, ci *runtimev1alpha1.CodeInterpreter) (int32, error) {
+	sandboxList := &sandboxv1alpha1.SandboxList{}
+	if err := r.List(ctx, sandboxList, client.InNamespace(ci.Namespace)); err != nil {
+		return 0, err
+	}
+
+	count := int32(0)
+	for _, sandbox := range sandboxList.Items {
+		// Check if sandbox is using this code interpreter runtime
+		// This is determined by labels or annotations
+		if sandbox.Labels != nil {
+			if runtimeName, ok := sandbox.Labels["codeinterpreter.runtime.agentcube.io/name"]; ok {
+				if runtimeName == ci.Name {
+					// Check if sandbox is running by checking Ready condition
+					for _, condition := range sandbox.Status.Conditions {
+						if condition.Type == string(sandboxv1alpha1.SandboxConditionReady) && condition.Status == metav1.ConditionTrue {
+							count++
+							break
+						}
+					}
+				}
+			}
+		}
+	}
+
+	return count, nil
+}


Listing all sandboxes in the namespace and then filtering them on the client side is inefficient, especially in a cluster with many sandboxes. You can improve performance by using a label selector in the List call. This also simplifies the function body.

func (r *CodeInterpreterReconciler) countActiveSandboxes(ctx context.Context, ci *runtimev1alpha1.CodeInterpreter) (int32, error) { sandboxList := &sandboxv1alpha1.SandboxList{} listOpts := []client.ListOption{ client.InNamespace(ci.Namespace), client.MatchingLabels(map[string]string{ "codeinterpreter.runtime.agentcube.io/name": ci.Name, }), } if err := r.List(ctx, sandboxList, listOpts...); err != nil { return 0, err } var count int32 for _, sandbox := range sandboxList.Items { // Check if sandbox is running by checking Ready condition for _, condition := range sandbox.Status.Conditions { if condition.Type == string(sandboxv1alpha1.SandboxConditionReady) && condition.Status == metav1.ConditionTrue { count++ break } } } return count, nil }

Signed-off-by: YaoZengzeng <yaozengzeng@huawei.com>

YaoZengzeng · 2025-11-29T02:40:29Z

@hzxuzhonghu ptal

hzxuzhonghu

Blocking issue should be api group

runtime.agentcube.volcano.sh

And in order to integrate with workload manager's sandbox manager module, you should also provide a cache to support template or warmpool query

hzxuzhonghu · 2025-11-29T07:43:28Z

cmd/apiserver/main.go

+
+	// Setup CodeInterpreter controller
+	if err := ctrl.NewControllerManagedBy(mgr).
+		For(&runtimev1alpha1.CodeInterpreter{}).


This should be in the workload manager later

hzxuzhonghu · 2025-11-29T07:53:05Z

pkg/apis/runtime/v1alpha1/codeinterpreter_types.go

 	// routers / UIs select a compatible runtime.
 	// +optional
-	Languages []string `json:"languages,omitempty" protobuf:"bytes,6,rep,name=languages"`
+	Languages []string `json:"languages,omitempty"`


We can remove this first. I am not thinking well about this

hzxuzhonghu · 2025-11-29T07:54:21Z

pkg/apis/runtime/v1alpha1/codeinterpreter_types.go

+
+	// Ready indicates whether the CodeInterpreter is ready to serve requests
+	// +optional
+	Ready bool `json:"ready,omitempty"`


Seems hard to set set it. When we have multi managers

hzxuzhonghu · 2025-11-29T07:55:19Z

pkg/apis/runtime/v1alpha1/codeinterpreter_types.go

+
+	// ActiveSessions is the number of active sandbox sessions using this runtime
+	// +optional
+	ActiveSessions int32 `json:"activeSessions,omitempty"`


Remove first, this may cause too frequently k8s update

hzxuzhonghu · 2025-11-29T07:55:29Z

pkg/apis/runtime/v1alpha1/codeinterpreter_types.go

+
+	// WarmPoolReady is the number of sandboxes ready in the warm pool
+	// +optional
+	WarmPoolReady int32 `json:"warmPoolReady,omitempty"`


hzxuzhonghu · 2025-11-29T07:57:19Z

pkg/apiserver/codeinterpreter_controller.go

+	}
+
+	// Ensure SandboxTemplate exists
+	if err := r.ensureSandboxTemplate(ctx, codeInterpreter); err != nil {


This is not a must if nbo warmpool required

hzxuzhonghu · 2025-11-29T08:06:17Z

pkg/apiserver/codeinterpreter_controller.go

+	}
+
+	// Requeue periodically to check status
+	return ctrl.Result{RequeueAfter: 1 * time.Minute}, nil


This is not a good way? Can we also watch warmpool, i am also concerned without watching, you still need get List some resources from kube-apiserver

Signed-off-by: YaoZengzeng <yaozengzeng@huawei.com>

YaoZengzeng · 2025-11-29T09:28:31Z

@hzxuzhonghu updated

YaoZengzeng added 2 commits November 28, 2025 15:56

framework for code interpreter

df70624

Signed-off-by: YaoZengzeng <yaozengzeng@huawei.com>

runnable

8d37bda

Signed-off-by: YaoZengzeng <yaozengzeng@huawei.com>

volcano-sh-bot added the do-not-merge/work-in-progress label Nov 28, 2025

volcano-sh-bot added the size/XXL label Nov 28, 2025

gemini-code-assist bot reviewed Nov 28, 2025

View reviewed changes

YaoZengzeng added 4 commits November 29, 2025 10:12

cleanup

a9f8a37

Signed-off-by: YaoZengzeng <yaozengzeng@huawei.com>

fix generate

cfde1b2

Signed-off-by: YaoZengzeng <yaozengzeng@huawei.com>

more cleanup

8faaa45

Signed-off-by: YaoZengzeng <yaozengzeng@huawei.com>

fix name

eaf90f7

Signed-off-by: YaoZengzeng <yaozengzeng@huawei.com>

YaoZengzeng changed the title ~~WIP: implement code interpreter and sandbox warmpool~~ implement code interpreter and sandbox warmpool Nov 29, 2025

volcano-sh-bot removed the do-not-merge/work-in-progress label Nov 29, 2025

hzxuzhonghu changed the title ~~implement code interpreter and sandbox warmpool~~ implement code interpreter sandbox warmpool Nov 29, 2025

hzxuzhonghu reviewed Nov 29, 2025

View reviewed changes

YaoZengzeng added 2 commits November 29, 2025 17:17

fix comments

f76adb4

Signed-off-by: YaoZengzeng <yaozengzeng@huawei.com>

Get method for code interpreter

992bb4d

Signed-off-by: YaoZengzeng <yaozengzeng@huawei.com>

		// +kubebuilder:validation:Required
		// +kubebuilder:default="15m"

		// +kubebuilder:validation:Required
		// +kubebuilder:default="8h"

implement code interpreter sandbox warmpool #45

Are you sure you want to change the base?

implement code interpreter sandbox warmpool #45

Uh oh!

Conversation

YaoZengzeng commented Nov 28, 2025

Uh oh!

volcano-sh-bot commented Nov 28, 2025

Uh oh!

gemini-code-assist bot commented Nov 28, 2025

Summary of Changes

Highlights

Footnotes

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Nov 28, 2025

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Nov 28, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Nov 28, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Nov 28, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Nov 28, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Nov 28, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Nov 28, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Nov 28, 2025

Choose a reason for hiding this comment

Uh oh!

YaoZengzeng commented Nov 29, 2025

Uh oh!

hzxuzhonghu left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

hzxuzhonghu Nov 29, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

YaoZengzeng commented Nov 29, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

hzxuzhonghu Nov 29, 2025 •

edited

Loading