Skip to content

fix: add daemon-side model repackaging for Linux support#639

Open
doringeman wants to merge 1 commit intodocker:mainfrom
doringeman:fix-package
Open

fix: add daemon-side model repackaging for Linux support#639
doringeman wants to merge 1 commit intodocker:mainfrom
doringeman:fix-package

Conversation

@doringeman
Copy link
Contributor

When packaging a model with --from on Linux, the CLI cannot access the model store directly since it's in a Docker volume. This adds a /models/{name}/repackage endpoint that allows the daemon to perform config-only changes (like context-size) server-side.

For simple cases (--from with only --context-size), the CLI now calls the daemon's repackage API directly. For complex cases (adding licenses, pushing to registry), it falls back to exporting the model as a tarball.

Fixes #633

CC @ilopezluna This is a fairly quick fix. Eventually we can deduplicate some logic existing both in the CLI and inside these new handlers from the backend.


make docker-run
MODEL_RUNNER_HOST=http://localhost:8080 docker model pull smollm2
MODEL_RUNNER_HOST=http://localhost:8080 docker model package --from ai/smollm2 --context-size 1024 my-smollm2
MODEL_RUNNER_HOST=http://localhost:8080 docker model run my-smollm2 hi

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @doringeman, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a limitation where the CLI on Linux cannot directly interact with the model store located within a Docker volume. It introduces new daemon-side API endpoints for model repackaging and export, enabling the daemon to perform these operations. The CLI has been updated to leverage these new daemon capabilities, allowing for seamless model management on Linux by delegating certain tasks to the daemon, thereby improving compatibility and user experience.

Highlights

  • Daemon-side Model Repackaging: Introduced a new API endpoint (/models/{name}/repackage) on the daemon to allow server-side modification of model configurations, such as context-size, addressing limitations with CLI direct access on Linux.
  • Daemon-side Model Export: Added an API endpoint (/models/{name}/export) to allow the daemon to export a model as a tarball, which the CLI can then consume, facilitating model transfer without direct volume access.
  • Intelligent CLI Logic for Linux Compatibility: The CLI now intelligently uses these new daemon-side capabilities. For simple repackaging (e.g., only changing context-size), it directly calls the daemon's repackage API. For more complex operations (e.g., adding licenses, pushing to registry), it falls back to exporting the model from the daemon as a tarball and processing it locally.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog
  • cmd/cli/commands/package.go
    • Updated initializeBuilder to accept context.Context and *desktop.Client.
    • Implemented fetchModelFromDaemon to retrieve models from the daemon if not found locally, involving export and temporary local loading.
    • Added conditional logic in packageModel to use daemon-side repackaging for simple --context-size changes.
    • Modified the call to initializeBuilder to pass new arguments.
  • cmd/cli/desktop/desktop.go
    • Added ExportModel client method to interact with the daemon's new export endpoint.
    • Introduced RepackageOptions struct for daemon-side repackaging.
    • Added RepackageModel client method to interact with the daemon's new repackage endpoint.
  • pkg/distribution/distribution/client.go
    • Imported mutate package for model modification.
    • Added ExportModel to write a model from the store as a tarball.
    • Added RepackageOptions and RepackageModel to apply configuration changes (like ContextSize) to a model and save a lightweight version.
  • pkg/inference/models/http_handler.go
    • Refactored model GET routes to handle actions like "export" via handleModelGetAction.
    • Added handleExportModel to expose the daemon's model export functionality via HTTP.
    • Introduced handleRepackageModel to expose the daemon's model repackaging functionality via HTTP.
    • Updated handleModelAction to include the new repackage action.
  • pkg/inference/models/manager.go
    • Added Export method to delegate model export to the distribution client.
    • Added Repackage method to delegate model repackaging to the distribution client.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 3 issues, and left some high level feedback:

  • There are three separate RepackageOptions types (desktop client, distribution client, manager) with overlapping fields; consider consolidating to a shared type or clearly separating API vs internal variants to avoid future drift.
  • The new GET route handler handleModelGetAction means handleGetModel is no longer wired to any route in routeHandlers; if handleGetModel is now redundant, consider removing it or adding a comment about its remaining use cases to reduce confusion.
  • In ExportModel and RepackageModel on the distribution client, you’re using context.Background() and not propagating any caller context; if you expect long-running operations, consider threading a context.Context through so that exports/repackages can be cancelled.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- There are three separate `RepackageOptions` types (desktop client, distribution client, manager) with overlapping fields; consider consolidating to a shared type or clearly separating API vs internal variants to avoid future drift.
- The new GET route handler `handleModelGetAction` means `handleGetModel` is no longer wired to any route in `routeHandlers`; if `handleGetModel` is now redundant, consider removing it or adding a comment about its remaining use cases to reduce confusion.
- In `ExportModel` and `RepackageModel` on the distribution client, you’re using `context.Background()` and not propagating any caller context; if you expect long-running operations, consider threading a `context.Context` through so that exports/repackages can be cancelled.

## Individual Comments

### Comment 1
<location> `cmd/cli/commands/package.go:241-245` </location>
<code_context>
 		cmd.PrintErrf("Reading model from store: %q\n", opts.fromModel)

-		// Get the model from the local store
 		mdl, err := distClient.GetModel(opts.fromModel)
 		if err != nil {
-			return nil, fmt.Errorf("get model from store: %w", err)
+			cmd.PrintErrf("Model not found in local store, fetching from daemon...\n")
+
+			mdl, result.distClient, result.cleanupFunc, err = fetchModelFromDaemon(ctx, cmd, client, opts.fromModel)
+			if err != nil {
+				return nil, fmt.Errorf("get model from store: %w", err)
</code_context>

<issue_to_address>
**issue (bug_risk):** Original distClient cleanup is skipped when falling back to fetching from the daemon, which can leak temporary resources.

In the `opts.fromModel` path, when `distClient.GetModel` fails and you switch to `fetchModelFromDaemon`, the original `cleanupFunc` from `constructDistClient` is never called. If that client holds a temp dir or similar resources, they’ll leak in this path. Before overwriting `result.distClient` / `result.cleanupFunc`, store and invoke the existing cleanup so those resources are released when falling back to the daemon.
</issue_to_address>

### Comment 2
<location> `cmd/cli/desktop/desktop.go:942-951` </location>
<code_context>
 	return nil
 }
+
+func (c *Client) ExportModel(ctx context.Context, model string) (io.ReadCloser, error) {
+	exportPath := fmt.Sprintf("%s/%s/export", inference.ModelsPrefix, model)
+	req, err := http.NewRequestWithContext(ctx, http.MethodGet, c.modelRunner.URL(exportPath), nil)
+	if err != nil {
+		return nil, fmt.Errorf("failed to create request: %w", err)
+	}
+	req.Header.Set("User-Agent", "docker-model-cli/"+Version)
+
+	resp, err := c.modelRunner.Client().Do(req)
+	if err != nil {
+		return nil, c.handleQueryError(err, exportPath)
+	}
+
+	if resp.StatusCode == http.StatusNotFound {
+		resp.Body.Close()
+		return nil, errors.Wrap(ErrNotFound, model)
+	}
+	if resp.StatusCode != http.StatusOK {
+		body, _ := io.ReadAll(resp.Body)
+		resp.Body.Close()
+		return nil, fmt.Errorf("export failed with status %s: %s", resp.Status, string(body))
+	}
+
+	return resp.Body, nil
+}
+
</code_context>

<issue_to_address>
**suggestion (bug_risk):** ExportModel bypasses doRequestWithAuthContext, so it likely misses auth and standard desktop context headers.

Since this bypasses `doRequestWithAuthContext` and calls `c.modelRunner.Client().Do(req)` directly, it won’t pick up the shared auth, desktop-context headers, or standard error handling used elsewhere. To keep behavior consistent and avoid failures in auth-restricted environments, consider routing this through `doRequestWithAuthContext` and returning `resp.Body` on success.

Suggested implementation:

```golang
	resp, err := c.doRequestWithAuthContext(ctx, req)

```

This change assumes that `doRequestWithAuthContext` is already defined on `Client` with a signature compatible with `(ctx context.Context, req *http.Request) (*http.Response, error)` and that it is responsible for adding auth/desktop headers and standard error handling. If its signature or behavior differ, you will need to adjust the call accordingly (e.g., parameter order or return types).
</issue_to_address>

### Comment 3
<location> `pkg/inference/models/http_handler.go:160` </location>
<code_context>
+
+func (h *HTTPHandler) handleExportModel(w http.ResponseWriter, r *http.Request, modelRef string) {
+	w.Header().Set("Content-Type", "application/x-tar")
+	w.Header().Set("Content-Disposition", fmt.Sprintf("attachment; filename=%q", modelRef+".tar"))
+
+	err := h.manager.Export(modelRef, w)
</code_context>

<issue_to_address>
**suggestion:** Using modelRef directly in the Content-Disposition filename can produce awkward or potentially confusing filenames with slashes/colons.

Because `modelRef` can include path-like segments and tags (e.g. `org/repo:tag`), the `filename` value may confuse clients or UIs that treat `/` specially. Consider deriving a safe basename (e.g. `path.Base(modelRef)`) and normalizing disallowed characters like `/` and `:` to `-` or `_` for the header value, while keeping the full reference only in logs or metadata.

Suggested implementation:

```golang
func (h *HTTPHandler) handleExportModel(w http.ResponseWriter, r *http.Request, modelRef string) {
	w.Header().Set("Content-Type", "application/x-tar")
	w.Header().Set("Content-Disposition", fmt.Sprintf("attachment; filename=%q", safeExportFilename(modelRef)))

	err := h.manager.Export(modelRef, w)
	if err != nil {
		if errors.Is(err, distribution.ErrModelNotFound) {
			http.Error(w, err.Error(), http.StatusNotFound)
			return
		}
		h.log.Warnln("Error while exporting model:", err)
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
}

// safeExportFilename derives a safe filename for the exported model tarball from a model reference.
// It keeps the full modelRef for export semantics, but ensures the header filename is UI/FS-friendly.
func safeExportFilename(modelRef string) string {
	// Use the last path segment as the base name (e.g. "org/repo:tag" -> "repo:tag").
	base := path.Base(modelRef)

	// Normalize potentially confusing characters for filenames.
	// Colons (used for tags) are replaced with '-', and any remaining slashes are also normalized.
	base = strings.ReplaceAll(base, ":", "-")
	base = strings.ReplaceAll(base, "/", "-")

	// Fall back to a generic name if we somehow end up empty.
	if base == "" || base == "." || base == "/" {
		base = "model"
	}

	return base + ".tar"
}

```

You’ll also need to update the imports at the top of `pkg/inference/models/http_handler.go`:

1. Add `path` and `strings` to the import list, for example:
   - `import "path"`
   - `import "strings"`

Make sure these are merged into the existing import block following the file’s import style (grouping with stdlib imports, etc.).
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment on lines 241 to +245
mdl, err := distClient.GetModel(opts.fromModel)
if err != nil {
return nil, fmt.Errorf("get model from store: %w", err)
cmd.PrintErrf("Model not found in local store, fetching from daemon...\n")

mdl, result.distClient, result.cleanupFunc, err = fetchModelFromDaemon(ctx, cmd, client, opts.fromModel)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (bug_risk): Original distClient cleanup is skipped when falling back to fetching from the daemon, which can leak temporary resources.

In the opts.fromModel path, when distClient.GetModel fails and you switch to fetchModelFromDaemon, the original cleanupFunc from constructDistClient is never called. If that client holds a temp dir or similar resources, they’ll leak in this path. Before overwriting result.distClient / result.cleanupFunc, store and invoke the existing cleanup so those resources are released when falling back to the daemon.

Comment on lines 942 to 951
func (c *Client) ExportModel(ctx context.Context, model string) (io.ReadCloser, error) {
exportPath := fmt.Sprintf("%s/%s/export", inference.ModelsPrefix, model)
req, err := http.NewRequestWithContext(ctx, http.MethodGet, c.modelRunner.URL(exportPath), nil)
if err != nil {
return nil, fmt.Errorf("failed to create request: %w", err)
}
req.Header.Set("User-Agent", "docker-model-cli/"+Version)

resp, err := c.modelRunner.Client().Do(req)
if err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (bug_risk): ExportModel bypasses doRequestWithAuthContext, so it likely misses auth and standard desktop context headers.

Since this bypasses doRequestWithAuthContext and calls c.modelRunner.Client().Do(req) directly, it won’t pick up the shared auth, desktop-context headers, or standard error handling used elsewhere. To keep behavior consistent and avoid failures in auth-restricted environments, consider routing this through doRequestWithAuthContext and returning resp.Body on success.

Suggested implementation:

	resp, err := c.doRequestWithAuthContext(ctx, req)

This change assumes that doRequestWithAuthContext is already defined on Client with a signature compatible with (ctx context.Context, req *http.Request) (*http.Response, error) and that it is responsible for adding auth/desktop headers and standard error handling. If its signature or behavior differ, you will need to adjust the call accordingly (e.g., parameter order or return types).


func (h *HTTPHandler) handleExportModel(w http.ResponseWriter, r *http.Request, modelRef string) {
w.Header().Set("Content-Type", "application/x-tar")
w.Header().Set("Content-Disposition", fmt.Sprintf("attachment; filename=%q", modelRef+".tar"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Using modelRef directly in the Content-Disposition filename can produce awkward or potentially confusing filenames with slashes/colons.

Because modelRef can include path-like segments and tags (e.g. org/repo:tag), the filename value may confuse clients or UIs that treat / specially. Consider deriving a safe basename (e.g. path.Base(modelRef)) and normalizing disallowed characters like / and : to - or _ for the header value, while keeping the full reference only in logs or metadata.

Suggested implementation:

func (h *HTTPHandler) handleExportModel(w http.ResponseWriter, r *http.Request, modelRef string) {
	w.Header().Set("Content-Type", "application/x-tar")
	w.Header().Set("Content-Disposition", fmt.Sprintf("attachment; filename=%q", safeExportFilename(modelRef)))

	err := h.manager.Export(modelRef, w)
	if err != nil {
		if errors.Is(err, distribution.ErrModelNotFound) {
			http.Error(w, err.Error(), http.StatusNotFound)
			return
		}
		h.log.Warnln("Error while exporting model:", err)
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
}

// safeExportFilename derives a safe filename for the exported model tarball from a model reference.
// It keeps the full modelRef for export semantics, but ensures the header filename is UI/FS-friendly.
func safeExportFilename(modelRef string) string {
	// Use the last path segment as the base name (e.g. "org/repo:tag" -> "repo:tag").
	base := path.Base(modelRef)

	// Normalize potentially confusing characters for filenames.
	// Colons (used for tags) are replaced with '-', and any remaining slashes are also normalized.
	base = strings.ReplaceAll(base, ":", "-")
	base = strings.ReplaceAll(base, "/", "-")

	// Fall back to a generic name if we somehow end up empty.
	if base == "" || base == "." || base == "/" {
		base = "model"
	}

	return base + ".tar"
}

You’ll also need to update the imports at the top of pkg/inference/models/http_handler.go:

  1. Add path and strings to the import list, for example:
    • import "path"
    • import "strings"

Make sure these are merged into the existing import block following the file’s import style (grouping with stdlib imports, etc.).

When packaging a model with --from on Linux, the CLI cannot access the model store directly since it's in a Docker volume. This adds a /models/{name}/repackage endpoint that allows the daemon to perform config-only changes (like context-size) server-side.

For simple cases (--from with only --context-size), the CLI now calls the daemon's repackage API directly. For complex cases (adding licenses, pushing to registry), it falls back to exporting the model as a tarball.

Fixes docker#633

Signed-off-by: Dorin Geman <dorin.geman@docker.com>
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces daemon-side model repackaging to support config-only changes on Linux, where the CLI cannot directly access the model store. A new /models/{name}/repackage endpoint is added for simple cases, with a fallback to exporting the model for more complex scenarios. The changes are well-implemented, with clear logic in both the CLI and the daemon to handle these two paths. My review focuses on improving maintainability by addressing duplicated data structures across several packages and a suggestion to make error handling more robust.

Comment on lines +968 to +970
type RepackageOptions struct {
ContextSize *uint64 `json:"context_size,omitempty"`
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The RepackageOptions struct is duplicated across multiple packages (cmd/cli/desktop, pkg/distribution/distribution, pkg/inference/models). This can lead to inconsistencies and maintenance issues. Consider defining this struct once in a shared package (e.g., a new api package or within pkg/inference/models) and reusing it across the client and server components. The same applies to RepackageRequest in pkg/inference/models/http_handler.go.

Comment on lines +642 to +644
type RepackageOptions struct {
ContextSize *uint64
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

This is another instance of the duplicated RepackageOptions struct. This one doesn't have the JSON tags, which highlights the risk of inconsistency. This should be consolidated into a single definition in a shared package to avoid duplication and potential bugs.

Comment on lines +473 to +476
type RepackageRequest struct {
Target string `json:"target"`
ContextSize *uint64 `json:"context_size,omitempty"`
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

This RepackageRequest struct is closely related to RepackageOptions defined in other packages. To improve maintainability and reduce duplication, consider defining these related data structures in a single, shared package (e.g., a new api package or within pkg/inference/models itself). This would ensure consistency between the client and server.

Comment on lines +427 to +429
type RepackageOptions struct {
ContextSize *uint64 `json:"context_size,omitempty"`
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

This is another instance of the duplicated RepackageOptions struct. To improve maintainability and prevent inconsistencies, this struct should be defined once in a shared package and reused across all components that need it.

Comment on lines +328 to +332
tempClient, err := distribution.NewClient(distribution.WithStoreRootPath(tempDir))
if err != nil {
cleanup()
return nil, nil, nil, fmt.Errorf("create temp distribution client: %w", err)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The cleanup() function is called manually in each error path. This is repetitive and error-prone, as a new error path could be added without the cleanup call. Consider using a deferred function with a success flag to ensure cleanup is always performed on failure. This would make the error handling more robust.

For example:

	var success bool
	defer func() {
		if !success {
			cleanup()
		}
	}()

	// ... your logic ...

	success = true
	return mdl, tempClient, cleanup, nil

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Packaging model from store fails with "model not found reference" when using docker model package

1 participant