fix: add daemon-side model repackaging for Linux support #639

doringeman wants to merge 1 commit into docker:main
Conversation
Summary of Changes

Hello @doringeman, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request addresses a limitation where the CLI on Linux cannot directly interact with the model store located within a Docker volume. It introduces new daemon-side API endpoints for model repackaging and export, enabling the daemon to perform these operations. The CLI has been updated to leverage these new capabilities, delegating certain tasks to the daemon so that model management works seamlessly on Linux.
Hey - I've found 3 issues and left some high-level feedback:

- There are three separate `RepackageOptions` types (desktop client, distribution client, manager) with overlapping fields; consider consolidating to a shared type or clearly separating API vs. internal variants to avoid future drift.
- The new GET route handler `handleModelGetAction` means `handleGetModel` is no longer wired to any route in `routeHandlers`; if `handleGetModel` is now redundant, consider removing it or adding a comment about its remaining use cases to reduce confusion.
- In `ExportModel` and `RepackageModel` on the distribution client, you're using `context.Background()` and not propagating any caller context; if you expect long-running operations, consider threading a `context.Context` through so that exports/repackages can be cancelled.
## Individual Comments
### Comment 1
<location> `cmd/cli/commands/package.go:241-245` </location>
<code_context>
cmd.PrintErrf("Reading model from store: %q\n", opts.fromModel)
- // Get the model from the local store
mdl, err := distClient.GetModel(opts.fromModel)
if err != nil {
- return nil, fmt.Errorf("get model from store: %w", err)
+ cmd.PrintErrf("Model not found in local store, fetching from daemon...\n")
+
+ mdl, result.distClient, result.cleanupFunc, err = fetchModelFromDaemon(ctx, cmd, client, opts.fromModel)
+ if err != nil {
+ return nil, fmt.Errorf("get model from store: %w", err)
</code_context>
<issue_to_address>
**issue (bug_risk):** Original distClient cleanup is skipped when falling back to fetching from the daemon, which can leak temporary resources.
In the `opts.fromModel` path, when `distClient.GetModel` fails and you switch to `fetchModelFromDaemon`, the original `cleanupFunc` from `constructDistClient` is never called. If that client holds a temp dir or similar resources, they’ll leak in this path. Before overwriting `result.distClient` / `result.cleanupFunc`, store and invoke the existing cleanup so those resources are released when falling back to the daemon.
</issue_to_address>
### Comment 2
<location> `cmd/cli/desktop/desktop.go:942-951` </location>
<code_context>
return nil
}
+
+func (c *Client) ExportModel(ctx context.Context, model string) (io.ReadCloser, error) {
+ exportPath := fmt.Sprintf("%s/%s/export", inference.ModelsPrefix, model)
+ req, err := http.NewRequestWithContext(ctx, http.MethodGet, c.modelRunner.URL(exportPath), nil)
+ if err != nil {
+ return nil, fmt.Errorf("failed to create request: %w", err)
+ }
+ req.Header.Set("User-Agent", "docker-model-cli/"+Version)
+
+ resp, err := c.modelRunner.Client().Do(req)
+ if err != nil {
+ return nil, c.handleQueryError(err, exportPath)
+ }
+
+ if resp.StatusCode == http.StatusNotFound {
+ resp.Body.Close()
+ return nil, errors.Wrap(ErrNotFound, model)
+ }
+ if resp.StatusCode != http.StatusOK {
+ body, _ := io.ReadAll(resp.Body)
+ resp.Body.Close()
+ return nil, fmt.Errorf("export failed with status %s: %s", resp.Status, string(body))
+ }
+
+ return resp.Body, nil
+}
+
</code_context>
<issue_to_address>
**suggestion (bug_risk):** ExportModel bypasses doRequestWithAuthContext, so it likely misses auth and standard desktop context headers.
Since this bypasses `doRequestWithAuthContext` and calls `c.modelRunner.Client().Do(req)` directly, it won’t pick up the shared auth, desktop-context headers, or standard error handling used elsewhere. To keep behavior consistent and avoid failures in auth-restricted environments, consider routing this through `doRequestWithAuthContext` and returning `resp.Body` on success.
Suggested implementation:
```golang
resp, err := c.doRequestWithAuthContext(ctx, req)
```
This change assumes that `doRequestWithAuthContext` is already defined on `Client` with a signature compatible with `(ctx context.Context, req *http.Request) (*http.Response, error)` and that it is responsible for adding auth/desktop headers and standard error handling. If its signature or behavior differ, you will need to adjust the call accordingly (e.g., parameter order or return types).
</issue_to_address>
### Comment 3
<location> `pkg/inference/models/http_handler.go:160` </location>
<code_context>
+
+func (h *HTTPHandler) handleExportModel(w http.ResponseWriter, r *http.Request, modelRef string) {
+ w.Header().Set("Content-Type", "application/x-tar")
+ w.Header().Set("Content-Disposition", fmt.Sprintf("attachment; filename=%q", modelRef+".tar"))
+
+ err := h.manager.Export(modelRef, w)
</code_context>
<issue_to_address>
**suggestion:** Using modelRef directly in the Content-Disposition filename can produce awkward or potentially confusing filenames with slashes/colons.
Because `modelRef` can include path-like segments and tags (e.g. `org/repo:tag`), the `filename` value may confuse clients or UIs that treat `/` specially. Consider deriving a safe basename (e.g. `path.Base(modelRef)`) and normalizing disallowed characters like `/` and `:` to `-` or `_` for the header value, while keeping the full reference only in logs or metadata.
Suggested implementation:
```golang
func (h *HTTPHandler) handleExportModel(w http.ResponseWriter, r *http.Request, modelRef string) {
w.Header().Set("Content-Type", "application/x-tar")
w.Header().Set("Content-Disposition", fmt.Sprintf("attachment; filename=%q", safeExportFilename(modelRef)))
err := h.manager.Export(modelRef, w)
if err != nil {
if errors.Is(err, distribution.ErrModelNotFound) {
http.Error(w, err.Error(), http.StatusNotFound)
return
}
h.log.Warnln("Error while exporting model:", err)
http.Error(w, err.Error(), http.StatusInternalServerError)
return
}
}
// safeExportFilename derives a safe filename for the exported model tarball from a model reference.
// It keeps the full modelRef for export semantics, but ensures the header filename is UI/FS-friendly.
func safeExportFilename(modelRef string) string {
// Use the last path segment as the base name (e.g. "org/repo:tag" -> "repo:tag").
base := path.Base(modelRef)
// Normalize potentially confusing characters for filenames.
// Colons (used for tags) are replaced with '-', and any remaining slashes are also normalized.
base = strings.ReplaceAll(base, ":", "-")
base = strings.ReplaceAll(base, "/", "-")
// Fall back to a generic name if we somehow end up empty.
if base == "" || base == "." || base == "/" {
base = "model"
}
return base + ".tar"
}
```
You’ll also need to update the imports at the top of `pkg/inference/models/http_handler.go`:
1. Add `path` and `strings` to the import list, for example:
- `import "path"`
- `import "strings"`
Make sure these are merged into the existing import block following the file’s import style (grouping with stdlib imports, etc.).
</issue_to_address>
When packaging a model with --from on Linux, the CLI cannot access the model store directly since it's in a Docker volume. This adds a /models/{name}/repackage endpoint that allows the daemon to perform config-only changes (like context-size) server-side.
For simple cases (--from with only --context-size), the CLI now calls the daemon's repackage API directly. For complex cases (adding licenses, pushing to registry), it falls back to exporting the model as a tarball.
Fixes docker#633
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
Force-pushed 88c55de to c0186c1
Code Review
This pull request introduces daemon-side model repackaging to support config-only changes on Linux, where the CLI cannot directly access the model store. A new /models/{name}/repackage endpoint is added for simple cases, with a fallback to exporting the model for more complex scenarios. The changes are well-implemented, with clear logic in both the CLI and the daemon to handle these two paths. My review focuses on improving maintainability by addressing duplicated data structures across several packages and a suggestion to make error handling more robust.
```go
type RepackageOptions struct {
	ContextSize *uint64 `json:"context_size,omitempty"`
}
```
The RepackageOptions struct is duplicated across multiple packages (cmd/cli/desktop, pkg/distribution/distribution, pkg/inference/models). This can lead to inconsistencies and maintenance issues. Consider defining this struct once in a shared package (e.g., a new api package or within pkg/inference/models) and reusing it across the client and server components. The same applies to RepackageRequest in pkg/inference/models/http_handler.go.
```go
type RepackageOptions struct {
	ContextSize *uint64
}
```
```go
type RepackageRequest struct {
	Target      string  `json:"target"`
	ContextSize *uint64 `json:"context_size,omitempty"`
}
```
This RepackageRequest struct is closely related to RepackageOptions defined in other packages. To improve maintainability and reduce duplication, consider defining these related data structures in a single, shared package (e.g., a new api package or within pkg/inference/models itself). This would ensure consistency between the client and server.
```go
type RepackageOptions struct {
	ContextSize *uint64 `json:"context_size,omitempty"`
}
```
```go
tempClient, err := distribution.NewClient(distribution.WithStoreRootPath(tempDir))
if err != nil {
	cleanup()
	return nil, nil, nil, fmt.Errorf("create temp distribution client: %w", err)
}
```
The cleanup() function is called manually in each error path. This is repetitive and error-prone, as a new error path could be added without the cleanup call. Consider using a deferred function with a success flag to ensure cleanup is always performed on failure. This would make the error handling more robust.
For example:
```go
var success bool
defer func() {
	if !success {
		cleanup()
	}
}()

// ... your logic ...

success = true
return mdl, tempClient, cleanup, nil
```
CC @ilopezluna This is a fairly quick fix. Eventually we can deduplicate some logic existing both in the CLI and inside these new handlers from the backend.