Download safetensors models from huggingface.co with https. #2224
base: main
Conversation
Reviewer's Guide

Refactors the HuggingFace transport to prefer manifest-based GGUF metadata but fall back to HTTPS-based safetensors discovery and download via the HuggingFace Files API, while removing all reliance on the huggingface CLI and extending the HF-style repo base to support safetensors, auxiliary files, and a safetensors index file.
Summary of Changes

Hello @jwieleRH, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly improves the reliability and flexibility of model downloads by enabling direct HTTPS fetching for safetensors models from Hugging Face. It introduces a more robust metadata retrieval mechanism that gracefully handles both GGUF and safetensors model types, including support for sharded models and auxiliary files. The changes also remove the dependency on the problematic huggingface CLI.
Hey there - I've reviewed your changes - here's some feedback:
- In `fetch_repo_files`, the `/tree/{revision}` API is queried only once with default parameters; if a model repo has many files or nested directories you may miss entries unless you either handle pagination or recurse over directories returned as `type == 'directory'`.
- In `_fetch_safetensors_metadata`, you store `oid` for each file but then recompute checksums via `fetch_checksum_from_api`; consider either using `oid` directly where appropriate or clarifying why both are needed to avoid confusion and redundant API calls.
- For `safetensors_index_file` in `get_file_list`, the `hash` is computed with `generate_sha256(f"{org}/{name}/{file}")`, which is not the file's content hash; if this `hash` is used for integrity checks it should come from the same checksum mechanism as other files or explicitly disable checksum verification for the index file.
## Individual Comments
### Comment 1
<location> `ramalama/transports/huggingface.py:79-88` </location>
<code_context>
return json.loads(repo_manifest)
+def fetch_repo_files(repo_name: str, revision: str = "main"):
+    """Fetch the list of files in a HuggingFace repository using the Files API."""
+    token = huggingface_token()
+    api_url = f"https://huggingface.co/api/models/{repo_name}/tree/{revision}"
+    logger.debug(f"Fetching repo files from {api_url}")
+
+    request = urllib.request.Request(
+        url=api_url,
+        headers={
+            'Accept': 'application/json',
+        },
+    )
+    if token is not None:
+        request.add_header('Authorization', f"Bearer {token}")
+
+    with urllib.request.urlopen(request) as response:
+        files_data = response.read().decode('utf-8')
+        return json.loads(files_data)
</code_context>
<issue_to_address>
**issue:** The HuggingFace Files API call does not handle pagination, which can truncate results for large repos.
This endpoint paginates results, so this implementation will only ever see the first page and may miss `.safetensors` or other files in larger models. Please either support pagination (e.g., following any `cursor`/`next` fields) or clearly document that this helper is only safe for small repos and enforce that assumption in callers.
</issue_to_address>
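For illustration, a pagination-aware variant could look like the sketch below. It assumes the Hub signals further pages through a `Link: <...>; rel="next"` response header; the exact pagination contract should be verified against the Hub API documentation, and `fetch_repo_files_paginated` is a hypothetical name.

```python
import json
import urllib.request


def fetch_repo_files_paginated(repo_name: str, revision: str = "main", token: str | None = None) -> list[dict]:
    """Fetch all entries of a repo tree, following pagination when present."""
    url = f"https://huggingface.co/api/models/{repo_name}/tree/{revision}"
    entries: list[dict] = []
    while url:
        request = urllib.request.Request(url=url, headers={'Accept': 'application/json'})
        if token is not None:
            request.add_header('Authorization', f"Bearer {token}")
        with urllib.request.urlopen(request) as response:
            entries.extend(json.loads(response.read().decode('utf-8')))
            link_header = response.headers.get('Link', '')
        # Assumed header shape: <https://...&cursor=...>; rel="next"
        url = None
        for part in link_header.split(','):
            if 'rel="next"' in part:
                url = part.split(';')[0].strip().strip('<>')
    return entries
```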
### Comment 2
<location> `ramalama/transports/huggingface.py:125` </location>
<code_context>
+        # No ggufFile in manifest
+        return False
+
+    def _fetch_safetensors_metadata(self):
+        """Fetch metadata for safetensors models from HuggingFace API."""
+        repo_name = f"{self.organization}/{self.name}"
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying `_fetch_safetensors_metadata` and related helpers by using simpler data structures, an ignore-file helper, and a unified checksum hook to make the HuggingFace metadata flow easier to follow.
You can keep the new behavior but trim some of the new structure to make the code easier to follow.
### 1. Avoid dicts for `safetensors_files` / `other_files`
Right now `_fetch_safetensors_metadata` builds lists of dicts with `{'filename': ..., 'oid': ...}`, but only `filename` is actually used later (for sorting and choosing the main file). You can simplify this to lists of strings and keep `oid` out of the hot path until it’s really needed.
```python
def _fetch_safetensors_metadata(self):
    repo_name = f"{self.organization}/{self.name}"
    try:
        files = fetch_repo_files(repo_name)
    except (urllib.error.HTTPError, json.JSONDecodeError, KeyError) as e:
        logger.debug(f'fetch_repo_files failed: {e}')
        return False

    self.safetensors_files: list[str] = []
    self.other_files: list[str] = []
    index_file: str | None = None

    for file_info in files:
        if file_info.get('type') != 'file':
            continue
        path = file_info['path']
        logger.debug(f"Examining file {path}")
        if fnmatch.fnmatch(path, "*.safetensors"):
            self.safetensors_files.append(path)
        elif fnmatch.fnmatch(path, "*.safetensors.index.json"):
            index_file = path
        elif path in {".gitattributes", ".gitignore", "README.md"}:
            continue
        elif path in {
            self.FILE_NAME_CONFIG,
            self.FILE_NAME_GENERATION_CONFIG,
            self.FILE_NAME_TOKENIZER_CONFIG,
        }:
            continue
        elif fnmatch.fnmatch(path, "*.txt"):
            continue
        else:
            self.other_files.append(path)

    if not self.safetensors_files:
        logger.debug("No safetensors files found")
        return False

    # Sort by name for consistent ordering.
    self.safetensors_files.sort()
    self.model_filename = self.safetensors_files[0]
    repo_path = f"{self.organization}/{self.name}"
    self.model_hash = f"sha256:{fetch_checksum_from_api(repo_path, self.model_filename)}"
    self.additional_safetensor_files = self.safetensors_files[1:]
    self.safetensors_index_file = index_file
    return True
```
If at some point you really need `oid`, you can add a separate `self.file_oids: dict[str, str]` mapping keyed by filename instead of pushing `oid` into every list element.
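For instance, a sketch of that mapping, assuming each tree entry exposes an `oid` field as in the Files API response:

```python
# Hypothetical companion mapping, built from the same tree listing:
self.file_oids: dict[str, str] = {
    info['path']: info['oid']
    for info in files
    if info.get('type') == 'file'
}
```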
### 2. Extract a small helper for “ignored” files
The sequence of `elif` branches for “ignore this file” is what makes the loop visually long. A small helper keeps the main logic easy to scan:
```python
def _is_ignored_file(self, path: str) -> bool:
    if path in {".gitattributes", ".gitignore", "README.md"}:
        return True
    if path in {
        self.FILE_NAME_CONFIG,
        self.FILE_NAME_GENERATION_CONFIG,
        self.FILE_NAME_TOKENIZER_CONFIG,
    }:
        return True
    if fnmatch.fnmatch(path, "*.txt"):
        return True
    return False
```
Then the loop becomes more linear:
```python
for file_info in files:
    if file_info.get('type') != 'file':
        continue
    path = file_info['path']
    logger.debug(f"Examining file {path}")
    if self._is_ignored_file(path):
        continue
    if fnmatch.fnmatch(path, "*.safetensors"):
        self.safetensors_files.append(path)
    elif fnmatch.fnmatch(path, "*.safetensors.index.json"):
        index_file = path
    else:
        self.other_files.append(path)
```
Same behavior, but the classification logic is easier to read and adjust.
### 3. Unify the checksum hook usage
You already have `_fetch_checksum_for_file` in the base `HFStyleRepository` (called from `get_file_list`) and another one here on the subclass. To lower mental overhead, you can concentrate all HF-specific checksum behavior in a single override, and let `fetch_metadata` reuse it.
For example, if the base class calls `_fetch_checksum_for_file(self, filename)`:
```python
class HuggingfaceRepository(HFStyleRepository):
    ...

    def _fetch_checksum_for_file(self, filename: str) -> str:
        # Skip JSON files during checksum fetch
        if fnmatch.fnmatch(filename, "*.json"):
            return ""
        repo_path = f"{self.organization}/{self.name}"
        return fetch_checksum_from_api(repo_path, filename)
```
Then in `_fetch_safetensors_metadata` you can rely on the same hook instead of calling `fetch_checksum_from_api` directly:
```python
self.model_hash = f"sha256:{self._fetch_checksum_for_file(self.model_filename)}"
```
That way, the base class remains the single place that “knows” when/where checksums are needed, and the subclass only defines how to get one, without duplicating extra rules in multiple methods.
</issue_to_address>
### Comment 3
<location> `ramalama/hf_style_repo_base.py:84` </location>
<code_context>
        self.model_hash = None
        self.mmproj_filename = None
        self.mmproj_hash = None
+        self.other_files: list[dict] = []
+        self.additional_safetensor_files: list[dict] = []
+        self.safetensors_index_file = None
</code_context>
<issue_to_address>
**issue (complexity):** Consider unifying extra file metadata and SnapshotFile construction so get_file_list becomes a simpler, declarative description of what to download rather than how to build each entry.
You can reduce the added complexity by (1) consolidating file state into a single structure and (2) centralizing `SnapshotFile` construction and checksum handling.
### 1. Consolidate `other_files` / `additional_safetensor_files` / `safetensors_index_file`
Instead of three separate attributes with slightly different semantics, use a single list of “extra” files with metadata. This keeps state in one place and makes `get_file_list` simpler:
```python
from abc import ABC
from dataclasses import dataclass
from enum import Enum


class ExtraFileKind(Enum):
    SAFETENSOR = "safetensor"
    OTHER = "other"
    INDEX = "index"


@dataclass
class ExtraFile:
    filename: str
    kind: ExtraFileKind
    required: bool = True
    should_verify_checksum: bool = True
    should_show_progress: bool = True


class BaseModel(ABC):
    def __init__(self, name: str, organization: str, tag: str = 'latest'):
        ...
        self.extra_files: list[ExtraFile] = []
        self.fetch_metadata()
```
Your existing `additional_safetensor_files`, `other_files`, and `safetensors_index_file` assignments in `fetch_metadata()` can be converted into `ExtraFile` instances and appended to `self.extra_files` instead of populating multiple attributes.
### 2. Extract `SnapshotFile` construction into a helper
`get_file_list` currently has multiple, almost-identical `SnapshotFile(...)` constructions. Pull that into a single helper that also standardizes checksum handling:
```python
def _snapshot_file_for(
    self,
    filename: str,
    file_type: SnapshotFileType,
    required: bool = True,
    should_show_progress: bool = True,
    should_verify_checksum: bool = True,
    use_custom_checksum: bool = True,
) -> SnapshotFile:
    checksum = ""
    if should_verify_checksum and use_custom_checksum:
        checksum = self._fetch_checksum_for_file(filename)
    elif should_verify_checksum and not use_custom_checksum:
        checksum = generate_sha256(filename)
    return SnapshotFile(
        url=f"{self.blob_url}/{filename}",
        header=self.headers,
        hash=f"sha256:{checksum}" if checksum else "",
        type=file_type,
        name=filename,
        required=required,
        should_show_progress=should_show_progress,
        should_verify_checksum=bool(checksum) and should_verify_checksum,
    )
```
Then `get_file_list` can be simplified and made more uniform:
```python
def get_file_list(self, cached_files: list[str]) -> list[SnapshotFile]:
    files: list[SnapshotFile] = []
    if self.model_filename not in cached_files:
        files.append(self.model_file())
    # existing mmproj / configs unchanged...
    for extra in self.extra_files:
        if extra.filename in cached_files:
            continue
        if extra.kind is ExtraFileKind.SAFETENSOR:
            file_type = SnapshotFileType.SafetensorModel
            use_custom_checksum = True
        else:
            file_type = SnapshotFileType.Other
            use_custom_checksum = extra.kind is not ExtraFileKind.INDEX
        files.append(
            self._snapshot_file_for(
                filename=extra.filename,
                file_type=file_type,
                required=extra.required,
                should_show_progress=extra.should_show_progress,
                should_verify_checksum=extra.should_verify_checksum,
                use_custom_checksum=use_custom_checksum,
            )
        )
    return files
```
This removes the separate loops for `additional_safetensor_files`, `other_files`, and the index file, and it keeps the logic declarative: `extra_files` describe *what* to download; `_snapshot_file_for` describes *how*.
### 3. Normalize checksum handling
Instead of:
- `_fetch_checksum_for_file` used only for some files, and
- `self.model_hash` being set directly in subclasses, and
- `generate_sha256(f".../{index_file}")` as a special case,
route everything through `_fetch_checksum_for_file` and a single code path.
```python
def _fetch_checksum_for_file(self, filename: str) -> str:
    """Return checksum (without 'sha256:' prefix) for a given filename.

    Base impl returns empty string (no checksum). Subclasses override.
    """
    return ""
```
Then update `model_file` to use the same hook:
```python
def model_file(self) -> SnapshotFile:
    assert self.model_filename
    file_type = (
        SnapshotFileType.SafetensorModel
        if fnmatch.fnmatch(self.model_filename, "*.safetensors")
        else SnapshotFileType.GGUFModel
    )
    checksum = self._fetch_checksum_for_file(self.model_filename)
    hash_value = f"sha256:{checksum}" if checksum else self.model_hash
    return SnapshotFile(
        url=f"{self.blob_url}/{self.model_filename}",
        header=self.headers,
        hash=hash_value,
        type=file_type,
        name=self.model_filename,
        should_show_progress=True,
        should_verify_checksum=bool(hash_value),
    )
```
And for index files, make the “no real checksum” explicit instead of generating a path-based hash:
```python
# when constructing ExtraFile for the index
self.extra_files.append(
    ExtraFile(
        filename=self.safetensors_index_file,
        kind=ExtraFileKind.INDEX,
        required=False,
        should_verify_checksum=False,
        should_show_progress=False,
    )
)
```
With the above changes:
- State is centralized in `extra_files` rather than three attributes.
- `get_file_list` is shorter and reads as “what files we want” instead of “how to build each snapshot.”
- All checksums go through `_fetch_checksum_for_file` or are explicitly marked as “no checksum,” eliminating special-case hashing for the index file.
</issue_to_address>
### Comment 4
<location> `ramalama/hf_style_repo_base.py:162-174` </location>
<code_context>
if self.safetensors_index_file:
    if self.safetensors_index_file not in cached_files:
        logger.debug(f"Adding safetensors index file: {self.safetensors_index_file}")
        files.append(
            SnapshotFile(
                url=f"{self.blob_url}/{self.safetensors_index_file}",
                header=self.headers,
                hash=generate_sha256(f"{self.organization}/{self.name}/{self.safetensors_index_file}"),
                type=SnapshotFileType.Other,
                name=self.safetensors_index_file,
                required=False,
            )
        )
</code_context>
<issue_to_address>
**suggestion (code-quality):** Merge nested if conditions ([`merge-nested-ifs`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/Python/Default-Rules/merge-nested-ifs))
```suggestion
if self.safetensors_index_file and self.safetensors_index_file not in cached_files:
    logger.debug(f"Adding safetensors index file: {self.safetensors_index_file}")
    files.append(
        SnapshotFile(
            url=f"{self.blob_url}/{self.safetensors_index_file}",
            header=self.headers,
            hash=generate_sha256(f"{self.organization}/{self.name}/{self.safetensors_index_file}"),
            type=SnapshotFileType.Other,
            name=self.safetensors_index_file,
            required=False,
        )
    )
```
Explanation: Too much nesting can make code difficult to understand, and this is especially
true in Python, where there are no brackets to help out with the delineation of
different nesting levels.
Reading deeply nested code is confusing, since you have to keep track of which
conditions relate to which levels. We therefore strive to reduce nesting where
possible, and the situation where two `if` conditions can be combined using
`and` is an easy win.
</issue_to_address>
Code Review
This pull request is a great step forward in improving how models are downloaded from Hugging Face, by moving from the CLI to direct HTTPS requests via the API. This removes an external dependency and adds support for safetensors models, which is a significant enhancement.
My review focuses on a few areas to improve robustness and clarity. I've found a critical issue related to hash generation that could lead to cache corruption, and I've provided a detailed suggestion to fix it. I also have a couple of smaller suggestions regarding type hinting and file filtering logic to make the code more maintainable and prevent potential issues.
Overall, the changes are well-structured and address an important feature. With a few adjustments, this will be a solid contribution.
```python
if filename not in cached_files:
    logger.debug(f"Adding other safetensors file: {filename}")
    files.append(
        SnapshotFile(
            url=f"{self.blob_url}/{filename}",
            header=self.headers,
            hash=f"sha256:{self._fetch_checksum_for_file(filename)}",
            type=SnapshotFileType.Other,
            name=filename,
            should_show_progress=False,
            should_verify_checksum=False,
        )
    )
```
There's a critical bug here that can lead to file corruption in the cache. When `_fetch_checksum_for_file(filename)` returns an empty string (e.g., for JSON files), the hash becomes `"sha256:"`. This causes all such files to collide and overwrite each other in the blob store.

Additionally, the log message "Adding other safetensors file" is misleading as these files aren't necessarily safetensors.

To fix this, you should generate a unique hash for files without an API-provided checksum (similar to config files) and correct the log message.
```suggestion
if filename not in cached_files:
    logger.debug(f"Adding other file: {filename}")
    checksum = self._fetch_checksum_for_file(filename)
    file_hash = f"sha256:{checksum}" if checksum else generate_sha256(f"{self.organization}/{self.name}/{filename}")
    files.append(
        SnapshotFile(
            url=f"{self.blob_url}/{filename}",
            header=self.headers,
            hash=file_hash,
            type=SnapshotFileType.Other,
            name=filename,
            should_show_progress=False,
            should_verify_checksum=False,
        )
    )
```
The whole checksum system needs to be updated. Files that are LFS blobs should use the LFS blob SHA signature. Other files should either hash the contents, hash the oid, or use the oid directly (which would require relaxing the restriction that the hash has to be 256 bits).
@jwieleRH Isn't the oid provided in the response when fetching the file list? Can't we use that one instead of re-fetching?
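For illustration only, a minimal sketch of deriving a checksum straight from a tree entry, assuming the response shape where LFS-tracked files carry a sha256 digest under `lfs.oid` and other files only a git blob id in `oid` (`checksum_from_tree_entry` is a hypothetical helper):

```python
def checksum_from_tree_entry(entry: dict) -> str:
    """Derive a checksum from a /tree API entry without re-fetching."""
    lfs = entry.get('lfs')
    if lfs:
        # LFS pointer: this oid is the sha256 of the file contents.
        return f"sha256:{lfs['oid']}"
    # Plain git blob: only a sha1 object id is available.
    return f"sha1:{entry['oid']}"
```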
ping @jwieleRH
ramalama/hf_style_repo_base.py (Outdated)
```python
self.mmproj_hash = None
self.other_files: list[dict] = []
self.additional_safetensor_files: list[dict] = []
self.safetensors_index_file = None
```
Type hints are being gradually applied everywhere.
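For example (a sketch only), the remaining attribute could be annotated in the same style as the other two:

```python
self.other_files: list[dict] = []
self.additional_safetensor_files: list[dict] = []
self.safetensors_index_file: str | None = None
```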
ramalama/transports/huggingface.py (Outdated)
```python
elif fnmatch.fnmatch(path, '*.txt'):
    continue
```
The logic to unconditionally skip all `*.txt` files seems a bit broad. While this might be intended to skip LICENSE or other text files, it could also unintentionally skip important files like `requirements.txt` which might be necessary for some models. Could you clarify the reasoning for this or perhaps make the filtering more specific to the files you intend to exclude?
This would be addressed by fixing the checksum system.
@olliewalsh @engelmi PTAL
engelmi left a comment
A few comments, but overall looks good. Nice work!
ramalama/transports/huggingface.py (Outdated)
```python
path = file_info['path']
logger.debug(f"Examining file {path}")

if fnmatch.fnmatch(path, '*.safetensors'):
```
A simple `path.endswith(".safetensors")` seems sufficient and would remove the dependency on `fnmatch` (which I am not sure is usable on Windows?)
Personally, I would rather use `fnmatch` or `os.path.splitext`, both of which are explicitly about path names and work fine on Windows, while `endswith` is about strings. As a practical matter, `splitext` may, for all I know, be implemented with an implicit `endswith`. So I don't feel strongly about it.
`fnmatch` isn't path-aware, though, and on Windows there are some differences such as case-insensitivity and the path separator `\` instead of `/`.

Was talking with @telemaco about using `pathlib.Path` everywhere so we have the semantics of paths everywhere - currently it's pretty mixed. In this case it would be a simple `p.suffix.lower() == ".safetensors"`.
In fact, you could use `path.glob("*.safetensors")`. The `.glob()` method is supported by pathlib.
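As a sketch of the pathlib approach (`classify` is a hypothetical helper; `PurePosixPath` fits here because repo paths from the Hub API are POSIX-style regardless of the host OS):

```python
from pathlib import PurePosixPath


def classify(path_str: str) -> str:
    """Classify a repo tree entry by its suffix."""
    p = PurePosixPath(path_str)
    # Check the compound suffix first; .suffix alone would report ".json".
    if p.name.endswith(".safetensors.index.json"):
        return "index"
    if p.suffix == ".safetensors":
        return "safetensors"
    return "other"


# classify("model-00001-of-00002.safetensors") -> "safetensors"
# classify("model.safetensors.index.json")     -> "index"
```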
```python
# Determine file type based on extension
if fnmatch.fnmatch(self.model_filename, '*.safetensors'):
    file_type = SnapshotFileType.SafetensorModel
else:
    file_type = SnapshotFileType.GGUFModel
```
NIT: I'd remove the else and just default to gguf:
```suggestion
# Determine file type based on extension
file_type = SnapshotFileType.GGUFModel
if fnmatch.fnmatch(self.model_filename, '*.safetensors'):
    file_type = SnapshotFileType.SafetensorModel
```
That makes a certain amount of sense because we do look for GGUF first.
/ok-to-test
Refactor the fetch_metadata method to attempt to fetch GGUF metadata from a manifest, and then to attempt to fetch safetensors metadata from the repo files.

To fetch a safetensors model, download the set of files from the repo with https requests. Add an "other_files" category for safetensors files that are neither JSON config files nor .safetensors model files, such as the tokenizer.model file.

Remove code to download models from huggingface.co with the huggingface cli. It didn't work correctly anyway. Stub out the _collect_cli_files and in_existing_cache methods for the huggingface transport.

Enable bats test of downloading safetensors models from huggingface. Remove bats test of downloading models with the huggingface cli. Adjust test expectation for split safetensors model.

Fixes: containers#1493

Signed-off-by: John Wiele <jwiele@redhat.com>
/gemini review
Code Review
This pull request effectively refactors the Hugging Face transport to support downloading safetensors models via HTTPS, which is a great improvement. The fallback mechanism from GGUF manifest to safetensors file discovery is well-implemented. The removal of the broken huggingface-cli support cleans up the codebase. My main feedback is a suggestion to improve the consistency of handling the safetensors index file by not discarding its oid, which would align its processing with that of other repository files.
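A sketch of what keeping the index file's `oid` could look like inside the classification loop (names and surrounding structure are illustrative, based on the suggested `_fetch_safetensors_metadata` above):

```python
index_file: str | None = None
index_oid: str | None = None
for file_info in files:
    if file_info.get('type') != 'file':
        continue
    path = file_info['path']
    if path.endswith(".safetensors.index.json"):
        index_file = path
        # Retain the oid so the index can be verified like other
        # repo files, instead of hashing its repo path.
        index_oid = file_info.get('oid')
```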
Refactor the fetch_metadata method to attempt to fetch GGUF metadata from a manifest, and then to attempt to fetch safetensors metadata from the repo files.
To fetch a safetensors model, download the set of files from the repo with https requests. Add an "other_files" category for safetensors files that are neither JSON config files nor .safetensors model files, such as the tokenizer.model file.
Remove code to download models from huggingface.co with the huggingface cli. It didn't work correctly anyway. Stub out the _collect_cli_files and in_existing_cache methods for the huggingface transport.
Enable bats test of downloading safetensors models from huggingface. Remove bats test of downloading models with the huggingface cli.
Fixes: #1493
Summary by Sourcery
Support downloading Hugging Face safetensors and associated files over HTTPS while deprecating CLI-based downloads and extending repository metadata handling.