
Conversation


@jwieleRH jwieleRH commented Dec 8, 2025

Refactor the fetch_metadata method to first attempt to fetch GGUF metadata from a manifest and, if that fails, to fall back to fetching safetensors metadata from the repo files.

To fetch a safetensors model, download the set of files from the repo with https requests. Add an "other_files" category for safetensors files that are neither JSON config files nor .safetensors model files, such as the tokenizer.model file.

Remove code to download models from huggingface.co with the huggingface cli. It didn't work correctly anyway. Stub out the _collect_cli_files and in_existing_cache methods for the huggingface transport.

Enable bats test of downloading safetensors models from huggingface. Remove bats test of downloading models with the huggingface cli.
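The file classification described above can be sketched as follows. The function name and the exact config-file list here are illustrative, not the names used in the patch:

```python
import fnmatch

# Hypothetical config-file set; the patch's actual FILE_NAME_* constants may differ.
CONFIG_FILES = {"config.json", "generation_config.json", "tokenizer_config.json"}

def categorize_repo_files(paths):
    """Split repo file paths into safetensors model files, JSON config files,
    and everything else (the "other_files" category, e.g. tokenizer.model)."""
    safetensors, configs, other = [], [], []
    for path in paths:
        if fnmatch.fnmatch(path, "*.safetensors"):
            safetensors.append(path)
        elif path in CONFIG_FILES or fnmatch.fnmatch(path, "*.json"):
            configs.append(path)
        else:
            other.append(path)
    return sorted(safetensors), sorted(configs), sorted(other)
```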

Fixes: #1493

Summary by Sourcery

Support downloading Hugging Face safetensors and associated files over HTTPS while deprecating CLI-based downloads and extending repository metadata handling.

New Features:

  • Discover safetensors model, index, and auxiliary files from Hugging Face repositories using the Files API and download them via HTTPS.
  • Treat non-model, non-config artifacts (such as tokenizer and related files) as an explicit "other_files" category in snapshot generation.

Bug Fixes:

  • Remove broken support for downloading models via the Hugging Face CLI and stub related cache and collection pathways to avoid incorrect behavior.

Enhancements:

  • Fallback to safetensors-based metadata when a GGUF manifest is unavailable, and infer model type from file extensions when building snapshot entries.
  • Extend HF-style repository metadata and file list generation to track additional safetensor shards and index files, including appropriate checksum handling.

Tests:

  • Adjust system tests to cover safetensors downloads from Hugging Face and drop tests that relied on the Hugging Face CLI.
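The manifest-then-safetensors fallback described above reduces to a short chain. This is a sketch with injected fetchers rather than the patch's actual `_fetch_manifest_metadata()` / `_fetch_safetensors_metadata()` methods:

```python
class MetadataError(Exception):
    """Raised when neither metadata source succeeds (illustrative exception type)."""

def fetch_metadata(fetch_manifest, fetch_safetensors):
    """Chain two boolean-returning fetchers; raise only if both fail.

    Each fetcher is expected to record its metadata as a side effect and
    return True on success, mirroring the boolean-return design above.
    """
    if fetch_manifest():
        return "gguf"
    if fetch_safetensors():
        return "safetensors"
    raise MetadataError("no GGUF manifest and no safetensors files found")
```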


sourcery-ai bot commented Dec 8, 2025

Reviewer's Guide

Refactors the HuggingFace transport to prefer manifest-based GGUF metadata but fall back to HTTPS-based safetensors discovery and download via the HuggingFace Files API, while removing all reliance on the huggingface CLI and extending the HF-style repo base to support safetensors, auxiliary files, and a safetensors index file.

File-Level Changes

Change: Add HTTPS-based discovery of safetensors and auxiliary files via the HuggingFace Files API and integrate this metadata into the repository model.
  • Introduce fetch_repo_files() to call the HuggingFace Files API and return the repo file tree as JSON.
  • Implement HuggingfaceRepository._fetch_safetensors_metadata() to scan repo files for .safetensors, index JSON, and auxiliary files, recording them on the repository instance.
  • Extend HFStyleRepository.get_file_list() to add safetensors shards, other auxiliary files, and a safetensors index file to the download plan, with appropriate snapshot metadata.
Files: ramalama/transports/huggingface.py, ramalama/hf_style_repo_base.py

Change: Refactor metadata fetching to first use the GGUF manifest and then fall back to safetensors metadata, and generalize checksum fetching for arbitrary files.
  • Extract HuggingfaceRepository._fetch_manifest_metadata() to encapsulate manifest retrieval and GGUF field parsing with error handling and a boolean success return.
  • Update HuggingfaceRepository.fetch_metadata() to chain _fetch_manifest_metadata() and _fetch_safetensors_metadata(), raising if neither succeeds.
  • Add _fetch_checksum_for_file() in the HF base class and override it in HuggingfaceRepository to use fetch_checksum_from_api for non-JSON files, wiring this into get_file_list() for safetensors and other files.
Files: ramalama/transports/huggingface.py, ramalama/hf_style_repo_base.py

Change: Enhance HF-style repository model handling so safetensors can be treated as primary model files with correct snapshot typing.
  • Add other_files, additional_safetensor_files, and safetensors_index_file fields to HFStyleRepository instances and initialize them in __init__.
  • Modify HFStyleRepository.model_file() to determine SnapshotFileType based on filename extension, treating .safetensors as SafetensorModel instead of GGUFModel.
  • Ensure safetensors-related files are created with appropriate SnapshotFileType and checksum behavior in get_file_list().
Files: ramalama/hf_style_repo_base.py

Change: Remove all HuggingFace CLI-based download, cache reuse, and file collection logic from the HuggingFace transport.
  • Delete is_hf_cli_available(), SnapshotFileType imports, and create_file_link usage from the Huggingface transport.
  • Remove the in_existing_cache() implementation that searched the local HF CLI cache and linked files, replacing it with a stub that always returns False.
  • Replace get_cli_download_args() with a NotImplementedError and stub out _collect_cli_files() to return no files.
Files: ramalama/transports/huggingface.py

Change: Update system tests to validate HTTPS-based safetensors pulls and drop tests depending on the huggingface CLI.
  • Enable the system test that pulls a safetensors model from HuggingFace using HTTPS by removing the skip-if-no-hf-cli guard.
  • Remove the hf cli cache-based pull test that pre-populated the HF cache and validated reuse through multiple transport syntaxes.
Files: test/system/050-pull.bats

Assessment against linked issues

Issue #1493:
  • Remove or disable the huggingface-cli fallback logic for huggingface/llama-server style URIs so it no longer attempts to download via the CLI or its cache.
  • Ensure that when pulling a llama-server style huggingface URI fails, the operation fails fast instead of falling back to a full model repo download via huggingface-cli.


@gemini-code-assist

Summary of Changes

Hello @jwieleRH, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the reliability and flexibility of model downloads by enabling direct HTTPS fetching for safetensors models from Hugging Face. It introduces a more robust metadata retrieval mechanism that gracefully handles both GGUF and safetensors model types, including support for sharded models and auxiliary files. The changes also remove the dependency on the problematic huggingface-cli for downloads, leading to a more stable and efficient process.

Highlights

  • Safetensors Model Download via HTTPS: The system can now directly download safetensors models from huggingface.co using HTTPS requests, addressing previous issues with the huggingface-cli.
  • Refactored Metadata Fetching: The fetch_metadata method has been refactored to first attempt fetching GGUF metadata from a manifest, and if unsuccessful, it then attempts to fetch safetensors metadata by listing repository files.
  • Support for Sharded Safetensors and Other Files: New categories (other_files and additional_safetensor_files) have been introduced to properly handle non-model safetensor files (e.g., tokenizers) and sharded safetensors, ensuring all necessary components are downloaded.
  • Removal of Hugging Face CLI Dependency: Code related to downloading models using the huggingface-cli has been removed or stubbed out, as it was not functioning correctly, streamlining the download process.
  • Updated System Tests: System tests have been updated to enable testing of safetensors model downloads from Hugging Face and to remove tests that relied on the deprecated huggingface-cli functionality.

@sourcery-ai sourcery-ai bot left a comment

Hey there - I've reviewed your changes - here's some feedback:

  • In fetch_repo_files, the /tree/{revision} API is queried only once with default parameters; if a model repo has many files or nested directories you may miss entries unless you either handle pagination or recurse over directories returned as type == 'directory'.
  • In _fetch_safetensors_metadata, you store oid for each file but then recompute checksums via fetch_checksum_from_api; consider either using oid directly where appropriate or clarifying why both are needed to avoid confusion and redundant API calls.
  • For safetensors_index_file in get_file_list, the hash is computed with generate_sha256(f"{org}/{name}/{file}"), which is not the file’s content hash; if this hash is used for integrity checks it should come from the same checksum mechanism as other files or explicitly disable checksum verification for the index file.
## Individual Comments

### Comment 1
<location> `ramalama/transports/huggingface.py:79-88` </location>
<code_context>
         return json.loads(repo_manifest)


+def fetch_repo_files(repo_name: str, revision: str = "main"):
+    """Fetch the list of files in a HuggingFace repository using the Files API."""
+    token = huggingface_token()
+    api_url = f"https://huggingface.co/api/models/{repo_name}/tree/{revision}"
+    logger.debug(f"Fetching repo files from {api_url}")
+
+    request = urllib.request.Request(
+        url=api_url,
+        headers={
+            'Accept': 'application/json',
+        },
+    )
+    if token is not None:
+        request.add_header('Authorization', f"Bearer {token}")
+
+    with urllib.request.urlopen(request) as response:
+        files_data = response.read().decode('utf-8')
+        return json.loads(files_data)
+
+
</code_context>

<issue_to_address>
**issue:** The HuggingFace Files API call does not handle pagination, which can truncate results for large repos.

This endpoint paginates results, so this implementation will only ever see the first page and may miss `.safetensors` or other files in larger models. Please either support pagination (e.g., following any `cursor`/`next` fields) or clearly document that this helper is only safe for small repos and enforce that assumption in callers.
</issue_to_address>
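A generic way to satisfy this, sketched under stated assumptions: drive the request loop from a "next page" value and stop when none is returned. How the HF `/tree` endpoint actually signals the next page (reportedly a `cursor` query parameter surfaced via a `Link: <…>; rel="next"` response header) should be verified against the Hub API docs; the loop below only assumes a fetcher that returns `(items, next_url_or_None)`:

```python
def fetch_all_pages(fetch_page, first_url):
    """Accumulate items across paginated responses.

    `fetch_page(url)` must return a tuple (items, next_url), where
    next_url is None when there are no further pages. For the HF tree
    endpoint, a real fetch_page would extract the next URL from the
    response's Link header (an assumption to confirm against the docs).
    """
    items, url = [], first_url
    while url is not None:
        page, url = fetch_page(url)
        items.extend(page)
    return items
```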

### Comment 2
<location> `ramalama/transports/huggingface.py:125` </location>
<code_context>
+            # No ggufFile in manifest
+            return False
+
+    def _fetch_safetensors_metadata(self):
+        """Fetch metadata for safetensors models from HuggingFace API."""
+        repo_name = f"{self.organization}/{self.name}"
</code_context>

<issue_to_address>
**issue (complexity):** Consider simplifying `_fetch_safetensors_metadata` and related helpers by using simpler data structures, an ignore-file helper, and a unified checksum hook to make the HuggingFace metadata flow easier to follow.

You can keep the new behavior but trim some of the new structure to make the code easier to follow.

### 1. Avoid dicts for `safetensors_files` / `other_files`

Right now `_fetch_safetensors_metadata` builds lists of dicts with `{'filename': ..., 'oid': ...}`, but only `filename` is actually used later (for sorting and choosing the main file). You can simplify this to lists of strings and keep `oid` out of the hot path until it’s really needed.

```python
def _fetch_safetensors_metadata(self):
    repo_name = f"{self.organization}/{self.name}"
    try:
        files = fetch_repo_files(repo_name)
    except (urllib.error.HTTPError, json.JSONDecodeError, KeyError) as e:
        logger.debug(f'fetch_repo_files failed: {e}')
        return False

    self.safetensors_files: list[str] = []
    self.other_files: list[str] = []
    index_file: str | None = None

    for file_info in files:
        if file_info.get('type') != 'file':
            continue

        path = file_info['path']
        logger.debug(f"Examining file {path}")

        if fnmatch.fnmatch(path, "*.safetensors"):
            self.safetensors_files.append(path)
        elif fnmatch.fnmatch(path, "*.safetensors.index.json"):
            index_file = path
        elif path in {".gitattributes", ".gitignore", "README.md"}:
            continue
        elif path in {
            self.FILE_NAME_CONFIG,
            self.FILE_NAME_GENERATION_CONFIG,
            self.FILE_NAME_TOKENIZER_CONFIG,
        }:
            continue
        elif fnmatch.fnmatch(path, "*.txt"):
            continue
        else:
            self.other_files.append(path)

    if not self.safetensors_files:
        logger.debug("No safetensors files found")
        return False

    # Sort by name for consistent ordering.
    self.safetensors_files.sort()

    self.model_filename = self.safetensors_files[0]
    repo_path = f"{self.organization}/{self.name}"
    self.model_hash = f"sha256:{fetch_checksum_from_api(repo_path, self.model_filename)}"

    self.additional_safetensor_files = self.safetensors_files[1:]
    self.safetensors_index_file = index_file

    return True
```

If at some point you really need `oid`, you can add a separate `self.file_oids: dict[str, str]` mapping keyed by filename instead of pushing `oid` into every list element.

### 2. Extract a small helper for “ignored” files

The sequence of `elif` branches for “ignore this file” is what makes the loop visually long. A small helper keeps the main logic easy to scan:

```python
def _is_ignored_file(self, path: str) -> bool:
    if path in {".gitattributes", ".gitignore", "README.md"}:
        return True
    if path in {
        self.FILE_NAME_CONFIG,
        self.FILE_NAME_GENERATION_CONFIG,
        self.FILE_NAME_TOKENIZER_CONFIG,
    }:
        return True
    if fnmatch.fnmatch(path, "*.txt"):
        return True
    return False
```

Then the loop becomes more linear:

```python
for file_info in files:
    if file_info.get('type') != 'file':
        continue

    path = file_info['path']
    logger.debug(f"Examining file {path}")

    if self._is_ignored_file(path):
        continue
    if fnmatch.fnmatch(path, "*.safetensors"):
        self.safetensors_files.append(path)
    elif fnmatch.fnmatch(path, "*.safetensors.index.json"):
        index_file = path
    else:
        self.other_files.append(path)
```

Same behavior, but the classification logic is easier to read and adjust.

### 3. Unify the checksum hook usage

You already have `_fetch_checksum_for_file` in the base `HFStyleRepository` (called from `get_file_list`) and another one here on the subclass. To lower mental overhead, you can concentrate all HF-specific checksum behavior in a single override, and let `fetch_metadata` reuse it.

For example, if the base class calls `_fetch_checksum_for_file(self, filename)`:

```python
class HuggingfaceRepository(HFStyleRepository):
    ...

    def _fetch_checksum_for_file(self, filename: str) -> str:
        # Skip JSON files during checksum fetch
        if fnmatch.fnmatch(filename, "*.json"):
            return ""
        repo_path = f"{self.organization}/{self.name}"
        return fetch_checksum_from_api(repo_path, filename)
```

Then in `_fetch_safetensors_metadata` you can rely on the same hook instead of calling `fetch_checksum_from_api` directly:

```python
self.model_hash = f"sha256:{self._fetch_checksum_for_file(self.model_filename)}"
```

That way, the base class remains the single place that “knows” when/where checksums are needed, and the subclass only defines how to get one, without duplicating extra rules in multiple methods.
</issue_to_address>

### Comment 3
<location> `ramalama/hf_style_repo_base.py:84` </location>
<code_context>
         self.model_hash = None
         self.mmproj_filename = None
         self.mmproj_hash = None
+        self.other_files: list[dict] = []
+        self.additional_safetensor_files: list[dict] = []
+        self.safetensors_index_file = None
</code_context>

<issue_to_address>
**issue (complexity):** Consider unifying extra file metadata and SnapshotFile construction so get_file_list becomes a simpler, declarative description of what to download rather than how to build each entry.

You can reduce the added complexity by (1) consolidating file state into a single structure and (2) centralizing `SnapshotFile` construction and checksum handling.

### 1. Consolidate `other_files` / `additional_safetensor_files` / `safetensors_index_file`

Instead of three separate attributes with slightly different semantics, use a single list of “extra” files with metadata. This keeps state in one place and makes `get_file_list` simpler:

```python
from dataclasses import dataclass
from enum import Enum

class ExtraFileKind(Enum):
    SAFETENSOR = "safetensor"
    OTHER = "other"
    INDEX = "index"

@dataclass
class ExtraFile:
    filename: str
    kind: ExtraFileKind
    required: bool = True
    should_verify_checksum: bool = True
    should_show_progress: bool = True

class BaseModel(ABC):
    def __init__(self, name: str, organization: str, tag: str = 'latest'):
        ...
        self.extra_files: list[ExtraFile] = []
        self.fetch_metadata()
```

Your existing `additional_safetensor_files`, `other_files`, and `safetensors_index_file` assignments in `fetch_metadata()` can be converted into `ExtraFile` instances and appended to `self.extra_files` instead of populating multiple attributes.

### 2. Extract `SnapshotFile` construction into a helper

`get_file_list` currently has multiple, almost-identical `SnapshotFile(...)` constructions. Pull that into a single helper that also standardizes checksum handling:

```python
def _snapshot_file_for(
    self,
    filename: str,
    file_type: SnapshotFileType,
    required: bool = True,
    should_show_progress: bool = True,
    should_verify_checksum: bool = True,
    use_custom_checksum: bool = True,
) -> SnapshotFile:
    checksum = ""
    if should_verify_checksum and use_custom_checksum:
        checksum = self._fetch_checksum_for_file(filename)
    elif should_verify_checksum and not use_custom_checksum:
        checksum = generate_sha256(filename)

    return SnapshotFile(
        url=f"{self.blob_url}/{filename}",
        header=self.headers,
        hash=f"sha256:{checksum}" if checksum else "",
        type=file_type,
        name=filename,
        required=required,
        should_show_progress=should_show_progress,
        should_verify_checksum=bool(checksum) and should_verify_checksum,
    )
```

Then `get_file_list` can be simplified and made more uniform:

```python
def get_file_list(self, cached_files: list[str]) -> list[SnapshotFile]:
    files: list[SnapshotFile] = []

    if self.model_filename not in cached_files:
        files.append(self.model_file())

    # existing mmproj / configs unchanged...

    for extra in self.extra_files:
        if extra.filename in cached_files:
            continue

        if extra.kind is ExtraFileKind.SAFETENSOR:
            file_type = SnapshotFileType.SafetensorModel
            use_custom_checksum = True
        else:
            file_type = SnapshotFileType.Other
            use_custom_checksum = extra.kind is not ExtraFileKind.INDEX

        files.append(
            self._snapshot_file_for(
                filename=extra.filename,
                file_type=file_type,
                required=extra.required,
                should_show_progress=extra.should_show_progress,
                should_verify_checksum=extra.should_verify_checksum,
                use_custom_checksum=use_custom_checksum,
            )
        )

    return files
```

This removes the separate loops for `additional_safetensor_files`, `other_files`, and the index file, and it keeps the logic declarative: `extra_files` describe *what* to download; `_snapshot_file_for` describes *how*.

### 3. Normalize checksum handling

Instead of:

- `_fetch_checksum_for_file` used only for some files, and
- `self.model_hash` being set directly in subclasses, and
- `generate_sha256(f".../{index_file}")` as a special case,

route everything through `_fetch_checksum_for_file` and a single code path.

```python
def _fetch_checksum_for_file(self, filename: str) -> str:
    """Return checksum (without 'sha256:' prefix) for a given filename.
    Base impl returns empty string (no checksum). Subclasses override.
    """
    return ""
```

Then update `model_file` to use the same hook:

```python
def model_file(self) -> SnapshotFile:
    assert self.model_filename

    file_type = (
        SnapshotFileType.SafetensorModel
        if fnmatch.fnmatch(self.model_filename, "*.safetensors")
        else SnapshotFileType.GGUFModel
    )

    checksum = self._fetch_checksum_for_file(self.model_filename)
    hash_value = f"sha256:{checksum}" if checksum else self.model_hash

    return SnapshotFile(
        url=f"{self.blob_url}/{self.model_filename}",
        header=self.headers,
        hash=hash_value,
        type=file_type,
        name=self.model_filename,
        should_show_progress=True,
        should_verify_checksum=bool(hash_value),
    )
```

And for index files, make the “no real checksum” explicit instead of generating a path-based hash:

```python
# when constructing ExtraFile for the index
self.extra_files.append(
    ExtraFile(
        filename=self.safetensors_index_file,
        kind=ExtraFileKind.INDEX,
        required=False,
        should_verify_checksum=False,
        should_show_progress=False,
    )
)
```

With the above changes:

- State is centralized in `extra_files` rather than three attributes.
- `get_file_list` is shorter and reads as “what files we want” instead of “how to build each snapshot.”
- All checksums go through `_fetch_checksum_for_file` or are explicitly marked as “no checksum,” eliminating special-case hashing for the index file.
</issue_to_address>

### Comment 4
<location> `ramalama/hf_style_repo_base.py:162-174` </location>
<code_context>
        if self.safetensors_index_file:
            if self.safetensors_index_file not in cached_files:
                logger.debug(f"Adding safetensors index file: {self.safetensors_index_file}")
                files.append(
                    SnapshotFile(
                        url=f"{self.blob_url}/{self.safetensors_index_file}",
                        header=self.headers,
                        hash=generate_sha256(f"{self.organization}/{self.name}/{self.safetensors_index_file}"),
                        type=SnapshotFileType.Other,
                        name=self.safetensors_index_file,
                        required=False,
                    )
                )

</code_context>

<issue_to_address>
**suggestion (code-quality):** Merge nested if conditions ([`merge-nested-ifs`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/Python/Default-Rules/merge-nested-ifs))

```suggestion
        if self.safetensors_index_file and self.safetensors_index_file not in cached_files:
            logger.debug(f"Adding safetensors index file: {self.safetensors_index_file}")
            files.append(
                SnapshotFile(
                    url=f"{self.blob_url}/{self.safetensors_index_file}",
                    header=self.headers,
                    hash=generate_sha256(f"{self.organization}/{self.name}/{self.safetensors_index_file}"),
                    type=SnapshotFileType.Other,
                    name=self.safetensors_index_file,
                    required=False,
                )
            )

```

<br/><details><summary>Explanation</summary>Too much nesting can make code difficult to understand, and this is especially
true in Python, where there are no brackets to help out with the delineation of
different nesting levels.

Reading deeply nested code is confusing, since you have to keep track of which
conditions relate to which levels. We therefore strive to reduce nesting where
possible, and the situation where two `if` conditions can be combined using
`and` is an easy win.
</details>
</issue_to_address>



@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request is a great step forward in improving how models are downloaded from Hugging Face, by moving from the CLI to direct HTTPS requests via the API. This removes an external dependency and adds support for safetensors models, which is a significant enhancement.

My review focuses on a few areas to improve robustness and clarity. I've found a critical issue related to hash generation that could lead to cache corruption, and I've provided a detailed suggestion to fix it. I also have a couple of smaller suggestions regarding type hinting and file filtering logic to make the code more maintainable and prevent potential issues.

Overall, the changes are well-structured and address an important feature. With a few adjustments, this will be a solid contribution.

Comment on lines 144 to 165

```python
if filename not in cached_files:
    logger.debug(f"Adding other safetensors file: {filename}")
    files.append(
        SnapshotFile(
            url=f"{self.blob_url}/{filename}",
            header=self.headers,
            hash=f"sha256:{self._fetch_checksum_for_file(filename)}",
            type=SnapshotFileType.Other,
            name=filename,
            should_show_progress=False,
            should_verify_checksum=False,
        )
    )
```
Contributor
critical

There's a critical bug here that can lead to file corruption in the cache. When _fetch_checksum_for_file(filename) returns an empty string (e.g., for JSON files), the hash becomes "sha256:". This causes all such files to collide and overwrite each other in the blob store.

Additionally, the log message Adding other safetensors file is misleading as these files aren't necessarily safetensors.

To fix this, you should generate a unique hash for files without an API-provided checksum (similar to config files) and correct the log message.

Suggested change:

```python
if filename not in cached_files:
    logger.debug(f"Adding other file: {filename}")
    checksum = self._fetch_checksum_for_file(filename)
    file_hash = f"sha256:{checksum}" if checksum else generate_sha256(f"{self.organization}/{self.name}/{filename}")
    files.append(
        SnapshotFile(
            url=f"{self.blob_url}/{filename}",
            header=self.headers,
            hash=file_hash,
            type=SnapshotFileType.Other,
            name=filename,
            should_show_progress=False,
            should_verify_checksum=False,
        )
    )
```

Collaborator Author

The whole checksum system needs to be updated. Files that are lfs blobs should use the blob sha signature. Other files should either hash the contents, or hash the oid, or use the oid directly (which would require relaxing the restriction that hash has to be 256 bits).
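One way to sketch that scheme, assuming the tree entries follow the public Hugging Face Files API shape (`oid` for the git blob sha1, `lfs.oid` for the sha256 of LFS content); the helper name and the path-plus-oid key are illustrative, not the project's actual code:

```python
import hashlib

def hash_for_entry(entry: dict, repo: str) -> str:
    """Pick a stable 256-bit hash for a repo file (hypothetical helper).

    LFS entries already carry a sha256 of the content; small non-LFS
    files only expose a git blob sha1 "oid", so hash it together with
    the path to get a unique, collision-free cache key.
    """
    lfs = entry.get("lfs") or {}
    if lfs.get("oid"):
        return f"sha256:{lfs['oid']}"
    key = f"{repo}/{entry['path']}/{entry.get('oid', '')}".encode()
    return f"sha256:{hashlib.sha256(key).hexdigest()}"
```

This keeps the existing 256-bit restriction while never producing a bare `"sha256:"` that would make distinct files collide in the blob store.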

Member

@jwieleRH Isn't the oid provided in the response when fetching the file list? Can't we use that one instead of re-fetching?

Member

ping @jwieleRH

self.mmproj_hash = None
self.other_files: list[dict] = []
self.additional_safetensor_files: list[dict] = []
self.safetensors_index_file = None
Contributor

medium

For better type safety and readability, consider adding a type hint for safetensors_index_file. Based on its usage, str | None would be appropriate.

Suggested change
self.safetensors_index_file = None
self.safetensors_index_file: str | None = None

Collaborator Author

Type hints are being gradually applied everywhere.

Comment on lines 159 to 160
elif fnmatch.fnmatch(path, '*.txt'):
    continue
Contributor

medium

The logic to unconditionally skip all *.txt files seems a bit broad. While this might be intended to skip LICENSE or other text files, it could also unintentionally skip important files like requirements.txt which might be necessary for some models. Could you clarify the reasoning for this or perhaps make the filtering more specific to the files you intend to exclude?

Collaborator Author

This would be addressed by fixing the checksum system.

@rhatdan
Copy link
Member

rhatdan commented Dec 8, 2025

@olliewalsh @engelmi PTAL

Member

@engelmi engelmi left a comment

A few comments, but overall looks good. Nice work!

path = file_info['path']
logger.debug(f"Examining file {path}")

if fnmatch.fnmatch(path, '*.safetensors'):
Member

A simple path.endswith(".safetensors") seems sufficient and would remove the dependency on fnmatch (which I am not sure is usable on Windows?)

Collaborator Author

Personally, I would rather use fnmatch or os.path.splitext, both of which are explicitly about path names and work fine on Windows, while endswith is about strings. As a practical matter, splitext may for all I know be implemented by an implicit endswith. So I don't feel strongly about it.

Member

fnmatch isn't path-aware, though, and on Windows there are some differences such as case-insensitivity and the path separators \ instead of /.
Was talking with @telemaco about using pathlib.Path everywhere so we have the semantics of paths everywhere - currently it's pretty mixed. In this case it would be a simple p.suffix.lower() == ".safetensors"

Collaborator

In fact you could use path.glob("*.safetensors"). The .glob() method is supported by pathlib.
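A minimal sketch of the pathlib check discussed in this thread; `PurePosixPath` is used because repo paths returned by the Files API always use forward slashes, so the check behaves identically on Windows (the function name is illustrative):

```python
from pathlib import PurePosixPath

def is_safetensors(path: str) -> bool:
    # Repo paths from the API use "/", so PurePosixPath gives the same
    # semantics on every platform; the suffix comparison is made
    # case-insensitive explicitly via lower().
    return PurePosixPath(path).suffix.lower() == ".safetensors"
```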

Comment on lines 144 to 165
if filename not in cached_files:
    logger.debug(f"Adding other safetensors file: {filename}")
    files.append(
        SnapshotFile(
            url=f"{self.blob_url}/{filename}",
            header=self.headers,
            hash=f"sha256:{self._fetch_checksum_for_file(filename)}",
            type=SnapshotFileType.Other,
            name=filename,
            should_show_progress=False,
            should_verify_checksum=False,
        )
    )
Member

@jwieleRH Isn't the oid provided in the response when fetching the file list? Can't we use that one instead of re-fetching?

Comment on lines +193 to +201
# Determine file type based on extension
if fnmatch.fnmatch(self.model_filename, '*.safetensors'):
    file_type = SnapshotFileType.SafetensorModel
else:
    file_type = SnapshotFileType.GGUFModel
Member

NIT: I'd remove the else and just default to gguf:

Suggested change
# Determine file type based on extension
if fnmatch.fnmatch(self.model_filename, '*.safetensors'):
    file_type = SnapshotFileType.SafetensorModel
else:
    file_type = SnapshotFileType.GGUFModel
# Determine file type based on extension
file_type = SnapshotFileType.GGUFModel
if fnmatch.fnmatch(self.model_filename, '*.safetensors'):
    file_type = SnapshotFileType.SafetensorModel

Collaborator Author

That makes a certain amount of sense because we do look for GGUF first.
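The default-first shape agreed on above can be sketched self-contained; the enum here is a stand-in for the project's own `SnapshotFileType`, and the function name is illustrative:

```python
from enum import Enum, auto

class SnapshotFileType(Enum):  # stand-in for the project's enum
    GGUFModel = auto()
    SafetensorModel = auto()

def model_file_type(filename: str) -> SnapshotFileType:
    # Default to GGUF, since the GGUF manifest is probed first;
    # only a .safetensors extension overrides the default.
    file_type = SnapshotFileType.GGUFModel
    if filename.lower().endswith(".safetensors"):
        file_type = SnapshotFileType.SafetensorModel
    return file_type
```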

@mikebonnet
Copy link
Collaborator

/ok-to-test

Refactor the fetch_metadata method to attempt to fetch GGUF metadata
from a manifest, and then to attempt to fetch safetensors metadata
from the repo files.

To fetch a safetensors model, download the set of files from the repo
with https requests. Add an "other_files" category for safetensors
files that are neither JSON config files nor .safetensors model files,
such as the tokenizer.model file.

Remove code to download models from huggingface.co with the
huggingface cli. It didn't work correctly anyway. Stub out the
_collect_cli_files and in_existing_cache methods for the huggingface
transport.

Enable bats test of downloading safetensors models from
huggingface. Remove bats test of downloading models with the
huggingface cli. Adjust test expectation for split safetensors model.

Fixes: containers#1493

Signed-off-by: John Wiele <jwiele@redhat.com>
@jwieleRH
Copy link
Collaborator Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request effectively refactors the Hugging Face transport to support downloading safetensors models via HTTPS, which is a great improvement. The fallback mechanism from GGUF manifest to safetensors file discovery is well-implemented. The removal of the broken huggingface-cli support cleans up the codebase. My main feedback is a suggestion to improve the consistency of handling the safetensors index file by not discarding its oid, which would align its processing with that of other repository files.

Development

Successfully merging this pull request may close these issues.

Skip huggingface-cli fallback for llama-server style huggingface uris

5 participants