xet: Commit operation to edit part of file (optimized for handling edits at beginning of file) #1718
Merged
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
coyotte508 added a commit to huggingface/xet-core that referenced this pull request on Sep 6, 2025
…481)

Related to huggingface/huggingface.js#1718

We'll want to edit parts of a file while loading the old data's dedup info. In those cases we don't always want to load dedup info for the first chunk (since it may not be at the beginning of the file), so the `is_dedup = true` for the first chunk is handled client side.
mishig25 added a commit that referenced this pull request on Sep 11, 2025
### Description

This PR introduces functions that serialize GGUF metadata/header into a `Uint8Array` so that I can use #1718 to update GGUF metadata on hf.co

* `serializeTypedMetadata()` - Serialize GGUF metadata to binary format
* `serializeGgufHeader()` - Create complete GGUF headers with metadata + tensor info + alignment
* Enhanced `gguf()` function - Now returns `littleEndian` property for endianness detection

### Usage example

```ts
// Edit first kB of file
await commit({
  repo,
  accessToken: "hf_...",
  operations: [{
    type: "edit",
    // Blob parts must be passed as an array
    originalContent: new Blob([/* original gguf header */]),
    edits: [{
      start: 0,
      end: 1000,
      content: new Blob([serializeGgufHeader(/* new gguf header with updated metadata */)])
    }]
  }]
})
```
mishig25 added a commit that referenced this pull request on Sep 13, 2025
…oken and related functions for better handling of pull requests (#1746)

I was getting an error when trying to create a pull request on repos that I do **not** own:

```
Error: Forbidden: pass `create_pr=1` as a query parameter to create a Pull Request. URL: https://huggingface.co/api/models/reach-vb/TinyLlama-1.1B-Chat-v1.0-q4_k_m-GGUF/xet-write-token/main
```

I was using the new API from #1718

```ts
// Edit first kB of file
await commit({
  repo,
  accessToken: "hf_...",
  operations: [{
    type: "edit",
    originalContent: originalFile,
    edits: [{
      start: 0,
      end: 1000,
      content: new Blob(["blablabla"])
    }]
  }]
})
```

Let me know if this PR is the right way to handle this issue
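To make the `start` / `end` / `content` fields in the usage examples above concrete, here is a minimal local sketch of what such an edit could mean for the resulting file. It assumes each edit replaces the byte range `[start, end)` of the original with `content`, and that edits do not overlap; the `Edit` type and `applyEdits` helper are hypothetical names for illustration, not part of the library, and this is not how the commit itself is performed.

```ts
// Hypothetical shape mirroring the fields used in the "edit" operation examples.
interface Edit {
  start: number; // first byte to replace (inclusive), assumed semantics
  end: number; // end of the replaced range (exclusive), assumed semantics
  content: Blob; // replacement bytes; may differ in length from the range
}

// Build the edited file locally by splicing replacement blobs into the original.
// Blob.slice() does not copy data eagerly, so untouched regions stay cheap.
function applyEdits(original: Blob, edits: Edit[]): Blob {
  const sorted = edits.slice().sort((a, b) => a.start - b.start);
  const parts: Blob[] = [];
  let cursor = 0;
  for (const edit of sorted) {
    if (edit.start < cursor || edit.end < edit.start || edit.end > original.size) {
      throw new RangeError("edits must be in range and must not overlap");
    }
    parts.push(original.slice(cursor, edit.start)); // untouched bytes before the edit
    parts.push(edit.content); // replacement bytes
    cursor = edit.end;
  }
  parts.push(original.slice(cursor)); // untouched tail of the file
  return new Blob(parts);
}
```

Under these assumptions, replacing the first 5 bytes of an 11-byte blob with 6 new bytes yields a 12-byte blob, with everything after byte 5 left as it was.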
Fix #1705
cc @mishig25 @assafvayner @jsulz
also
How it works under the hood
Todo
Currently the blob is processed twice: once for sha256 and once for hashing. The file should be processed only once (maybe after #1704, using workers for different processes).
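The single-pass idea in the todo above can be sketched as a fan-out: each chunk is read once and handed to every consumer. This is only an illustration under stated assumptions. `processOnce` and `createFnv1a` are hypothetical helpers, the FNV-1a checksum and the byte counter are stand-ins for the real sha256 and hashing steps, and a real implementation would read chunks from the blob's stream rather than from an array.

```ts
type ChunkConsumer = (chunk: Uint8Array) => void;

// Read every chunk exactly once and pass it to all consumers.
function processOnce(chunks: readonly Uint8Array[], consumers: ChunkConsumer[]): void {
  for (const chunk of chunks) {
    for (const consume of consumers) {
      consume(chunk);
    }
  }
}

// Stand-in incremental hasher (32-bit FNV-1a), used here only because it can be
// updated chunk by chunk without extra dependencies.
function createFnv1a() {
  let hash = 0x811c9dc5;
  return {
    update(chunk: Uint8Array): void {
      for (let i = 0; i < chunk.length; i++) {
        hash ^= chunk[i];
        hash = Math.imul(hash, 0x01000193) >>> 0;
      }
    },
    digest(): number {
      return hash >>> 0;
    },
  };
}

// Usage: two consumers share one pass over the data.
const checksum = createFnv1a();
let totalBytes = 0;
processOnce(
  [new Uint8Array([1, 2]), new Uint8Array([3, 4])],
  [(chunk) => checksum.update(chunk), (chunk) => { totalBytes += chunk.length; }]
);
```

Because both consumers are incremental, the result does not depend on how the data is split into chunks, which is what lets a single read serve both steps.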