Comparing changes

base repository: huggingface/tokenizers
base: v0.21.4
...
head repository: huggingface/tokenizers
compare: v0.22.1
  • 20 commits
  • 26 files changed
  • 14 contributors

Commits on Jul 22, 2025

  1. Bump on-headers and compression (#1827)

    ---
    updated-dependencies:
    - dependency-name: on-headers
      dependency-version: 1.1.0
      dependency-type: indirect
    - dependency-name: compression
      dependency-version: 1.8.1
      dependency-type: indirect
    ...
    
    Signed-off-by: dependabot[bot] <support@github.com>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored Jul 22, 2025
    9164247

Commits on Jul 29, 2025

  1. Implement from_bytes and read_bytes Methods in WordPiece Tokenizer for WebAssembly Compatibility (#1758)

    * Add from_bytes and read_bytes method to WordPiece
    
    * Change wordpiece  method return value
    
    ---------
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    sondalex and ArthurZucker authored Jul 29, 2025
    ed2cda5

Commits on Aug 27, 2025

  1. 95b882a

Commits on Aug 29, 2025

  1. New stream (#1856)

    * update
    
    * update
    
    * updates
    
    * up
    
    * okay
    
    * use stream input
    
    * nice all test pass?
    
    * fmt
    
    * dev
    
    * rename
    
    * simplify a hell lot
    
    * proper testing
    
    * fix init
    
    * fix test
    
    * nits
    
    * make clippy happy now
    
    * fmt fml
    
    * remove the prints
    
    * fix gate
    ArthurZucker authored Aug 29, 2025
    abee958
  2. 49a0907
  3. b0464b2
  4. Update quicktour.mdx re: Issue #1625 (#1846)

    Update broken wikitext-103 and tokenizers-pipeline links
    WilliamPLaCroix authored Aug 29, 2025
    ec54228
  5. c5eb93f
  6. Fix typo in README (#1808)

    aisk authored Aug 29, 2025
    7c01a4f
  7. RUSTSEC-2024-0436 - replace paste with pastey (#1834)

    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    nystromjd and ArthurZucker authored Aug 29, 2025
    b43d8d7
  8. Tokenizer: Add native async bindings, via pyo3-async-runtimes. (#1843)

    * add async bindings
    
    * update based on review!
    
    * use hf internal testing for testing
    
    * reduce burden for the CI
    
    * async is not necessarily fast
    
    * remove comments
    
    ---------
    
    Co-authored-by: Arthur <arthur.zucker@gmail.com>
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    3 people authored Aug 29, 2025
    bd1149c
  9. 9bafd82
  10. da1cc3b

Commits on Aug 30, 2025

  1. 57eb8d7
  2. Nit rendering the doc

    ArthurZucker authored Aug 30, 2025
    7b02178
  3. c91d76a

Commits on Sep 16, 2025

  1. chore(trainer): add and improve trainer signature (#1838)

    * chore(trainers): add __init__ to fix python type check errors
    
    * restore
    
    * chore(trainer): add and improve trainer signature
    
    * clean fix
    
    * chore(fmt): fix cargo fmt error
    
    ---------
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    shenxiangzhuang and ArthurZucker authored Sep 16, 2025
    c0d3697

Commits on Sep 19, 2025

  1. Bump huggingface_hub upper version (#1866)

    * Test hfh 1.0 rc0
    
    * Tokenizers works on both 0.x and 1.x versions
    Wauplin authored Sep 19, 2025
    972e7fc
  2. 6cbd461
  3. push the minor

    ArthurZucker committed Sep 19, 2025
    afaae08