- Updated Tokenizers to 0.21.1
- Updated Tokenizers to 0.21.0
- Added support for Ruby 3.4
- Added `AddedToken` class
- Added precompiled gem for Windows
- Added `from_str` method to `Tokenizer`
- Added `model` and `model=` methods to `Tokenizer`
- Added `decoder`, `pre_tokenizer`, `post_processor`, and `normalizer` methods to `Tokenizer`
- Added `decode` method to `Decoder`
- Updated Tokenizers to 0.20.0
- Added precompiled gem for Linux ARM MUSL
- Updated Tokenizers to 0.19.1
- Replaced `add_prefix_space` with `prepend_scheme` and `split` options for `Metaspace` decoder and pre-tokenizer
- Dropped support for Ruby < 3.1
- Updated Tokenizers to 0.15.2
- Added support for Ruby 3.3
- Updated Tokenizers to 0.15.0
- Fixed issue with download caching
- Fixed error loading gem
- Updated Tokenizers to 0.14.0
- Dropped support for Ruby < 3
- Updated Tokenizers to 0.13.3
- Added `ByteFallback`, `Fuse`, `Replace`, and `Strip` decoders
- Added `Prepend` normalizer
- Added precompiled gem for Linux x86-64 MUSL
- Fixed error with Ruby 2.7
- Added support for training tokenizers
- Added more methods to `Tokenizer`
- Added `encode_batch` method to `Encoding`
- Added `pair` argument to `encode` method
- Changed `encode` method to include special tokens by default
- Changed how offsets are calculated for strings with multibyte characters
- Added `add_special_tokens` option to `encode` method
- Added warning about `encode` method including special tokens by default in 0.3.0
- Added more methods to `Encoding`
- Fixed error with precompiled gem on Mac ARM
- Added precompiled gem for Linux ARM
- Added `from_file` method
- Fixed error with precompiled gem on Linux x86-64
- Added support for Ruby 3.2
- Added precompiled gems for Linux x86-64 and Mac
- Switched to `rb_sys` gem for building extension
- Updated Tokenizers to 0.13.2
- Updated Rust edition to 2021
- Updated Tokenizers to 0.13.1
- Fixed error with installation on Linux
- Fixed error with installation
- First release