Releases: Lightning-AI/litdata
v0.2.41
What's Changed
- doc: improve dev doc by @deependujha in #488
- Expose optimize dns by @tchaton in #498
- Update `_get_folder_size`: Reduce Logs noise and switch to `os.scandir` by @bhimrazy in #499
- Bump: version 0.2.41 by @deependujha in #500
Full Changelog: v0.2.40...v0.2.41
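For readers unfamiliar with the `os.scandir` approach mentioned in #499, the following is a minimal sketch of computing a folder's size with it. The function name `folder_size` and its error handling are assumptions made for illustration; this is not litdata's actual `_get_folder_size` implementation.

```python
import os


def folder_size(path: str) -> int:
    """Return the total size in bytes of regular files under `path`.

    Hypothetical helper for illustration only (not litdata's code).
    """
    total = 0
    try:
        with os.scandir(path) as entries:
            for entry in entries:
                try:
                    if entry.is_file(follow_symlinks=False):
                        # DirEntry caches file-type info, so this avoids
                        # extra system calls compared to os.listdir + os.stat.
                        total += entry.stat(follow_symlinks=False).st_size
                    elif entry.is_dir(follow_symlinks=False):
                        total += folder_size(entry.path)
                except FileNotFoundError:
                    # The entry vanished while scanning (e.g. cache eviction);
                    # skip it quietly instead of logging.
                    continue
    except FileNotFoundError:
        # The folder itself does not exist (anymore).
        return 0
    return total
```

Skipping entries that disappear mid-scan is one plausible way to keep logs quiet when a cache directory is modified concurrently; the real implementation may differ.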
v0.2.40
What's Changed
- fix: `clean parquet dir cache` fixture by @deependujha in #474
- Fix: Allow using `Machine` types in `map` by @ethanwharris in #473
- 🛠️ Fix: Ensure `chunk_bytes` in `index.json` matches actual chunk file size by @bhimrazy in #478
- fix: _get_folder_size fn by @deependujha in #471
- Added boolean serialiser called by litdata.optimise() by @DominiquePaul in #481
- Doc: improve dev doc & add ToDos by @deependujha in #479
- upd: Add hf file download progress and update local file path by @bhimrazy in #484
- fix: segmentation fault error in streaming tokens by @bhimrazy in #485
- Warn user if `max_cache_size` is less than 25GB in StreamingDataset by @bhimrazy in #489
- fix: Properly assign the chunks to the right worker by @tchaton in #449
- Bump version to 0.2.40 by @bhimrazy in #491
New Contributors
- @ethanwharris made their first contribution in #473
- @DominiquePaul made their first contribution in #481
Full Changelog: v0.2.39...v0.2.40
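The `chunk_bytes` fix in #478 concerns the consistency between `index.json` and the chunk files on disk. As a rough sketch, such a consistency check could look like the code below. It assumes that `index.json` holds a top-level `"chunks"` list whose entries carry `"filename"` and `"chunk_bytes"` keys; the helper name `find_chunk_size_mismatches` is hypothetical and not part of litdata.

```python
import json
import os


def find_chunk_size_mismatches(dataset_dir: str) -> list:
    """Return (filename, recorded_bytes, actual_bytes) for inconsistent chunks.

    Hypothetical helper; assumes index.json has a "chunks" list whose
    entries contain "filename" and "chunk_bytes".
    """
    with open(os.path.join(dataset_dir, "index.json")) as f:
        index = json.load(f)

    mismatches = []
    for chunk in index.get("chunks", []):
        chunk_path = os.path.join(dataset_dir, chunk["filename"])
        actual = os.path.getsize(chunk_path)  # size on disk, in bytes
        if actual != chunk["chunk_bytes"]:
            mismatches.append((chunk["filename"], chunk["chunk_bytes"], actual))
    return mismatches
```

An empty result means every recorded `chunk_bytes` value matches the size of its file.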
Release 0.2.39
What's Changed
- Feat: add support for HuggingFace datasets by @deependujha in #462
- Using count-locks for multi-node-single-cache support by @JackUrb in #468
- Bump version to 0.2.39 by @tchaton in #470
Full Changelog: v0.2.38...v0.2.39
Release 0.2.38
What's Changed
- Adding upsampling support for StreamingDataset by @JackUrb in #453
- ci: bump utils to latest / `main` by @Borda in #457
- [WIP] Improve debugging of DDP hanging by @tchaton in #456
- Feat: Add support for parquet files by @deependujha in #443
- Nit: specify testpaths for pytest by @deependujha in #464
- Update the S3 client initialization to explicitly use `boto3.Session()` by @bhimrazy in #461
- Add more code owners by @tchaton in #465
- Refactor: Parquet support PR by @deependujha in #460
- Bump version 0.2.38 by @tchaton in #466
New Contributors
Full Changelog: v0.2.37...v0.2.38
Release 0.2.37
What's Changed
Full Changelog: v0.2.36...v0.2.37
Release 0.2.36
Release 0.2.35
What's Changed
- Update(ci): github action's artifact upgrade v4 by @alleeclark in #435
- [pre-commit.ci] pre-commit suggestions by @pre-commit-ci in #444
- Fix: progress bar for merge_datasets by @bhimrazy in #445
- Add support for s3 folders by @tchaton in #447
- Bump to version 0.2.35 by @tchaton in #448
New Contributors
- @alleeclark made their first contribution in #435
- @pre-commit-ci made their first contribution in #444
Full Changelog: v0.2.34...v0.2.35
Release 0.2.34
What's Changed
- Fix the serialization of scalar valued tensors by @enrico-stauss in #431
- Add example on how to filter illegal data by @tchaton in #432
Full Changelog: v0.2.33...v0.2.34
Release 0.2.33
What's Changed
- POC: add tiffile serializer by @robmarkcole in #425
- bump by @robmarkcole in #427
Full Changelog: v0.2.32...v0.2.33
Release 0.2.32
What's Changed
- fix: Add mechanism to inform the user a new version is available by @tchaton in #420
- Version 0.2.32 by @tchaton in #421
Full Changelog: v0.2.31...v0.2.32