_get_files doesn't return files in a deterministic order across OSes #239
Open
Description
_get_files in local.data.transforms.py doesn't return files in a deterministic order across OSes.
This is an issue when getting files, then splitting using a fixed seed. For example, in 08_pets_tutorial.ipynb (I added the seed parameter):
items = get_image_files(source)
split_idx = RandomSplitter(seed=42)(items)
In this case, 2 users on different OSes would have the same split_idx, but different train/validation sets.
It would be straightforward for a user to correct this by sorting items before passing this list into the splitter, but I wouldn't expect that many people would know to do this.
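To illustrate the problem without depending on the library, here is a minimal standard-library sketch. `random_split` and `valid_set` are hypothetical stand-ins written for this example (not the library's `RandomSplitter`), and the two listings simulate the same folder being returned in different orders on two machines. It shows that the seeded indices are identical, but they only select the same files once the list is sorted first.

```python
import random

def random_split(items, valid_pct=0.2, seed=None):
    """Hypothetical stand-in for a seeded splitter: returns (train_idxs, valid_idxs)."""
    rng = random.Random(seed)
    idxs = list(range(len(items)))
    rng.shuffle(idxs)
    cut = int(valid_pct * len(items))
    return idxs[cut:], idxs[:cut]

def valid_set(items, seed=42):
    """Return the file names selected for validation (hypothetical helper)."""
    _, valid_idxs = random_split(items, seed=seed)
    return {items[i] for i in valid_idxs}

# The same files, in the orders two different OSes might return them (made-up names).
listing_a = ["cat_1.jpg", "dog_2.jpg", "cat_3.jpg", "dog_4.jpg", "cat_5.jpg"]
listing_b = list(reversed(listing_a))

# The indices depend only on the seed and the list length, so they match...
same_indices = random_split(listing_a, seed=42) == random_split(listing_b, seed=42)

# ...but they index into differently ordered lists, so the chosen files can differ.
unsorted_match = valid_set(listing_a) == valid_set(listing_b)

# Sorting first makes the selected files independent of the listing order.
sorted_match = valid_set(sorted(listing_a)) == valid_set(sorted(listing_b))

print(same_indices, unsorted_match, sorted_match)
```

In the tutorial snippet above, the equivalent workaround would be to sort items before passing it to the splitter, as described.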