Replies: 2 comments
This seems to be an issue with dill. Simply put, caching in huggingface relies on a library that sometimes fails. However, the discussion I found is quite old, and it is hard for me to verify that this is the problem: uqfoundation/dill#19 (comment)
What I did was to check what
datasets/src/datasets/fingerprint.py
Line 188 in 53f958e
as the
What can I do to enforce reproducibility and cache utilization?
I will for the time being use the
I have tried to make a minimal reproduction, but have not really managed. So bear with me for a second.
I have a file foo.py with contents
If I call this a few times with the command line rm -rf ~/.cache/huggingface/datasets/ && python foo.py && python foo.py, the output looks like
so clearly the caching mechanism fails intermittently. The second tqdm progress bar appears only when the map call finds its cached result invalidated. I need to understand why the caching fails.
The contents of lib.render look like
There are a couple of confusing things:
lib/render.py fixes the problem
How can I debug the cache invalidation behavior? Where exactly can I find a description of the caching logic?
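For anyone else debugging this, here is a rough sketch of the general idea behind fingerprint-based caching: the cache key is a hash of the transform, so if the hashed bytes change between runs, the cached result is not reused. This is a toy, not the datasets implementation (which, per the reply above, serializes through dill). The helper toy_fingerprint and the functions add_one and add_two are hypothetical names made up for illustration.

```python
import hashlib


def toy_fingerprint(func):
    """Toy stand-in for a cache fingerprint (hypothetical helper).

    Hashes the function's bytecode, constants and referenced names.
    The real library hashes a dill serialization instead.
    """
    code = func.__code__
    h = hashlib.sha256()
    h.update(code.co_code)
    h.update(repr(code.co_consts).encode())
    h.update(repr(code.co_names).encode())
    return h.hexdigest()[:16]


def add_one(example):
    return {"x": example["x"] + 1}


def add_two(example):
    return {"x": example["x"] + 2}


# Same function hashed twice: the key is stable, so a cache lookup would hit.
fp_first = toy_fingerprint(add_one)
fp_again = toy_fingerprint(add_one)

# A different transform yields a different key, so the cache would miss.
fp_other = toy_fingerprint(add_two)

print(fp_first == fp_again)
print(fp_first == fp_other)
```

To check the real fingerprint rather than the toy one, the datasets documentation describes Hasher.hash from datasets.fingerprint. Printing that hash for the function passed to map in two separate python foo.py runs and comparing the values should show whether the function's fingerprint is what changes between runs.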