I'm the developer of WebDataset, a linearly scalable dataset format with accompanying libraries and a server for PyTorch. WebDataset represents datasets as .tar archives of files on disk and allows access to them from any web server, object store, or cloud storage system. It's all open source, and we have demonstrated I/O speeds of 1 GB/s per GPU.
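To make the "datasets as .tar archives" idea concrete, here is a minimal sketch using only Python's standard `tarfile` module, not the WebDataset library itself. It assumes the convention that consecutive files sharing a basename (for example `000001.jpg` and `000001.cls`) form one training sample. The helper names `build_shard` and `read_shard` are hypothetical and exist only for this illustration.

```python
import io
import tarfile


def build_shard(samples):
    """Pack {key: {extension: bytes}} into an in-memory .tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for key, fields in samples.items():
            for ext, payload in fields.items():
                # One archive member per field, named "<key>.<extension>".
                info = tarfile.TarInfo(name=f"{key}.{ext}")
                info.size = len(payload)
                tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def read_shard(data):
    """Yield (key, {extension: bytes}) while streaming through the archive once."""
    current_key, fields = None, {}
    # Mode "r|" reads the archive strictly sequentially, as one would
    # when the bytes arrive over HTTP or from an object store.
    with tarfile.open(fileobj=io.BytesIO(data), mode="r|") as tar:
        for member in tar:
            if not member.isfile():
                continue
            # Assumed convention: everything before the first dot is the sample key.
            key, _, ext = member.name.partition(".")
            if current_key is not None and key != current_key:
                yield current_key, fields
                fields = {}
            current_key = key
            fields[ext] = tar.extractfile(member).read()
    if current_key is not None:
        yield current_key, fields


# Illustrative data only; the payloads are placeholders, not real images.
samples = {
    "000001": {"jpg": b"fake-image-1", "cls": b"3"},
    "000002": {"jpg": b"fake-image-2", "cls": b"7"},
}
shard = build_shard(samples)
decoded = list(read_shard(shard))
```

Because the reader never seeks backwards, the same loop would work on any byte stream, which is what makes plain tar files a reasonable fit for web servers and object stores.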
The PyTorch implementation is at github.com/tmbdev/webdataset; the server implementation is at github.com/nvidia/aistore.
I have recently implemented a multithreaded loader for Julia that can read the same format. You can find it at github.com/tmbdev/WebDataset.jl.
You might want to add this to the resources, as well as take it into account for DataLoaders.jl and FastAI.jl.
(I work on very large scale machine learning problems, so my next step is to see how I can get multi-GPU and multinode training to work in Julia.)