Dynamic shape of labels. #774
Comments
make_reader and make_batch_reader are quite different. Please read more here https://github.com/uber/petastorm/#non-petastorm-parquet-stores and repost if you need further clarification. The future warnings are a known issue. We hope to address them soon.
I went over the documentation. My main question is why a dynamic field size is possible with make_reader but not with make_batch_reader.
If your data has non-uniform size, as you describe, and you use |
Hey,
I'm using petastorm for object detection, and each image may contain a different number of objects.
When I use make_reader and specify a shape of (-1, 5) for the labels inside TransformSpec, everything works fine.
But when I use make_batch_reader, I get an error about the shape. (I tried (None, 5) too but still got an error.)
Is there a way to specify a dynamic size for a field?
And why is there a difference between make_reader and make_batch_reader?
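To make the shape problem concrete: labels with a different row count per image cannot be stacked into one uniform array. One generic workaround, shown here only as a sketch and not as petastorm's recommended approach, is to pad each sample's labels to a fixed maximum number of objects before the samples are combined into a batch. The names `pad_labels`, `max_objects` and `pad_value`, and the 5-value row layout, are assumptions made for illustration.

```python
from typing import List, Sequence


def pad_labels(labels: Sequence[Sequence[float]],
               max_objects: int,
               pad_value: float = -1.0) -> List[List[float]]:
    """Pad (or truncate) per-image labels to exactly max_objects rows of 5 values.

    Hypothetical helper: each input row is assumed to hold 5 numbers
    (for example a class id plus four box coordinates).
    """
    rows = [list(row) for row in labels[:max_objects]]
    for row in rows:
        if len(row) != 5:
            raise ValueError("each label row must have exactly 5 values")
    # Fill the remaining slots with rows made entirely of pad_value.
    rows.extend([pad_value] * 5 for _ in range(max_objects - len(rows)))
    return rows


# Two images with 1 and 3 objects respectively.
image_a = [[1, 0.1, 0.2, 0.3, 0.4]]
image_b = [[0, 0.0, 0.0, 0.5, 0.5],
           [2, 0.2, 0.2, 0.4, 0.4],
           [1, 0.6, 0.6, 0.9, 0.9]]

# After padding, every sample has the same (4, 5) layout and can be batched.
batch = [pad_labels(image_a, max_objects=4), pad_labels(image_b, max_objects=4)]
```

Downstream code then has to skip the padded rows, for example by checking for `pad_value` in the first column.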
Besides this, I'm getting a lot of FutureWarnings about the pyarrow version (I'm working inside a Databricks environment).
Do you know how I can avoid all these warnings?
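Until the deprecated calls are fixed in the library, one possible stopgap is Python's standard `warnings` filter. The sketch below ignores `FutureWarning`s attributed to modules whose name starts with `petastorm.`, which matches the file paths in the example output. The helper name `silence_petastorm_future_warnings` is hypothetical, and the filter only affects the process in which it is installed, so warnings raised in separate worker processes may still appear.

```python
import warnings


def silence_petastorm_future_warnings() -> None:
    """Ignore FutureWarnings whose originating module name starts with 'petastorm.'."""
    # `module` is a regular expression matched against the start of the
    # name of the module the warning is attributed to.
    warnings.filterwarnings(
        "ignore",
        category=FutureWarning,
        module=r"petastorm\.",
    )


# Call once, before creating any readers.
silence_petastorm_future_warnings()
```

This hides the messages without changing the underlying pyarrow deprecations, so it is best removed once the library no longer triggers them.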
Hope you will be able to help me.
If any information is missing let me know.
petastorm version: 0.11.4
Future warnings example:
`
parquet_file = ParquetFile(self._dataset.fs.open(piece.path))
/databricks/python/lib/python3.9/site-packages/petastorm/py_dict_reader_worker.py:180: FutureWarning: 'ParquetDataset.partitions' attribute is deprecated as of pyarrow 5.0.0 and will be removed in a future version. Specify 'use_legacy_dataset=False' while constructing the ParquetDataset, and then use the '.partitioning' attribute instead.
databricks/python/lib/python3.9/site-packages/petastorm/fs_utils.py:88: FutureWarning: pyarrow.localfs is deprecated as of 2.0.0, please use pyarrow.fs.LocalFileSystem instead.
`
Thanks a lot!