Newest updates break DaskOfflineStore with S3 parquets #4753
Open
Description
Expected Behavior
In version 0.40.1 the Dask Offline store was able to read the data_source.path directly from the FileSource and retrieve the data from S3 using a path like: s3://<your-bucket>/<file-name>
Current Behavior
Data retrieval now fails because the repo_path is prepended to the S3 URL.
Example:
/tmp/feast:s3//<your-bucket>/<file-name>
I believe this is caused by a recent change, #4624, which no longer treats the S3 URL as an absolute path.
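To illustrate the suspected mechanism, here is a minimal sketch using only the standard library. It assumes the change decides whether to prepend repo_path with a pathlib-style absolute-path check; that is a guess based on the symptom, not a confirmed reading of the Feast code. The bucket and file names are placeholders, and the exact mangled string Feast produces may differ from what this prints.

```python
from pathlib import PurePosixPath

# Placeholder values for illustration only.
repo_path = PurePosixPath("/tmp/feast")
source_path = "s3://my-bucket/data.parquet"

# pathlib does not understand URL schemes, so an S3 URL is not "absolute".
is_abs = PurePosixPath(source_path).is_absolute()

# Joining it onto the repo path yields a local path that does not exist,
# and the "//" after the scheme is collapsed along the way.
joined = str(repo_path / source_path)

print(is_abs)
print(joined)
```

If the path handling works roughly like this, any remote URI (s3://, gs://, and so on) would be treated as relative to the repo.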
Steps to reproduce
- Rebuilt my environment with the latest tagged version, 0.41.3
- Reran my get_historical_features call; it hung for a while, then errored because the file path does not exist
Specifications
- Version: 0.41.3
- Platform: Linux
- Subsystem: Debian
Possible Solution
- Revert that change, or add a flag that bypasses the breaking behavior
- If storage_options is not None, read the parquet file directly from data_source.path
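The second option could look roughly like the sketch below. The helper name resolve_source_path and its signature are hypothetical and are not part of Feast; it only shows the idea of leaving a path untouched when storage_options is set or when the path carries a URL scheme, and resolving against repo_path otherwise.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse


def resolve_source_path(
    path: str,
    repo_path: Optional[str] = None,
    storage_options: Optional[dict] = None,
) -> str:
    """Hypothetical helper: decide which path to hand to the parquet reader."""
    # Explicit storage options imply a remote filesystem: use the path as given.
    if storage_options is not None:
        return path
    # A scheme such as "s3" or "gs" marks a remote URI. The length check
    # avoids mistaking a Windows drive letter ("C:") for a scheme.
    scheme = urlparse(path).scheme
    if len(scheme) > 1:
        return path
    local = PurePosixPath(path)
    # Absolute local paths, or a missing repo path, need no resolution.
    if local.is_absolute() or repo_path is None:
        return str(local)
    # Only relative local paths are resolved against the repo.
    return str(PurePosixPath(repo_path) / local)


print(resolve_source_path("s3://my-bucket/data.parquet", "/tmp/feast"))
print(resolve_source_path("data/driver.parquet", "/tmp/feast"))
```

With a check like this, relative local paths would still be resolved against the repo, while S3 URLs would reach the reader unchanged.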