Description
Is your feature request related to a problem or challenge?
Some public S3 buckets, such as the ClickBench public datasets bucket, do not require authentication.
Other engines like ClickHouse allow you to access these without providing any credentials: https://clickhouse.com/docs/engines/table-engines/integrations/s3
CREATE TABLE s3_engine_table (name String, value UInt32)
ENGINE=S3('s3://clickhouse-public-datasets/hits_compatible/hits.parquet', 'CSV', 'gzip')
However, datafusion-cli requires you to provide credentials in this case:
datafusion-cli
DataFusion CLI v47.0.0
> CREATE EXTERNAL TABLE hits
STORED AS PARQUET LOCATION 's3://clickhouse-public-datasets/hits_compatible/hits.parquet' OPTIONS(aws.region 'eu-west-1');
Object Store error: Generic S3 error: the credential provider was not enabled
Describe the solution you'd like
I would like the ability to access public datasets without providing credentials.
The underlying object_store builder already supports this via the with_skip_signature setting on AmazonS3Builder: https://docs.rs/object_store/0.12.0/object_store/aws/struct.AmazonS3Builder.html#method.with_skip_signature
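To make the request concrete, here is a minimal, std-only sketch of how a CLI could read an `aws.skip_signature` option from the `OPTIONS(...)` clause. The `S3Options` struct and `parse_s3_options` function are hypothetical names invented for illustration and are not part of datafusion-cli. The only real API assumed is `AmazonS3Builder::with_skip_signature`, linked above, which the parsed flag would presumably be forwarded to.

```rust
use std::collections::HashMap;

/// Hypothetical subset of the S3 settings a CLI might collect from `OPTIONS(...)`.
#[derive(Debug, Default, PartialEq)]
pub struct S3Options {
    pub region: Option<String>,
    /// When true, requests would be sent unsigned (anonymous access).
    /// A real implementation would presumably pass this value to
    /// `AmazonS3Builder::with_skip_signature`.
    pub skip_signature: bool,
}

/// Parse `aws.*` key/value pairs into `S3Options`.
/// Keys this sketch does not know about are rejected with an error.
pub fn parse_s3_options(opts: &HashMap<String, String>) -> Result<S3Options, String> {
    let mut parsed = S3Options::default();
    for (key, value) in opts {
        match key.as_str() {
            "aws.region" => parsed.region = Some(value.clone()),
            "aws.skip_signature" => {
                // Accept `true` / `false` in any letter case.
                parsed.skip_signature = value
                    .trim()
                    .to_ascii_lowercase()
                    .parse::<bool>()
                    .map_err(|_| format!("invalid boolean for {}: {}", key, value))?;
            }
            other => return Err(format!("unsupported option: {}", other)),
        }
    }
    Ok(parsed)
}
```

In this sketch `skip_signature` defaults to `false`, so existing tables that rely on signed requests would keep their current behavior unless the option is set explicitly.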
Describe alternatives you've considered
I would like to be able to run:
> CREATE EXTERNAL TABLE hits
STORED AS PARQUET LOCATION 's3://clickhouse-public-datasets/hits_compatible/hits.parquet' OPTIONS(aws.skip_signature true, aws.region 'eu-central-1');
And maybe also this (without specifying any signature option at all):
> CREATE EXTERNAL TABLE hits
STORED AS PARQUET LOCATION 's3://clickhouse-public-datasets/hits_compatible/hits.parquet' OPTIONS(aws.region 'eu-central-1');
Additional context
No response