Closed
Description
Environment
Windows 10
Python 3.10.11
Delta-rs version:
deltalake 0.9.0
pyarrow 12.0.0
numpy 1.24.3
Binding:
N/A
Environment:
- Cloud provider: Azure
- OS: Windows
- Other:
Bug
What happened:
When writing a Delta table to Azure Data Lake Storage Gen2, an error occurs regarding the provided storage options. I've verified that the account key is correct. Here is the code snippet that causes the issue:
from deltalake import write_deltalake

fs = {
    'account_name': "mdieuwcoldpathcpdl",
    'account_key': "***"
}

write_deltalake('https://mdieuwcoldpathcpdl.dfs.core.windows.net/delta/bar', data=table, storage_options=fs)
The stack trace is:
---------------------------------------------------------------------------
PyDeltaTableError Traceback (most recent call last)
Cell In[14], line 9
1 from deltalake import write_deltalake
3 fs = {
5 'account_name':"***",
6 'account_key': "***"
7 }
----> 9 write_deltalake('https://***.dfs.core.windows.net/delta/bar', data=table, storage_options=fs)
File c:\devops\md-coldpath-preprocessing\.conda\lib\site-packages\deltalake\writer.py:147, in write_deltalake(table_or_uri, data, schema, partition_by, filesystem, mode, file_options, max_partitions, max_open_files, max_rows_per_file, min_rows_per_group, max_rows_per_group, name, description, configuration, overwrite_schema, storage_options, partition_filters)
144 else:
145 data, schema = delta_arrow_schema_from_pandas(data)
--> 147 table, table_uri = try_get_table_and_table_uri(table_or_uri, storage_options)
149 # We need to write against the latest table version
150 if table:
File c:\devops\md-coldpath-preprocessing\.conda\lib\site-packages\deltalake\writer.py:392, in try_get_table_and_table_uri(table_or_uri, storage_options)
389 raise ValueError("table_or_uri must be a str, Path or DeltaTable")
391 if isinstance(table_or_uri, (str, Path)):
--> 392 table = try_get_deltatable(table_or_uri, storage_options)
393 table_uri = str(table_or_uri)
394 else:
File c:\devops\md-coldpath-preprocessing\.conda\lib\site-packages\deltalake\writer.py:405, in try_get_deltatable(table_uri, storage_options)
401 def try_get_deltatable(
402 table_uri: Union[str, Path], storage_options: Optional[Dict[str, str]]
403 ) -> Optional[DeltaTable]:
404 try:
--> 405 return DeltaTable(table_uri, storage_options=storage_options)
406 except PyDeltaTableError as err:
407 # TODO: There has got to be a better way...
408 if "Not a Delta table" in str(err):
File c:\devops\md-coldpath-preprocessing\.conda\lib\site-packages\deltalake\table.py:122, in DeltaTable.__init__(self, table_uri, version, storage_options, without_files)
109 """
110 Create the Delta Table from a path with an optional version.
111 Multiple StorageBackends are currently supported: AWS S3, Azure Data Lake Storage Gen2, Google Cloud Storage (GCS) and local URI.
(...)
119 DeltaTable will be loaded with a significant memory reduction.
120 """
121 self._storage_options = storage_options
--> 122 self._table = RawDeltaTable(
123 str(table_uri),
124 version=version,
125 storage_options=storage_options,
126 without_files=without_files,
127 )
128 self._metadata = Metadata(self._table)
PyDeltaTableError: Failed to read delta log object: Generic MicrosoftAzure error: Container name must be specified
I also tried adding a container_name key to the fs dictionary, with the same result.
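For anyone hitting the same error, one thing that may be worth trying is putting the container name into the table URI itself rather than into storage_options. The sketch below rewrites the https URL from the snippet above into the abfss://<container>@<account>.dfs.core.windows.net/<path> form. It assumes that the first path segment ("delta") is the container name and that this deltalake version accepts abfss:// URIs; neither is confirmed in this report. https_to_abfss is a hypothetical helper, not part of deltalake.

```python
from urllib.parse import urlparse


def https_to_abfss(url: str) -> str:
    """Rewrite https://<account>.dfs.core.windows.net/<container>/<path>
    as abfss://<container>@<account>.dfs.core.windows.net/<path>.

    Hypothetical helper for illustration; not part of deltalake.
    Assumes the first path segment of the URL is the container name.
    """
    parsed = urlparse(url)
    if parsed.scheme != "https" or not parsed.hostname:
        raise ValueError(f"expected an https URL, got {url!r}")

    # Split "/delta/bar" into container "delta" and remaining path "bar".
    container, _, path = parsed.path.lstrip("/").partition("/")
    if not container:
        raise ValueError("URL path does not contain a container name")

    return f"abfss://{container}@{parsed.hostname}/{path}"


table_uri = https_to_abfss(
    "https://mdieuwcoldpathcpdl.dfs.core.windows.net/delta/bar"
)

# The write call from the report would then become (left commented out
# because it needs real credentials and a `table` object):
# write_deltalake(table_uri, data=table, storage_options=fs)
```

If the container is not actually named "delta", the URI would need to be adjusted by hand instead of derived from the https path.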
What you expected to happen:
I expected a delta table to be created.
How to reproduce it:
More details: