Python: write_deltalake to ADLS Gen2 issue #1456

Closed
@Dammi87

Description

Environment

Windows 10
Python 3.10.11

Delta-rs version:
deltalake 0.9.0
pyarrow 12.0.0
numpy 1.24.3

Binding:
N/A

Environment:

  • Cloud provider: Azure
  • OS: Windows
  • Other:

Bug

What happened:
When trying to write a Delta table to Azure Data Lake Storage Gen2, an issue occurs regarding the provided storage options. I've verified that the account key is correct. Here is the code snippet that causes the issue:

from deltalake import write_deltalake

# `table` is the data to write, defined earlier in the notebook (not shown here)
fs = {
    'account_name': "mdieuwcoldpathcpdl",
    'account_key': "***",
}

write_deltalake('https://mdieuwcoldpathcpdl.dfs.core.windows.net/delta/bar', data=table, storage_options=fs)

And the stack trace is

---------------------------------------------------------------------------
PyDeltaTableError                         Traceback (most recent call last)
Cell In[14], line 9
      1 from deltalake import write_deltalake
      3 fs = {
      5  'account_name':"***",
      6  'account_key': "***"
      7 }
----> 9 write_deltalake('https://***.dfs.core.windows.net/delta/bar', data=table, storage_options=fs)

File c:\devops\md-coldpath-preprocessing\.conda\lib\site-packages\deltalake\writer.py:147, in write_deltalake(table_or_uri, data, schema, partition_by, filesystem, mode, file_options, max_partitions, max_open_files, max_rows_per_file, min_rows_per_group, max_rows_per_group, name, description, configuration, overwrite_schema, storage_options, partition_filters)
    144     else:
    145         data, schema = delta_arrow_schema_from_pandas(data)
--> 147 table, table_uri = try_get_table_and_table_uri(table_or_uri, storage_options)
    149 # We need to write against the latest table version
    150 if table:

File c:\devops\md-coldpath-preprocessing\.conda\lib\site-packages\deltalake\writer.py:392, in try_get_table_and_table_uri(table_or_uri, storage_options)
    389     raise ValueError("table_or_uri must be a str, Path or DeltaTable")
    391 if isinstance(table_or_uri, (str, Path)):
--> 392     table = try_get_deltatable(table_or_uri, storage_options)
    393     table_uri = str(table_or_uri)
    394 else:

File c:\devops\md-coldpath-preprocessing\.conda\lib\site-packages\deltalake\writer.py:405, in try_get_deltatable(table_uri, storage_options)
    401 def try_get_deltatable(
    402     table_uri: Union[str, Path], storage_options: Optional[Dict[str, str]]
    403 ) -> Optional[DeltaTable]:
    404     try:
--> 405         return DeltaTable(table_uri, storage_options=storage_options)
    406     except PyDeltaTableError as err:
    407         # TODO: There has got to be a better way...
    408         if "Not a Delta table" in str(err):

File c:\devops\md-coldpath-preprocessing\.conda\lib\site-packages\deltalake\table.py:122, in DeltaTable.__init__(self, table_uri, version, storage_options, without_files)
    109 """
    110 Create the Delta Table from a path with an optional version.
    111 Multiple StorageBackends are currently supported: AWS S3, Azure Data Lake Storage Gen2, Google Cloud Storage (GCS) and local URI.
   (...)
    119                       DeltaTable will be loaded with a significant memory reduction.
    120 """
    121 self._storage_options = storage_options
--> 122 self._table = RawDeltaTable(
    123     str(table_uri),
    124     version=version,
    125     storage_options=storage_options,
    126     without_files=without_files,
    127 )
    128 self._metadata = Metadata(self._table)

PyDeltaTableError: Failed to read delta log object: Generic MicrosoftAzure error: Container name must be specified

I also tried adding a container_name key to the fs dictionary, with the same result.
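One thing that may be worth checking (an untested guess, not a confirmed fix): the error suggests the container is not being picked up from the https:// URL. The sketch below rewrites the URL into the abfss://<container>@<account>.dfs.core.windows.net/<path> form. It assumes the first path segment ("delta") is the container (file system) name and that this deltalake version accepts that URI form. to_abfss_uri is a hypothetical helper written for illustration and is not part of deltalake.

```python
from urllib.parse import urlsplit


def to_abfss_uri(https_url: str) -> str:
    """Rewrite https://<account>.dfs.core.windows.net/<container>/<path>
    as abfss://<container>@<account>.dfs.core.windows.net/<path>.

    Hypothetical helper; assumes the first path segment is the container.
    """
    parts = urlsplit(https_url)
    # Split "/delta/bar" into the container ("delta") and the remainder ("bar").
    container, _, path = parts.path.lstrip("/").partition("/")
    if not container:
        raise ValueError("URL has no container (file system) segment")
    return f"abfss://{container}@{parts.netloc}/{path}"


table_uri = to_abfss_uri("https://mdieuwcoldpathcpdl.dfs.core.windows.net/delta/bar")
print(table_uri)

# Assumption, not verified against deltalake 0.9.0:
# write_deltalake(table_uri, data=table, storage_options=fs)
```

If the container is spelled out in the URI like this, the "Container name must be specified" error should at least no longer depend on how the https:// form is parsed.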

What you expected to happen:
I expected a delta table to be created.

How to reproduce it:

More details:

Labels: bug (Something isn't working)