write_deltalake fails on Databricks volume #2540

Closed as not planned
@Bernolt

Description

Environment

Delta-rs version: 0.17.4

Binding: python (pyarrow engine)

Environment:

  • Cloud provider: Azure
  • OS:
  • Other: Databricks runtime 13.3 LTS

Bug

What happened:
From a Python application running on a Databricks cluster, I want to write to an append-only Delta table stored on a volume.
The function is called as follows:

write_deltalake(
    data=arrow_table,
    # Volume paths start with /Volumes, as in the error message below
    table_or_uri="/Volumes/catalog/schema/volume_path/table_path",
    mode="append",
    overwrite_schema=False,
)

However, I am getting the below error:

OSError: Generic LocalFileSystem error: Unable to copy file from /Volumes/catalog/schema/volume_path/table_path/_delta_log/_commit_e964ab56-f56c-403a-b06d-fe2b6bcabf9d.json.tmp to /Volumes/catalog/schema/volume_path/table_path/_delta_log/00000000000000000000.json: Function not implemented (os error 38)

What you expected to happen:
Since Databricks supports copy/rename/delete operations on volumes, I would expect the write to work.
As far as I know, Databricks exposes volumes through a local file system API that emulates a filesystem on top of cloud storage.

How to reproduce it:
I made the below notebook to reproduce the error. It needs to be run from a Databricks Runtime.

# Databricks notebook source
# MAGIC %sh
# MAGIC touch /Volumes/catalog/schema/volume/table_path/to_rename.tmp

# COMMAND ----------

# MAGIC %sh
# MAGIC mv /Volumes/catalog/schema/volume/table_path/to_rename.tmp /Volumes/catalog/schema/volume/table_path/renamed.todelete

# COMMAND ----------

# MAGIC %sh 
# MAGIC rm /Volumes/catalog/schema/volume/table_path/renamed.todelete

# COMMAND ----------

from deltalake import write_deltalake
import pyarrow as pa

# COMMAND ----------

arrow_table = pa.table([
    pa.array([2, 4, 5, 100]),
    pa.array(["Flamingo", "Horse", "Brittle stars", "Centipede"])
    ], names=['n_legs', 'animals'])

# COMMAND ----------

write_deltalake(
    table_or_uri="/Volumes/catalog/schema/volume/table_path/reproduce_deltars_error_table_01",
    data=arrow_table,
    mode="append",
    overwrite_schema=False,
)
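
The shell cells in the notebook exercise rename (mv) and delete (rm), but not hard links. One possible explanation, which is an assumption and not confirmed anywhere in this report, is that the local filesystem backend finalizes the commit file with a hard link, and that the volume mount does not implement that call. On Linux, os error 38 is ENOSYS ("Function not implemented"). The sketch below probes whether os.link works in a given directory. The helper name supports_hard_links and the probe file names are hypothetical, chosen only for illustration.

```python
import errno
import os
import tempfile
import uuid


def supports_hard_links(directory: str) -> bool:
    """Return True if os.link succeeds inside `directory`, False if the OS rejects it."""
    token = uuid.uuid4().hex
    src = os.path.join(directory, f"_linkprobe_{token}.tmp")
    dst = os.path.join(directory, f"_linkprobe_{token}.linked")

    # Create an empty source file to link from.
    with open(src, "w"):
        pass

    try:
        os.link(src, dst)
        return True
    except OSError as exc:
        # ENOSYS would match the "Function not implemented" text in the report.
        name = errno.errorcode.get(exc.errno, "unknown")
        print(f"os.link failed: errno={exc.errno} ({name})")
        return False
    finally:
        # Remove whichever probe files were actually created.
        for path in (src, dst):
            if os.path.exists(path):
                os.remove(path)


if __name__ == "__main__":
    # Sanity check on a local temporary directory. On Databricks, pass the
    # volume directory instead, e.g. the table_path directory used above.
    with tempfile.TemporaryDirectory() as tmp:
        print(supports_hard_links(tmp))
```

If the probe returns False on the volume directory while mv and rm succeed, that would be consistent with the write failing at the commit step and not at the data-file step. It would not by itself prove the cause.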

More details:

Labels: bug
