Description
Environment
Delta-rs version: 0.17.4
Binding: python (pyarrow engine)
Environment:
- Cloud provider: Azure
- OS:
- Other: Databricks runtime 13.3 LTS
Bug
What happened:
From a Python application running on a Databricks cluster, I want to write to an append-only Delta table.
The function is called as follows:
write_deltalake(
    data=arrow_table,
    table_or_uri="/Volumes/catalog/schema/volume_path/table_path",
    mode="append",
    overwrite_schema=False)
However, I get the following error:
OSError: Generic LocalFileSystem error: Unable to copy file from /Volumes/catalog/schema/volume_path/table_path/_delta_log/_commit_e964ab56-f56c-403a-b06d-fe2b6bcabf9d.json.tmp to /Volumes/catalog/schema/volume_path/table_path/_delta_log/00000000000000000000.json: Function not implemented (os error 38)
What you expected to happen:
Since Databricks supports copy/rename/delete operations on the Volumes mount, I would expect the write to succeed.
As far as I know, Databricks uses a local file system API that emulates a filesystem on top of cloud storage.
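To double-check that claim from Python, here is a minimal sketch (the volume path is a placeholder) that performs the same copy, rename, and delete operations directly on the Volumes mount; the %sh cells in the notebook below do the equivalent from the shell.
import os
import shutil

# Placeholder path; substitute an accessible volume.
base = "/Volumes/catalog/schema/volume/table_path"

src = os.path.join(base, "to_rename.tmp")
copied = os.path.join(base, "copied.tmp")
renamed = os.path.join(base, "renamed.todelete")

open(src, "w").close()      # touch
shutil.copy(src, copied)    # copy
os.rename(copied, renamed)  # rename
os.remove(renamed)          # delete
os.remove(src)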
How to reproduce it:
I made the notebook below to reproduce the error. It needs to be run on a Databricks runtime.
# Databricks notebook source
# MAGIC %sh
# MAGIC touch /Volumes/catalog/schema/volume/table_path/to_rename.tmp
# COMMAND ----------
# MAGIC %sh
# MAGIC mv /Volumes/catalog/schema/volume/table_path/to_rename.tmp /Volumes/catalog/schema/volume/table_path/renamed.todelete
# COMMAND ----------
# MAGIC %sh
# MAGIC rm /Volumes/catalog/schema/volume/table_path/renamed.todelete
# COMMAND ----------
from deltalake import write_deltalake
import pyarrow as pa
# COMMAND ----------
arrow_table = pa.table([
    pa.array([2, 4, 5, 100]),
    pa.array(["Flamingo", "Horse", "Brittle stars", "Centipede"])
], names=['n_legs', 'animals'])
# COMMAND ----------
write_deltalake(
    table_or_uri="/Volumes/catalog/schema/volume/table_path/reproduce_deltars_error_table_01",
    data=arrow_table,
    mode="append",
    overwrite_schema=False)
More details:
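One hedged guess on my part, not confirmed from the traceback: os error 38 is ENOSYS ("Function not implemented"), and the object_store LocalFileSystem backend used by delta-rs performs its atomic copy-if-not-exists of the commit file via a hard link, which FUSE mounts such as /Volumes may not implement. The sketch below (placeholder path) would check whether hard links work on the mount.
import os

base = "/Volumes/catalog/schema/volume/table_path"  # placeholder path
src = os.path.join(base, "linktest.tmp")
dst = os.path.join(base, "linktest.lnk")

open(src, "w").close()
try:
    # Assumption: this raises OSError with errno 38 if link() is not
    # implemented on the mount.
    os.link(src, dst)
    os.remove(dst)
finally:
    os.remove(src)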