Description
Environment
Delta-rs version: 0.7.0
Binding:
Environment: Local Machine
- OS: macOS
- Other: M1 chip
Bug
What happened:
Calling DeltaOps(table).load().await?; returned the error Error: InvalidTableLocation(...). The script below gives a more detailed example.
Under the hood, the cause seems to be that the delta table path is stripped of its file:// prefix and then passed to url::Url::parse. Without the file:// prefix, parsing returns Err(RelativeUrlWithoutBase).
What you expected to happen:
I expected the delta table to load successfully, since the check assert!(table.object_store().is_delta_table_location().await?); passes and every other command I tried (like getting the metadata) seems to work, AFAIK.
I cloned the repo and removed the substitution (i.e. .replace("file://", "")). With that change I no longer got this error, but a couple of tests break.
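A narrower alternative to removing the substitution everywhere might be to restore the scheme just before the location is parsed as a URL. The sketch below illustrates that idea only. It assumes the failure really is caused by a scheme-less absolute path reaching the URL parser, as described above. ensure_file_scheme is a hypothetical helper, not part of delta-rs, and it handles only Unix-style absolute paths.

```rust
/// Hypothetical helper (not part of delta-rs): make sure a table location
/// carries a scheme before it is handed to a URL parser.
fn ensure_file_scheme(location: &str) -> String {
    if location.contains("://") {
        // Already has a scheme such as file:// or s3://, so leave it untouched.
        location.to_string()
    } else if location.starts_with('/') {
        // Absolute Unix path: "file://" + "/abs/path" gives "file:///abs/path".
        format!("file://{location}")
    } else {
        // Relative path: resolving it needs a base directory, so this sketch
        // returns it unchanged rather than guessing one.
        location.to_string()
    }
}
```

With this helper, a stripped path such as /Users/brs/rust_projs/delta-lake-playground/data/delta-test/ becomes a file:/// location again, while locations that already carry a scheme pass through unchanged. Whether this fits the places where delta-rs strips the prefix would need to be checked against the tests that currently break.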
How to reproduce it:
Adapted script from one of the examples
use arrow::{
    array::{Int32Array, StringArray},
    datatypes::{DataType, Field, Schema as ArrowSchema},
    record_batch::RecordBatch,
};
use deltalake::{
    action::Protocol, arrow, operations::collect_sendable_stream, DeltaTableBuilder,
    DeltaTableMetaData, Schema,
};
use deltalake::{DeltaOps, DeltaTable, SchemaDataType, SchemaField};
use futures::executor;
use std::{collections::HashMap, sync::Arc};

fn get_table_columns() -> Vec<SchemaField> {
    vec![
        SchemaField::new(
            String::from("int"),
            SchemaDataType::primitive(String::from("integer")),
            false,
            Default::default(),
        ),
        SchemaField::new(
            String::from("string"),
            SchemaDataType::primitive(String::from("string")),
            true,
            Default::default(),
        ),
    ]
}

async fn init_delta_table(table_path: &str) -> DeltaTable {
    let metadata = DeltaTableMetaData::new(
        None,
        None,
        None,
        Schema::new(get_table_columns()),
        vec![],
        HashMap::new(),
    );
    let mut table = DeltaTableBuilder::from_uri(table_path).build().unwrap();
    table
        .create(metadata, Protocol::default(), None, None)
        .await
        .unwrap();
    table
}

#[tokio::main(flavor = "current_thread")]
async fn main() -> Result<(), deltalake::DeltaTableError> {
    let table_path = "file:///Users/brs/rust_projs/delta-lake-playground/data/delta-test/";
    let table = deltalake::open_table(table_path)
        .await
        .unwrap_or_else(|_| executor::block_on(init_delta_table(&table_path)));
    // let table = DeltaOps::new_in_memory()
    //     .create()
    //     .with_columns(get_table_columns())
    //     .await?;
    // let batch = get_table_batches();
    // let table = DeltaOps(table).write(vec![batch.clone()]).await?;
    // let (table, _check) = DeltaOps(table).filesystem_check().await?;
    assert!(table.object_store().is_delta_table_location().await?);
    let test = url::Url::parse(&table.object_store().root_uri());
    println!("{test:?}");
    println!("{}", table.object_store().root_uri());
    let test = url::Url::parse(&table_path);
    println!("{test:?}");
    // let test = url::Url::parse(&table_path);
    let (_table, stream) = DeltaOps(table).load().await?;
    let data: Vec<RecordBatch> = collect_sendable_stream(stream).await?;
    println!("{:?}", data);
    Ok(())
}
More details: