Closed
Labels
enhancement: New feature or request
Description
For both the DataFusion-based and PyArrow-based writers to support writer protocol v2, we'll need to enforce column invariants. It seems like a function with the following signature could be reused by both implementations:
    fn enforce_invariant(batch: RecordBatch, invariants: &Vec<(i32, &str)>) -> Result<(), DataFusionError> {
        // Rough implementation:
        for (column_index, sql_invariant) in invariants {
            // ... run a DataFusion query that evaluates `sql_invariant` against `batch`
            // ... on failure, return an error indicating which invariant failed
        }
        Ok(())
    }
Then this function could be applied to each record batch that comes in during a write.
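To make the per-batch control flow concrete, here is a minimal std-only sketch. It does not use DataFusion or Arrow: `Batch`, `Invariant`, `InvariantError`, `enforce_invariants`, and `check_batches` are all hypothetical names, and a plain function pointer stands in for evaluating the SQL invariant. Whether a null value should count as a violation is left to that stand-in predicate.

```rust
use std::fmt;

/// Simplified stand-in for an Arrow `RecordBatch` (assumption):
/// every column is a vector of nullable 64-bit integers.
pub struct Batch {
    pub columns: Vec<Vec<Option<i64>>>,
}

/// Hypothetical invariant: the column it applies to, the SQL text (kept only
/// for error reporting), and a predicate standing in for the SQL evaluation.
pub struct Invariant {
    pub column_index: usize,
    pub sql: &'static str,
    pub check: fn(Option<i64>) -> bool,
}

#[derive(Debug, PartialEq)]
pub enum InvariantError {
    MissingColumn { column_index: usize },
    Violated { column_index: usize, sql: String, row: usize },
}

impl fmt::Display for InvariantError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            InvariantError::MissingColumn { column_index } => {
                write!(f, "column {} not found in batch", column_index)
            }
            InvariantError::Violated { column_index, sql, row } => {
                write!(f, "invariant `{}` on column {} failed at row {}", sql, column_index, row)
            }
        }
    }
}

impl std::error::Error for InvariantError {}

/// Check every invariant against one batch and report the first failure.
pub fn enforce_invariants(batch: &Batch, invariants: &[Invariant]) -> Result<(), InvariantError> {
    for inv in invariants {
        let column = batch
            .columns
            .get(inv.column_index)
            .ok_or(InvariantError::MissingColumn { column_index: inv.column_index })?;
        // Find the first row for which the predicate does not hold.
        if let Some(row) = column.iter().position(|v| !(inv.check)(*v)) {
            return Err(InvariantError::Violated {
                column_index: inv.column_index,
                sql: inv.sql.to_string(),
                row,
            });
        }
    }
    Ok(())
}

/// Apply the check to each incoming batch, stopping at the first error.
pub fn check_batches<'a, I>(batches: I, invariants: &[Invariant]) -> Result<(), InvariantError>
where
    I: IntoIterator<Item = &'a Batch>,
{
    for batch in batches {
        enforce_invariants(batch, invariants)?;
    }
    Ok(())
}
```

In a real writer the predicate would be replaced by a query or expression evaluated against the batch, but the shape stays the same: iterate over the invariants, stop at the first failure, and include the failing invariant in the error.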
We might also need to check whether each column is nullable and make sure we enforce that too, either as part of this function or as part of schema enforcement. We should add a test for that.
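A nullability check could look roughly like the sketch below. It is again std-only and illustrative: `FieldSpec` and `check_nullability` are hypothetical names rather than Arrow or delta-rs types, and columns are modelled as vectors of `Option<i64>`.

```rust
/// Hypothetical, simplified field description (not the Arrow `Field` type).
pub struct FieldSpec {
    pub name: &'static str,
    pub nullable: bool,
}

/// Reject a batch if any column declared non-nullable contains a null.
/// `columns[i]` is assumed to hold the values for `fields[i]`.
pub fn check_nullability(
    fields: &[FieldSpec],
    columns: &[Vec<Option<i64>>],
) -> Result<(), String> {
    if fields.len() != columns.len() {
        return Err(format!(
            "schema has {} fields but batch has {} columns",
            fields.len(),
            columns.len()
        ));
    }
    for (field, column) in fields.iter().zip(columns) {
        if !field.nullable {
            // Report the first null found in a non-nullable column.
            if let Some(row) = column.iter().position(|v| v.is_none()) {
                return Err(format!(
                    "column '{}' is non-nullable but row {} is null",
                    field.name, row
                ));
            }
        }
    }
    Ok(())
}
```

A test for this would write one batch with a null in a nullable column (expected to pass) and one with a null in a non-nullable column (expected to fail with an error naming the column).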
What do you think @roeap?
Related Issue(s)
Related docs
https://github.com/delta-io/delta/blob/master/PROTOCOL.md#column-invariants
https://books.japila.pl/delta-lake-internals/constraints/Invariants/