Write enforce_invariant() function #592

@wjones127

Description

For both the DataFusion-based and pyarrow-based writers to support writer protocol v2, we'll need to enforce column invariants. It seems like the following signature could be reused by both implementations:

fn enforce_invariant(batch: RecordBatch, invariants: &[(i32, &str)]) -> Result<(), DataFusionError> {
    // Rough implementation: each invariant is a (column index, SQL expression) pair.
    for (column_index, sql_invariant) in invariants {
        // ... (run a DataFusion query that evaluates sql_invariant against the batch)
        // ... If it fails, return an error indicating which invariant failed.
    }
    Ok(())
}

Then this function could be applied to each record batch that comes in during a write.
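To make the per-batch idea concrete, here is a rough, std-only sketch. It is not the proposed implementation: `Batch`, `Invariant`, `InvariantViolation`, and `write_batches` are hypothetical stand-ins, and a plain Rust predicate takes the place of actually evaluating the SQL expression with DataFusion. Skipping nulls in the check is also just an assumption of this sketch.

```rust
/// Hypothetical stand-in for an Arrow `RecordBatch`:
/// every column is a vector of nullable integers.
#[derive(Debug, Clone)]
pub struct Batch {
    pub columns: Vec<Vec<Option<i64>>>,
}

/// Hypothetical invariant: the column index, the SQL text (only used for
/// error reporting here), and a predicate standing in for the SQL evaluation.
pub struct Invariant<'a> {
    pub column_index: usize,
    pub sql: &'a str,
    pub check: fn(i64) -> bool,
}

/// Error that records which invariant failed.
#[derive(Debug, PartialEq)]
pub struct InvariantViolation {
    pub column_index: usize,
    pub sql: String,
}

/// Check every invariant against one batch and stop at the first failure.
pub fn enforce_invariant(
    batch: &Batch,
    invariants: &[Invariant<'_>],
) -> Result<(), InvariantViolation> {
    for inv in invariants {
        let violation = || InvariantViolation {
            column_index: inv.column_index,
            sql: inv.sql.to_string(),
        };
        // A missing column is reported as a violation in this sketch.
        let values = batch.columns.get(inv.column_index).ok_or_else(violation)?;
        // Assumption: null values are skipped rather than treated as failures.
        if !values.iter().flatten().all(|v| (inv.check)(*v)) {
            return Err(violation());
        }
    }
    Ok(())
}

/// Apply the check to each incoming batch. On failure, return the index of
/// the offending batch together with the violated invariant.
pub fn write_batches(
    batches: &[Batch],
    invariants: &[Invariant<'_>],
) -> Result<usize, (usize, InvariantViolation)> {
    for (i, batch) in batches.iter().enumerate() {
        enforce_invariant(batch, invariants).map_err(|e| (i, e))?;
        // ... the real writer would persist the batch here
    }
    Ok(batches.len())
}
```

Failing on the first bad batch keeps the write from committing anything that violates an invariant, which is presumably what we want from both writers.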

We might also need to check whether each column is nullable and make sure we enforce that too, either as part of this function or as part of schema enforcement. We should add a test for that.
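A nullability check could look roughly like the sketch below. Again this is std-only and illustrative: `FieldDef` and `enforce_nullability` are hypothetical names, with `FieldDef` standing in for a schema field that carries a nullable flag, and columns modeled as vectors of nullable integers.

```rust
/// Hypothetical stand-in for a schema field with a nullable flag.
pub struct FieldDef {
    pub name: String,
    pub nullable: bool,
}

/// Reject the data if any column declared non-nullable contains a null.
pub fn enforce_nullability(
    fields: &[FieldDef],
    columns: &[Vec<Option<i64>>],
) -> Result<(), String> {
    // Guard against a schema/data mismatch before pairing fields with columns.
    if fields.len() != columns.len() {
        return Err(format!(
            "schema has {} fields but batch has {} columns",
            fields.len(),
            columns.len()
        ));
    }
    for (field, column) in fields.iter().zip(columns) {
        if !field.nullable && column.iter().any(Option::is_none) {
            return Err(format!(
                "column '{}' is not nullable but contains null values",
                field.name
            ));
        }
    }
    Ok(())
}
```

A test for this would write one batch with a null in a non-nullable column and assert that the write is rejected.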

What do you think @roeap?

Related Issue(s)

Related docs

https://github.com/delta-io/delta/blob/master/PROTOCOL.md#column-invariants
https://books.japila.pl/delta-lake-internals/constraints/Invariants/

Metadata

Labels: enhancement (New feature or request)