feat(datafusion): Full DML support for tables with evolved partition specs #1923

## Full DML support for tables with evolved partition specs

### Summary

UPDATE and DELETE operations currently fail on tables that have undergone partition evolution. The DataFusion integration assumes every data file uses the current default partition spec, and that assumption does not hold once a table's partitioning scheme has changed.

### Background

Iceberg supports changing a table's partition scheme without rewriting existing data (partition evolution). Each data file tracks which partition spec was in effect when it was written via `partition_spec_id`. A table might look like:

```
Spec 0: PARTITION BY (date)           → wrote files A, B, C
Spec 1: PARTITION BY (date, region)   → wrote files D, E
Current default: Spec 1
```

When you run an UPDATE that touches both old and new files, the current code tries to serialize all partition data using Spec 1's schema. Files written under Spec 0 have no `region` field, so serialization either fails outright or writes partition data that does not match the file it describes.
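The mismatch can be reproduced with a toy model. This is a minimal sketch, not iceberg-rust code: `Spec`, `FileEntry`, and `check_under` are hypothetical stand-ins that reduce a partition spec to an ordered list of field names.

```rust
/// Hypothetical, simplified partition spec: an id plus ordered field names.
pub struct Spec {
    pub spec_id: i32,
    pub fields: &'static [&'static str],
}

/// Hypothetical, simplified data file entry carrying its own spec id.
pub struct FileEntry {
    pub name: &'static str,
    pub partition_spec_id: i32,
    pub partition_values: &'static [&'static str],
}

pub const SPEC_0: Spec = Spec { spec_id: 0, fields: &["date"] };
pub const SPEC_1: Spec = Spec { spec_id: 1, fields: &["date", "region"] };

/// Checks whether a file's partition tuple has one value per field of `spec`.
pub fn check_under(file: &FileEntry, spec: &Spec) -> Result<(), String> {
    if file.partition_values.len() != spec.fields.len() {
        return Err(format!(
            "file {} (spec {}) has {} partition value(s), but spec {} expects {}",
            file.name,
            file.partition_spec_id,
            file.partition_values.len(),
            spec.spec_id,
            spec.fields.len()
        ));
    }
    Ok(())
}
```

File A from the example fits `SPEC_0` but is rejected under `SPEC_1`, which is the situation the current code runs into when it assumes the default spec.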

### Current Behavior

There's a guard in `physical_plan/update.rs` that returns `FeatureNotSupported` when it encounters files with non-default spec IDs. This prevents corruption but blocks legitimate use cases.
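For readers unfamiliar with the guard, its effect is roughly the following. This is an illustrative sketch only: the `DmlError` enum and the `reject_evolved_files` function are assumptions made for the example and are not copied from `physical_plan/update.rs`.

```rust
/// Hypothetical error type; only the variant name mirrors the issue text.
#[derive(Debug, PartialEq)]
pub enum DmlError {
    FeatureNotSupported(String),
}

/// Rejects the operation if any touched file was written under a
/// partition spec other than the table's current default.
pub fn reject_evolved_files(
    default_spec_id: i32,
    file_spec_ids: &[i32],
) -> Result<(), DmlError> {
    match file_spec_ids.iter().find(|id| **id != default_spec_id) {
        Some(id) => Err(DmlError::FeatureNotSupported(format!(
            "file uses partition spec {id}, but the table default is {default_spec_id}"
        ))),
        None => Ok(()),
    }
}
```

A check like this is safe but all-or-nothing: a single file from an older spec blocks the whole statement.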

### Proposed Changes

1. **Expose spec ID on DataFile** - Make `partition_spec_id()` public so downstream code can access it

2. **Add helper on Table** - Something like `partition_type_for_spec(spec_id)` to look up the correct partition schema for any spec in the table's history

3. **Thread spec ID through the pipeline** - Carry the original DataFile (or at least its spec ID) through scan → transform → commit stages instead of just the partition values

4. **Per-file serialization** - When serializing partition data for delete files and commits, look up the correct spec for each file rather than assuming default

5. **Remove the evolution guard** - Once correctness is guaranteed, remove the `FeatureNotSupported` error
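Steps 2 through 4 could fit together roughly as follows. This is a sketch under stated assumptions: `TableSpecs`, `PartitionType`, and `DataFileRef` are hypothetical stand-ins for the real table metadata and `DataFile` types, and the `name=value` string is a placeholder for the actual partition encoding. Only the name `partition_type_for_spec` comes from the proposal.

```rust
use std::collections::HashMap;

/// Hypothetical partition schema: just the ordered field names.
#[derive(Debug, Clone, PartialEq)]
pub struct PartitionType {
    pub field_names: Vec<String>,
}

/// Hypothetical view of a table's spec history.
pub struct TableSpecs {
    pub default_spec_id: i32,
    pub by_id: HashMap<i32, PartitionType>,
}

impl TableSpecs {
    /// Step 2: look up the partition schema for any spec in the history.
    pub fn partition_type_for_spec(&self, spec_id: i32) -> Result<&PartitionType, String> {
        self.by_id
            .get(&spec_id)
            .ok_or_else(|| format!("unknown partition spec id {spec_id}"))
    }
}

/// Step 3: the spec id travels with the file through the pipeline.
pub struct DataFileRef {
    pub path: String,
    pub partition_spec_id: i32,
    pub partition_values: Vec<String>,
}

/// Step 4: serialize using the file's own spec, never the default.
pub fn serialize_partition(specs: &TableSpecs, file: &DataFileRef) -> Result<String, String> {
    let ty = specs.partition_type_for_spec(file.partition_spec_id)?;
    if ty.field_names.len() != file.partition_values.len() {
        return Err(format!(
            "{}: {} partition value(s) for a spec with {} field(s)",
            file.path,
            file.partition_values.len(),
            ty.field_names.len()
        ));
    }
    let parts: Vec<String> = ty
        .field_names
        .iter()
        .zip(&file.partition_values)
        .map(|(name, value)| format!("{name}={value}"))
        .collect();
    Ok(parts.join("/"))
}
```

Because the lookup is keyed by the file's own `partition_spec_id`, files A, B, C and files D, E from the Background example would each be serialized against the schema they were written with.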

### Key invariants to maintain

- Delete files must use the same spec as their source data file
- New data files from UPDATE should use the current default spec
- Missing/invalid spec IDs should fail with a clear error, not silent corruption
- Tables written by iceberg-rust should remain readable by Spark, Trino, and other engines
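The first three invariants can be expressed as one small decision function. This is a sketch with hypothetical names (`OutputSpecs`, `output_specs_for_update`); the real implementation would take these values from table metadata rather than plain integers.

```rust
/// Hypothetical result: which spec each output file of an UPDATE should use.
#[derive(Debug, PartialEq)]
pub struct OutputSpecs {
    pub delete_file_spec_id: i32,
    pub new_data_file_spec_id: i32,
}

/// Picks output specs for an UPDATE that rewrites rows from one source file.
/// Unknown spec ids are reported as errors instead of being guessed.
pub fn output_specs_for_update(
    known_spec_ids: &[i32],
    default_spec_id: i32,
    source_spec_id: i32,
) -> Result<OutputSpecs, String> {
    if !known_spec_ids.contains(&source_spec_id) {
        return Err(format!(
            "data file references unknown partition spec id {source_spec_id}"
        ));
    }
    if !known_spec_ids.contains(&default_spec_id) {
        return Err(format!(
            "table default partition spec id {default_spec_id} is not in the spec history"
        ));
    }
    Ok(OutputSpecs {
        // Delete files follow the spec of the data file they apply to.
        delete_file_spec_id: source_spec_id,
        // Rewritten rows land in new data files under the current default.
        new_data_file_spec_id: default_spec_id,
    })
}
```

For an UPDATE touching file A (Spec 0) on a table whose default is Spec 1, this yields a delete file under Spec 0 and new data files under Spec 1.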

### Test coverage needed

- UPDATE touching files from multiple specs
- DELETE across evolved partitions
- Round-trip tests: serialize with spec N, deserialize with spec N, verify unchanged
- Cross-engine compatibility (Spark can read what we write)
- Error cases: invalid spec ID references
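A round-trip test could take roughly this shape. The `encode` and `decode` helpers below are hypothetical and use a plain `name=value` text encoding purely for illustration (values containing `/` or `=` are not handled); the real tests would exercise the actual partition serialization path instead.

```rust
/// Hypothetical encoder: one `name=value` pair per spec field, joined by `/`.
pub fn encode(field_names: &[&str], values: &[&str]) -> Result<String, String> {
    if field_names.len() != values.len() {
        return Err(format!(
            "{} value(s) for {} field(s)",
            values.len(),
            field_names.len()
        ));
    }
    let parts: Vec<String> = field_names
        .iter()
        .zip(values)
        .map(|(name, value)| format!("{name}={value}"))
        .collect();
    Ok(parts.join("/"))
}

/// Hypothetical decoder: fails unless the encoded fields match the spec exactly.
pub fn decode(field_names: &[&str], encoded: &str) -> Result<Vec<String>, String> {
    // An unpartitioned spec encodes to the empty string.
    if field_names.is_empty() && encoded.is_empty() {
        return Ok(Vec::new());
    }
    let parts: Vec<&str> = encoded.split('/').collect();
    if parts.len() != field_names.len() {
        return Err(format!(
            "found {} field(s), spec expects {}",
            parts.len(),
            field_names.len()
        ));
    }
    parts
        .iter()
        .zip(field_names)
        .map(|(part, name)| match part.split_once('=') {
            Some((n, v)) if n == *name => Ok(v.to_string()),
            _ => Err(format!("expected field `{name}`, found `{part}`")),
        })
        .collect()
}
```

The same pair also covers the negative case: decoding Spec 0 output with Spec 1's fields should return an error rather than a partially filled tuple.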

### Related

- RowDelta action (prerequisite, provides atomic commit mechanism)
- Compaction (#624) - will also need this for compacting across specs
- Delete support (#735)
