Iceberg add_files procedure with partition_filter scans unneeded folders #7027

### Apache Iceberg version

1.1.0 (latest release)

### Query engine

Spark

### Please describe the bug 🐞

Source structure example: `s3://bucket/data/id=123/name=test/date=321/result.orc`
```
CALL iceberg_catalog.system.add_files(
    table => 'test.test_name',
    source_table => '`orc`.`s3://bucket/data/`',
    partition_filter => map('id', '3'),
    check_duplicate_files => false
);
```
The `partition_filter` option does not take the partition order into account, so nested folders are scanned even under top-level partitions that do not match the filter. Should we filter by partition, in order, before running the nested `Listing leaf files and directories` step?

**Example of current flow:**
 
```
s3://bucket/data/id=1/name=test/date=321/result.orc -> Listing leaf files and directories on each sub folder 
s3://bucket/data/id=2/name=test/date=321/result.orc -> Listing leaf files and directories on each sub folder
s3://bucket/data/id=3/name=test/date=321/result.orc -> Match needed partition_filter
s3://bucket/data/id=4/name=test/date=321/result.orc -> Listing leaf files and directories on each sub folder
```
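
The ordered pruning suggested above can be illustrated with a small sketch. This is not Iceberg's or Spark's actual listing code: the in-memory `TREE` dictionary standing in for S3 and the helper `list_pruned` are hypothetical names used only for illustration. The idea is to compare each `key=value` directory against the filter before descending into it, so non-matching subtrees are never listed.

```python
# Hypothetical in-memory stand-in for the S3 layout from the issue.
# A dict is a directory; None marks a file.
TREE = {
    f"id={i}": {"name=test": {"date=321": {"result.orc": None}}}
    for i in (1, 2, 3, 4)
}


def list_pruned(tree, partition_filter, prefix="", visited=None):
    """Return (leaf files, visited directories), pruning at each level."""
    if visited is None:
        visited = []
    files = []
    for name, child in tree.items():
        path = f"{prefix}/{name}" if prefix else name
        if child is None:  # leaf file
            files.append(path)
            continue
        key, sep, value = name.partition("=")
        # Skip the whole subtree as soon as a partition value mismatches.
        if sep and key in partition_filter and partition_filter[key] != value:
            continue
        visited.append(path)
        sub_files, _ = list_pruned(child, partition_filter, path, visited)
        files.extend(sub_files)
    return files, visited


files, visited = list_pruned(TREE, {"id": "3"})
print(files)
print(visited)
```

In this sketch only the `id=3` subtree is ever descended into; the `id=1`, `id=2` and `id=4` folders are rejected at the first level without listing their children.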


Also, if the table is partitioned by `id, name, date` and I specify:
```
CALL iceberg_catalog.system.add_files(
    table => 'test.test_name',
    source_table => '`orc`.`s3://bucket/data/id=1/name=test/`',
    check_duplicate_files => false
);
```
Iceberg ignores these partitions and sets them to `null` in the table instead of deriving the values from the path. In Spark this is handled by `basePath` before the partitions are read, but here `InMemoryFileIndex` is used without the possibility to do that?
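
To make the expected behavior concrete, here is a minimal sketch of deriving partition values from the `key=value` segments of a file path relative to a base path. The helper `partition_values_from_path` is hypothetical and only mirrors the general idea of path-based partition discovery; it is not the real Spark or Iceberg implementation.

```python
def partition_values_from_path(file_path, base_path):
    """Collect key=value segments found between base_path and the file name."""
    if not file_path.startswith(base_path):
        raise ValueError(f"{file_path} is not under {base_path}")
    relative = file_path[len(base_path):].strip("/")
    values = {}
    for segment in relative.split("/")[:-1]:  # last segment is the file name
        key, sep, value = segment.partition("=")
        if sep:
            values[key] = value
    return values


path = "s3://bucket/data/id=1/name=test/date=321/result.orc"

# Base path at the data root: all three partition values are recovered.
from_root = partition_values_from_path(path, "s3://bucket/data/")
print(from_root)

# Base path equal to the listed sub folder: id and name are not seen,
# which is analogous to the null columns described above.
from_subfolder = partition_values_from_path(path, "s3://bucket/data/id=1/name=test/")
print(from_subfolder)
```

With the base path set to the data root, `id` and `name` are still available even though only the sub folder is being imported, which is the behavior asked for here.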

cc @RussellSpitzer @szehon-ho 
