Inconsistent column name case handling when round-tripping names from Arrow metadata #15922

### Describe the bug

Given a table in a `SessionContext` and the `RecordBatch` that backs it (e.g. through `ctx.register_batch()`), I want to refer to the table's columns using the field names found in the `RecordBatch`'s schema.

In some cases, `col(col_name)` fails. I must instead use `col(format!("\"{col_name}\""))`, which is hard to discover and easy to forget even when one is aware of the issue. To make matters worse, only certain column names trigger the failure: "A" does, but "Column A" does not. (I'm now assuming that the space in the latter triggers some auto-escaping mechanism.)
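To make the workaround harder to miss, the quoting can be wrapped in a small helper. The sketch below is only an illustration: `quote_ident` is a hypothetical name, not a DataFusion function, and doubling embedded double quotes follows the usual SQL convention for delimited identifiers. Whether the parser behind `col` honours that doubling is an assumption I have not verified.

```rust
/// Hypothetical helper (not part of DataFusion): wrap a raw Arrow field name
/// in double quotes so that it is treated as a case-sensitive identifier.
/// Embedded double quotes are doubled, following the common SQL convention
/// (assumption: the identifier parser behind `col` accepts this form).
fn quote_ident(name: &str) -> String {
    format!("\"{}\"", name.replace('"', "\"\""))
}
```

With this in place, the failing call in the reproduction below would read `.select(vec![col(quote_ident(col_name))])?`, which is the same workaround as the `format!` call above.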

### To Reproduce

```rust
use arrow::array::{Int64Array, RecordBatch};
use arrow::datatypes::{DataType, Field, Schema};
use datafusion::common::DataFusionError;
use datafusion::logical_expr::col;
use datafusion::prelude::SessionContext;
use std::sync::Arc;

async fn test_single_column(col_name: &str) -> Result<(), DataFusionError> {
    // create a simple batch
    let column = Int64Array::from(vec![1, 2, 3]);
    let schema = Schema::new(vec![Field::new(col_name, DataType::Int64, false)]);

    println!("Column name: {col_name}");
    println!("Initial arrow schema name: {}", schema.fields()[0].name());

    let batch = RecordBatch::try_new(Arc::new(schema), vec![Arc::new(column)])
        .expect("could not create record batch");

    // create a DataFusion context
    let ctx = SessionContext::new();
    ctx.register_batch("test", batch)?;

    println!(
        "Session context schema name: {}",
        ctx.table("test").await?.schema().fields()[0].name()
    );

    let result = ctx
        .table("test")
        .await?
        .select(vec![col(col_name)])?
        // use this instead to avoid the issue
        //.select(vec![col(format!("\"{col_name}\""))])?
        .collect()
        .await?
        .into_iter()
        .last()
        .ok_or(DataFusionError::External("no batch returned".into()))?;

    println!(
        "Result batch schema name: {}",
        result.schema().fields()[0].name()
    );

    Ok(())
}

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let names = &["A", "a", "Column A"];

    for name in names {
        if let Err(e) = test_single_column(name).await {
            eprintln!("Error processing column name '{}': {}", name, e);
        }

        println!("--------------------------------");
    }

    Ok(())
}
```

### Result:

```
Column name: A
Initial arrow schema name: A
Session context schema name: A
Error processing column name 'A': Schema error: No field named a. Valid fields are test."A".
--------------------------------
Column name: a
Initial arrow schema name: a
Session context schema name: a
Result batch schema name: a
--------------------------------
Column name: Column A
Initial arrow schema name: Column A
Session context schema name: Column A
Result batch schema name: Column A
```

Noteworthy:
- The error message is particularly confusing, since I _did_ use `"A"`. (Edit: once you know the issue, you may note that the quotes in `test."A"` make the message technically correct.)
- The seemingly inconsistent behaviour between "A" and "Column A": the latter actually works.
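One way to read the three outcomes is as SQL-style identifier normalization: an unquoted, plain identifier is folded to lowercase, a double-quoted one keeps its case, and a string that does not look like a plain identifier is taken as-is. The toy model below only mirrors the results printed above. It is not DataFusion's code, `resolve_toy` is a hypothetical name, and the verbatim fallback for names such as "Column A" is my assumption about why that case works.

```rust
/// Toy model (NOT DataFusion's implementation) of how a name passed to `col`
/// appears to be resolved, based only on the output shown above.
fn resolve_toy(name: &str) -> String {
    // Double-quoted identifier: strip the quotes and keep the case.
    if name.len() >= 2 && name.starts_with('"') && name.ends_with('"') {
        return name[1..name.len() - 1].replace("\"\"", "\"");
    }
    // Plain identifier (ASCII letters, digits, underscores): fold to lowercase.
    let is_plain = !name.is_empty()
        && name.chars().all(|c| c.is_ascii_alphanumeric() || c == '_');
    if is_plain {
        return name.to_lowercase();
    }
    // Anything else, e.g. a name containing a space: assumed to be used verbatim.
    name.to_string()
}
```

Under this model, `resolve_toy("A")` yields `a`, which matches the "No field named a" error, while `resolve_toy("Column A")` and `resolve_toy("\"A\"")` return the names that exist in the schema.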


### Expected behavior

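All three test cases pass: `col(col_name)` resolves a column when `col_name` is taken verbatim from the `RecordBatch` schema.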

### Additional context

In this test case, as in the actual codebase I'm working on, I am not using any SQL. This makes the name-casing issue particularly unexpected.

Probably related:
- https://github.com/apache/datafusion/issues/14832
- https://github.com/apache/datafusion/issues/14373
- https://github.com/apache/datafusion/issues/13649
