Skip to content

[DISCUSSION] Better borrow propagation (e.g. RecordBatch::schema() to return &SchemaRef vs SchemaRef) #5463

@alamb

Description

@alamb

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

For reasons I am not sure of (likely historical), certain APIs of Schema/RecordBatch return owned instances of SchemaRef.

So for example

let schema: SchemaRef = record_batch.schema()

There are at least two challenges here:

  1. This often requires an extra copy when not needed (though it is an Arc::clone which is relatively inexpensive compared to the work done by most Arrow apis)
  2. It makes it hard for borrow propagation to work correctly.

An example of challenges with borrow propagation is illustrated on #5342, which observes, we can't change Schema::try_merge to take (&Schema) even when it doesn't actually need owned ownership of the Schema without creating a temporary Vec<SchemaRef> or something similiar

-    pub fn try_merge(schemas: impl IntoIterator<Item = Self>) -> Result<Self, ArrowError> {
+    pub fn try_merge<'a>(
+        schemas: impl IntoIterator<Item = &'a Schema>,
+    ) -> Result<Schema, ArrowError> {

Describe the solution you'd like
I recommend we add new functions that return a reference to the SchemaRef

So new functions like

    /// Returns a reference to the [`Schema`] of the record batch.
    pub fn schema_ref(&self) -> &SchemaRef {
        &self.schema
    }

Describe alternatives you've considered
One alternative is to change the existing APIs as proposed in #5448

    /// Returns a reference to the [`Schema`] of the record batch.
    pub fn schema(&self) -> &SchemaRef {
        &self.schema
    }

However I think this requires a substantial amount of downstream code change (as explained on #5448 (review))

Additional context
This came up recently on #5342

@tustvold notes we have made similar changes in the past, e.g. #2035 and #313

Metadata

Metadata

Assignees

No one assigned

    Labels

    arrowChanges to the arrow cratearrow-flightChanges to the arrow-flight crateenhancementAny new improvement worthy of a entry in the changelogparquetChanges to the parquet crate

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions