Skip to content

Perf: Support automatically concat_batches for sort which will improve performance #15375

Open
@zhuqi-lucas

Description

@zhuqi-lucas

Is your feature request related to a problem or challenge?

We should investigate and improve the sort code to support concat_batches for more cases besides the following case:

       // If less than sort_in_place_threshold_bytes, concatenate and sort in place
        if self.reservation.size() < self.sort_in_place_threshold_bytes {
            // Concatenate memory batches together and sort
            let batch = concat_batches(&self.schema, &self.in_mem_batches)?;
            self.in_mem_batches.clear();
            self.reservation
                .try_resize(get_reserved_byte_for_record_batch(&batch))?;
            let reservation = self.reservation.take();
            return self.sort_batch_stream(batch, metrics, reservation);
        }

See details about the performance improvement:

#15348 (comment)

Describe the solution you'd like

No response

Describe alternatives you've considered

No response

Additional context

No response

Metadata

Metadata

Assignees

Labels

enhancementNew feature or request

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions