Skip to content

Conversation

@silver-ymz
Copy link
Contributor

@silver-ymz silver-ymz commented Nov 27, 2025

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

close #23891

CorrelatedTopNToVectorSearchRule only specifies following logical plan pattern.

LogicalApply { type: LeftOuter, on: true, correlated_id, max_one_row: true }
 └─ LogicalProject { exprs: [[Coalesce(array_agg($expr1 order_by($expr2 ASC)), ARRAY[]) as $expr3] }
    └─ LogicalAgg { aggs: [array_agg($expr1 order_by($expr2 ASC))] }
       └─ LogicalTopN { order: [$expr2 ASC] }
          └─ LogicalProject { exprs: [Row(...) as $expr1, L2Distance(CorrelatedInputRef { index: 0, correlated_id: 1 }, items.embedding) as $expr2] }

This logical plan should have schema same as array(row(...)), so the inner struct type should be an unnamed struct.

Checklist

  • I have written necessary rustdoc comments.
  • I have added necessary unit tests and integration tests.
  • I have added test labels as necessary.
  • I have added fuzzing tests or opened an issue to track them.
  • My PR contains breaking changes.
  • My PR changes performance-critical code, so I will run (micro) benchmarks and present the results.
  • I have checked the Release Timeline and Currently Supported Versions to determine which release branches I need to cherry-pick this PR into.

Documentation

  • My PR needs documentation updates.
Release note

Signed-off-by: Mingzhuo Yin <yinmingzhuo@gmail.com>
@github-actions github-actions bot added the type/fix Type: Bug fix. Only for pull requests. label Nov 27, 2025
Copilot finished reviewing on behalf of silver-ymz November 27, 2025 12:03
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes an issue where LogicalVectorSearchLookupJoin was creating a struct type with named fields when it should use unnamed fields. The fix involves changing StructType::unnamed() to accept an iterator directly, eliminating unnecessary allocations and improving code ergonomics.

  • Refactored StructType::unnamed() to accept impl IntoIterator<Item = DataType> instead of Vec<DataType>
  • Fixed VectorSearchLookupJoinCore::struct_type() to use unnamed struct fields
  • Removed unnecessary .collect() calls and unused imports across the codebase

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.

Show a summary per file
File Description
src/common/src/types/struct_type.rs Modified unnamed() method signature to accept any iterator of DataType
src/frontend/src/optimizer/plan_node/logical_vector_search_lookup_join.rs Fixed to use StructType::unnamed() instead of StructType::new(), ensuring correct unnamed struct schema
src/frontend/src/optimizer/rule/unify_first_last_value_rule.rs Removed unnecessary .collect() call when building field types
src/frontend/src/binder/expr/value.rs Removed unused itertools::Itertools import and .collect_vec() call
src/common/src/array/struct_array.rs Removed .collect() calls in from_protobuf() and From<DataChunk> implementation
src/common/src/array/data_chunk.rs Simplified struct type creation by removing intermediate .collect()

Signed-off-by: Mingzhuo Yin <yinmingzhuo@gmail.com>
Copy link
Contributor

@wenym1 wenym1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rest LGTM. Thanks for the quick fix!


fn struct_type(&self) -> StructType {
StructType::new(
StructType::unnamed(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we have a unified method in StructType like fn row_expr_type(types: impl Iterator<Item = DataType>) and call in both here and in bind_row so that we can explicitly declare that here we construct the struct type in the same way as row?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated in new commit

Signed-off-by: Mingzhuo Yin <yinmingzhuo@gmail.com>
@silver-ymz silver-ymz force-pushed the fix/CorrelatedTopNToVectorSearchRule-output-schema branch from eca2400 to 97dded6 Compare November 28, 2025 05:20
@wenym1 wenym1 added this pull request to the merge queue Nov 28, 2025
Merged via the queue into main with commit 9f6e1a8 Nov 28, 2025
34 checks passed
@wenym1 wenym1 deleted the fix/CorrelatedTopNToVectorSearchRule-output-schema branch November 28, 2025 08:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

type/fix Type: Bug fix. Only for pull requests.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

CorrelatedTopNToVectorSearchRule have inconsistent output schema

3 participants