Avoid JSON marshal|unmarshal index params and table config for each search to improve fulltext|vector QPS #22381
Think carefully. A hardcoded configuration like this is a method with no way back...
I assume this PR exists because the JSON parser is too slow. The official Go JSON parser is indeed very slow, but there are many JSON parsers that are much, much faster.
There is simdjson, and sonic from ByteDance. Actually, I think we already use segmentio.
Do we know whether, say, segmentio or sonic would be good enough?
Tried sonic: performance was twice as fast as Go's json, but still slow. Tried simdjson, but it was quite troublesome (maybe I didn't use it correctly), and performance wasn't great.
User description
What type of PR is this?
Which issue(s) this PR fixes:
issue #22337
What this PR does / why we need it:
Avoid JSON marshal|unmarshal index params and table config for each search to improve QPS
PR Type
Enhancement, Tests
Description
• Performance optimization: Replaced JSON marshal/unmarshal operations with efficient binary-encoded index parameters and table configurations to improve fulltext and vector search QPS
• New binary parameter system: Introduced comprehensive IndexParams structures for IVFFLAT, HNSW, and fulltext algorithms with type-safe enums and validation
• Unified configuration interface: Created IndexTableCfgV1 with fixed-size binary layout for efficient storage and retrieval of index metadata
• Code refactoring: Eliminated JSON-based parameter handling across DDL operations, search functions, and table creation workflows
• Enhanced type safety: Added structured parameter conversion utilities and validation functions to replace string-based parameter manipulation
• Comprehensive testing: Added extensive test coverage for new binary parameter system and updated existing tests to use new interfaces
• Code formatting improvements: Applied consistent formatting and structure improvements across multiple files
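To illustrate why a fixed-size binary layout is cheaper than JSON on the search path, here is a minimal Go sketch. The demoParams type, its fields, and its numbering are hypothetical and are not the PR's IndexParams; the point is only that each field sits at a known offset, so decoding is a few integer reads with no text parsing.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// demoParams is a hypothetical parameter record, not the PR's IndexParams.
type demoParams struct {
	Algo   uint16 // assumed numbering, e.g. 1 = IVFFLAT
	OpType uint16 // assumed numbering, e.g. 1 = L2 distance
	Lists  uint32 // number of IVF lists
}

// demoParamsSize is the fixed encoded size: 2 + 2 + 4 bytes.
const demoParamsSize = 8

// encode writes every field at a fixed offset in little-endian order.
func (p demoParams) encode() []byte {
	buf := make([]byte, demoParamsSize)
	binary.LittleEndian.PutUint16(buf[0:2], p.Algo)
	binary.LittleEndian.PutUint16(buf[2:4], p.OpType)
	binary.LittleEndian.PutUint32(buf[4:8], p.Lists)
	return buf
}

// decodeDemoParams reads the fields back without any text parsing.
func decodeDemoParams(buf []byte) (demoParams, error) {
	if len(buf) != demoParamsSize {
		return demoParams{}, errors.New("demoParams: unexpected buffer size")
	}
	return demoParams{
		Algo:   binary.LittleEndian.Uint16(buf[0:2]),
		OpType: binary.LittleEndian.Uint16(buf[2:4]),
		Lists:  binary.LittleEndian.Uint32(buf[4:8]),
	}, nil
}

func main() {
	p := demoParams{Algo: 1, OpType: 1, Lists: 256}
	q, err := decodeDemoParams(p.encode())
	fmt.Println(q == p, err) // the round trip preserves all fields
}
```

The size check in decodeDemoParams stands in for the validation a versioned format would need; a real layout would likely also carry a version tag so that the encoding can evolve.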
File Walkthrough
28 files
index_params.go
Add binary-encoded index parameters system for performance optimization
pkg/catalog/index_params.go
• Added comprehensive binary-encoded index parameter types and structures for fulltext, IVFFLAT, and HNSW algorithms
• Implemented efficient binary serialization/deserialization methods to avoid JSON marshal/unmarshal overhead
• Added conversion utilities between AST trees, JSON strings, and binary IndexParams format
• Introduced type-safe enums and validation for parser types, algorithm types, and quantization types
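The "type-safe enums and validation" idea can be sketched roughly as follows. The parserKind type, its constants, and the accepted names are assumptions made for illustration and do not reproduce the enums in pkg/catalog/index_params.go. The sketch shows the general pattern: parse user text into a small integer type once, then validate by range check instead of comparing strings on every search.

```go
package main

import (
	"fmt"
	"strings"
)

// parserKind is a hypothetical enum; the names and values are assumptions.
type parserKind uint8

const (
	parserInvalid parserKind = iota // zero value is deliberately invalid
	parserDefault
	parserNgram
	parserJSON
)

// IsValid reports whether k is one of the declared parser kinds.
func (k parserKind) IsValid() bool {
	return k > parserInvalid && k <= parserJSON
}

// parseParserKind converts user-supplied text to the enum in one place.
func parseParserKind(s string) (parserKind, error) {
	switch strings.ToLower(strings.TrimSpace(s)) {
	case "default":
		return parserDefault, nil
	case "ngram":
		return parserNgram, nil
	case "json":
		return parserJSON, nil
	}
	return parserInvalid, fmt.Errorf("unknown parser %q", s)
}

func main() {
	k, err := parseParserKind("NGRAM")
	fmt.Println(k.IsValid(), err)

	_, err = parseParserKind("bogus")
	fmt.Println(err)
}
```

Reserving the zero value for "invalid" means an uninitialized or truncated record fails validation rather than silently selecting a parser.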
build_ddl.go
Integrate binary index parameters in DDL operations
pkg/sql/plan/build_ddl.go
• Updated index creation logic to use new AstTreeToIndexParams function instead of IndexParamsToJsonString
• Modified fulltext and secondary index builders to store binary IndexParams instead of JSON strings
• Replaced default IVFFLAT parameter generation with DefaultIVFFLATV1Params() function
• Applied code formatting improvements with better line breaks and error handling
index_table_cfg.go
Add binary-encoded index table configuration system
pkg/vectorindex/index_table_cfg.go
• Created new binary-encoded configuration system for index table metadata
• Implemented IndexTableCfgV1 with fixed-size binary layout for efficient storage and retrieval
• Added specialized ExtraIVFCfgV1 for IVFFLAT-specific configuration parameters
• Provided conversion utilities between JSON strings and binary configuration format
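One way to picture a fixed-size configuration with accessor methods is a byte slice with reserved slots, read in place. The sketch below is an assumption-laden illustration: the demoCfg type, the 32-byte name slots, and the field offsets are invented here and are not the actual IndexTableCfgV1 layout. Only the accessor names DBName() and IndexTable() echo methods mentioned in this walkthrough.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// Hypothetical layout: two zero-padded name slots followed by one int64.
const (
	nameSlot   = 32 // bytes reserved per name (assumed)
	offDBName  = 0
	offIdxTbl  = offDBName + nameSlot
	offThreads = offIdxTbl + nameSlot
	cfgSize    = offThreads + 8
)

// demoCfg is a view over the encoded bytes; accessors read fields in place.
type demoCfg []byte

// newDemoCfg fills the fixed-size buffer and rejects names that do not fit.
func newDemoCfg(db, tbl string, threads int64) (demoCfg, error) {
	if len(db) > nameSlot || len(tbl) > nameSlot {
		return nil, fmt.Errorf("name longer than %d bytes", nameSlot)
	}
	c := make(demoCfg, cfgSize)
	copy(c[offDBName:], db)
	copy(c[offIdxTbl:], tbl)
	binary.LittleEndian.PutUint64(c[offThreads:], uint64(threads))
	return c, nil
}

// slot returns the name stored at off, with the zero padding trimmed.
func (c demoCfg) slot(off int) string {
	return string(bytes.TrimRight(c[off:off+nameSlot], "\x00"))
}

func (c demoCfg) DBName() string     { return c.slot(offDBName) }
func (c demoCfg) IndexTable() string { return c.slot(offIdxTbl) }

// ThreadsSearch reads the int64 stored after the two name slots.
func (c demoCfg) ThreadsSearch() int64 {
	return int64(binary.LittleEndian.Uint64(c[offThreads:]))
}

func main() {
	c, err := newDemoCfg("db1", "idx_tbl", 4)
	if err != nil {
		panic(err)
	}
	fmt.Println(c.DBName(), c.IndexTable(), c.ThreadsSearch(), len(c))
}
```

Because callers go through methods rather than struct fields, the underlying layout can change between versions without touching call sites, which is consistent with the move from field access to method calls described in the entries that follow.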
ddl.go
Update DDL compilation to use binary index parameters
pkg/sql/compile/ddl.go
• Updated ALTER TABLE index operations to use new binary IndexParams system
• Replaced JSON-based parameter manipulation with type-safe binary operations
• Enhanced error handling and logging with more descriptive messages throughout table creation
• Improved code formatting and structure for better readability
hnsw.go
Integrate HNSW operations with binary index parameters
pkg/sql/plan/hnsw.go
• Updated HNSW build and search functions to use new TryConvertToIndexParams utility
• Replaced direct parameter usage with binary IndexParams conversion for consistency
• Maintained existing functionality while integrating with new parameter system
util.go
Eliminate JSON operations for index parameters in SQL compilation
pkg/sql/compile/util.go
• Removed JSON marshal/unmarshal operations for index parameters
• Added proper parameter validation using catalog.MustIndexParams
• Replaced direct JSON operations with structured parameter handling
• Updated SQL generation to use parameter string methods
ddl_index_algo.go
Refactor DDL index algorithm to avoid JSON operations
pkg/sql/compile/ddl_index_algo.go
• Replaced JSON parameter parsing with catalog.MustIndexParams calls
• Removed manual string-to-int conversions for index parameters
• Updated IVF index handling to use structured parameter objects
• Replaced JSON marshal operations with BuildIVFIndexTableCfgV1 function
ivf_search.go
Refactor IVF search to eliminate JSON parameter operations
pkg/sql/colexec/table_function/ivf_search.go
• Replaced JSON parameter parsing with InitIVFCfgFromParam function
• Updated to use IndexTableCfgV1 instead of JSON unmarshaling
• Changed field access to method calls for table configuration
• Removed manual parameter validation and conversion logic
hnsw_search.go
Refactor HNSW search to eliminate JSON parameter operations
pkg/sql/colexec/table_function/hnsw_search.go
• Added InitHNSWCfgFromParam function for parameter initialization
• Replaced JSON parameter parsing with structured parameter handling
• Updated to use IndexTableCfgV1 interface instead of JSON operations
• Removed manual string-to-int conversions for HNSW parameters
ivf_create.go
Refactor IVF create to eliminate JSON parameter operations
pkg/sql/colexec/table_function/ivf_create.go
• Replaced JSON parameter parsing with InitIVFCfgFromParam function
• Updated to use IndexTableCfgV1 interface with method calls
• Removed manual parameter validation and string conversion logic
• Simplified clustering function to use structured configuration
fulltext_tokenize.go
Refactor fulltext tokenize to eliminate JSON parameter operations
pkg/sql/colexec/table_function/fulltext_tokenize.go
• Added InitFulltextCfgFromParam function for parameter initialization
• Replaced JSON parameter parsing with structured parameter handling
• Improved error handling and code organization
• Simplified parameter validation logic
hnsw_create.go
Refactor HNSW create to eliminate JSON parameter operations
pkg/sql/colexec/table_function/hnsw_create.go
• Replaced JSON parameter parsing with InitHNSWCfgFromParam function
• Updated to use IndexTableCfgV1 interface instead of JSON operations
• Removed manual parameter validation and string conversion logic
• Simplified configuration initialization process
fulltext.go
Refactor fulltext plan to use structured parameters
pkg/sql/plan/fulltext.go
• Updated fulltext parameter handling to use catalog.TryConvertToIndexParams
• Replaced JSON parameter parsing with structured parameter objects
• Modified SQL generation to use parameter type methods
• Improved parameter validation and error handling
build.go
Update HNSW build to use IndexTableCfg interface
pkg/vectorindex/hnsw/build.go
• Updated function signatures to use IndexTableCfg interface
• Changed field access from direct fields to method calls
• Modified SQL generation to use configuration methods
• Updated constructor to accept new interface type
fulltext.go
Replace JSON parsing with IndexParams for fulltext configuration
pkg/sql/colexec/table_function/fulltext.go
• Replaced JSON marshal/unmarshal with direct catalog.IndexParams parsing for fulltext index parameters
• Added InitFulltextCfgFromParam function to parse fulltext configuration from binary parameters
• Refactored error handling to use moerr.NewInvalidInputNoCtxf instead of context-based errors
• Improved function parameter formatting and variable extraction
build_dml_util.go
Replace string-based index params with structured IndexParams parsing
pkg/sql/plan/build_dml_util.go
• Replaced catalog.IndexParamsStringToMap with catalog.TryToIndexParams for index parameter parsing
• Added validation for IVFFLAT algorithm parameters using IVFFLATAlgo().IsValid()
• Improved function parameter formatting and line breaks for better readability
• Enhanced error handling for invalid algorithm parameters
apply_indices_hnsw.go
Replace JSON config with structured HNSW index configuration
pkg/sql/plan/apply_indices_hnsw.go
• Replaced JSON-based table configuration with vectorindex.BuildIndexTableCfgV1 function
• Updated HNSW parameter parsing to use catalog.TryToIndexParams instead of string maps
• Removed dependency on encoding/json package
• Simplified tree expression creation by removing type parameters
search.go
Update IVFFLAT search to use new table configuration interface
pkg/vectorindex/ivfflat/search.go
• Updated interface from IndexTableConfig to IndexTableCfg for table configuration
• Modified SQL queries to use new configuration accessor methods like DBName() and IndexTable()
• Updated function signatures to use the new configuration interface
• Enhanced parameter handling for IVFFLAT search operations
search.go
Update HNSW search to use new table configuration interface
pkg/vectorindex/hnsw/search.go
• Updated interface from IndexTableConfig to IndexTableCfg for consistency
• Modified SQL queries to use new configuration accessor methods
• Updated function signatures and constructor calls to use new interface
• Enhanced HNSW search configuration handling
apply_indices_ivfflat.go
Replace JSON config with structured IVFFLAT index configuration
pkg/sql/plan/apply_indices_ivfflat.go
• Replaced JSON-based configuration with vectorindex.BuildIVFIndexTableCfgV1 function
• Updated parameter parsing to use catalog.TryToIndexParams instead of string maps
• Removed dependency on encoding/json package
• Enhanced IVFFLAT algorithm validation and parameter extraction
ivfflat.go
Add structured parameter conversion for IVFFLAT operations
pkg/sql/plan/ivfflat.go
• Added parameter conversion using catalog.TryConvertToIndexParams for IVFFLAT algorithms
• Enhanced parameter validation and error handling for index creation and search
• Added proper algorithm name specification for parameter conversion
• Improved function parameter formatting
types.go
Replace metric type maps with structured conversion function
pkg/vectorindex/metric/types.go
• Replaced map-based metric type conversion with GetIVFMetricType function
• Added proper error handling for invalid metric types
• Removed unused map variables and simplified metric type resolution
• Enhanced type safety with explicit validation
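The move from a map lookup to a function with error handling can be sketched as below. The metricType constants, the op-type strings, and getMetricType are illustrative assumptions, not the signature of GetIVFMetricType. The sketch shows the pitfall being avoided: indexing a Go map with a missing key returns the zero value, which here happens to be a real metric.

```go
package main

import "fmt"

// metricType is a hypothetical enum; the values are assumptions.
type metricType uint16

const (
	metricL2 metricType = iota // note: the zero value is a real metric
	metricCosine
	metricInnerProduct
)

// opTypeToMetric is the map-based style: a missing key silently yields metricL2.
var opTypeToMetric = map[string]metricType{
	"vector_l2_ops":     metricL2,
	"vector_cosine_ops": metricCosine,
	"vector_ip_ops":     metricInnerProduct,
}

// getMetricType is the function-based style: unknown input is an explicit error.
func getMetricType(opType string) (metricType, error) {
	switch opType {
	case "vector_l2_ops":
		return metricL2, nil
	case "vector_cosine_ops":
		return metricCosine, nil
	case "vector_ip_ops":
		return metricInnerProduct, nil
	}
	return 0, fmt.Errorf("invalid metric op type %q", opType)
}

func main() {
	// A typo in the key goes unnoticed with the plain map lookup.
	fmt.Println(opTypeToMetric["vector_l3_ops"] == metricL2)

	// The function surfaces the same typo as an error.
	_, err := getMetricType("vector_l3_ops")
	fmt.Println(err)
}
```

The two-value map form (`v, ok := m[key]`) would also catch the typo, but a function makes the check impossible to forget at call sites.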
product_l2.go
Replace metric type map lookup with structured function call
pkg/sql/colexec/productl2/product_l2.go
• Replaced map-based metric type lookup with metric.GetIVFMetricType function
• Enhanced error handling with proper validation of metric types
• Improved variable declaration patterns and error message clarity
postdml.go
Add parameter string conversion for fulltext postDML operations
pkg/sql/colexec/postdml/postdml.go
• Added catalog.IndexParamsStrToJsonParamString conversion for fulltext algorithm parameters
• Enhanced parameter handling for fulltext index operations
• Improved SQL generation for fulltext insert operations
index_metadata.go
Add structured parameter conversion for index metadata operations
pkg/sql/colexec/index_metadata.go
• Added conversion of index algorithm parameters to JSON format using catalog.MustIndexParams
• Enhanced index metadata batch building with proper parameter serialization
• Improved error handling and parameter validation for index operations
build_show_util.go
Replace string-based parameter parsing with structured IndexParams
pkg/sql/plan/build_show_util.go
• Replaced catalog.IndexParamsStringToMap with catalog.TryToIndexParams for parameter parsing
• Updated parser type extraction to use structured parameter methods
• Enhanced parameter list generation with ToStringList() method
• Improved error handling for index parameter operations
catalog.go
Add index parameter validation and conversion in catalog cache
pkg/vm/engine/disttae/cache/catalog.go
• Added parameter conversion and validation for index algorithms using catalog.TryConvertToIndexParams
• Enhanced error handling with fatal logging for invalid algorithm parameters
• Improved index parameter consistency across table definitions
sql.go
Update IVFFLAT SQL operations to use new configuration interface
pkg/vectorindex/ivfflat/sql.go
• Updated function signature to use IndexTableCfg interface instead of IndexTableConfig
• Modified SQL queries to use new configuration accessor methods
• Enhanced consistency with other vectorindex components
11 files
index_params_test.go
Add comprehensive test coverage for binary index parameters
pkg/catalog/index_params_test.go
• Added comprehensive test suite for new binary index parameter system
• Implemented tests for fulltext, IVFFLAT, and HNSW parameter types
and conversions
• Added validation tests for JSON-to-binary parameter
conversion functions
• Included edge case testing for invalid
parameters and error conditions
cache_test.go
Update cache tests to use new IndexTableCfg interface
pkg/vectorindex/cache/cache_test.go
• Updated type name from IndexTableConfig to IndexTableCfg in mock structs
• Replaced direct struct initialization with BuildIndexTableCfgV1 function calls
• Changed field access from direct field access to method calls (e.g., tblcfg.IndexTable())
hnsw_create_test.go
Update HNSW create tests with parameter validation
pkg/sql/colexec/table_function/hnsw_create_test.go
• Added parameter validation in test case creation using catalog.IndexAlgoJsonParamStringToIndexParams
• Updated test configuration to use BuildIndexTableCfgV1 function
• Modified test failure scenarios to validate parameters during construction
index_table_cfg_test.go
Add comprehensive tests for IndexTableCfgV1 interface
pkg/vectorindex/index_table_cfg_test.go
• Added comprehensive test suite for new IndexTableCfgV1 interface
• Tests cover ExtraIVFCfgV1, IndexTableCfgV1 construction, and JSON serialization
• Includes validation of all configuration fields and methods
ivf_search_test.go
Update IVF search tests with parameter validation
pkg/sql/colexec/table_function/ivf_search_test.go
• Added parameter validation in test construction using catalog.IndexAlgoJsonParamStringToIndexParams
• Updated mock functions to use IndexTableCfg interface
• Modified test configuration to use BuildIVFIndexTableCfgV1 function
• Updated test failure scenarios to validate parameters during construction
hnsw_search_test.go
Update HNSW search tests with parameter validation
pkg/sql/colexec/table_function/hnsw_search_test.go
• Added parameter validation in test construction using catalog.IndexAlgoJsonParamStringToIndexParams
• Updated mock functions to use IndexTableCfg interface
• Modified test configuration to use BuildIndexTableCfgV1 function
• Updated test failure scenarios to validate parameters during construction
fulltext_tokenize_test.go
Update fulltext tokenize tests with parameter validation
pkg/sql/colexec/table_function/fulltext_tokenize_test.go
• Added parameter validation in test construction using catalog.IndexAlgoJsonParamStringToIndexParams
• Updated test configuration to use structured parameter handling
• Added proper error handling for parameter validation
build_test.go
Update HNSW build tests to use new configuration builder
pkg/vectorindex/hnsw/build_test.go
• Updated test configuration to use vectorindex.BuildIndexTableCfgV1 instead of struct literals
• Modified test setup to use new configuration builder function
• Enhanced test parameter specification for better consistency
search_test.go
Update IVFFLAT search tests to use configuration builder
pkg/vectorindex/ivfflat/search_test.go
• Updated test configuration to use vectorindex.BuildIndexTableCfgV1 builder function
• Replaced empty struct initialization with proper configuration setup
• Enhanced test consistency across multiple test functions
search_test.go
Update HNSW search tests to use new configuration interface
pkg/vectorindex/hnsw/search_test.go
• Updated test configuration to use vectorindex.BuildIndexTableCfgV1 builder function
• Modified search function calls to use new configuration accessor methods
• Enhanced test setup consistency and parameter handling
vector_hnsw.result
Update HNSW test results to include quantization parameter
test/distributed/cases/vector/vector_hnsw.result
• Updated test results to include quantization 'F32' parameter in HNSW index definitions
• Enhanced test output consistency with new index parameter format
• Maintained existing test functionality while updating expected results
4 files
secondary_index_utils.go
Remove JSON utility functions for index parameters
pkg/catalog/secondary_index_utils.go
• Removed multiple JSON-related utility functions for index parameters
• Eliminated IndexParamsToStringList, IndexParamsToJsonString, and related functions
• Removed manual parameter validation and conversion logic
• Simplified interface by removing JSON marshaling/unmarshaling operations
txn_database.go
Refactor database creation code structure and error handling
pkg/vm/engine/disttae/txn_database.go
• Refactored variable declarations and error handling patterns for better readability
• Changed variable naming from m to mp for memory pool consistency
• Improved function parameter formatting and early return patterns
• Enhanced error handling with early returns instead of nested conditions
tuplesGen.go
Refactor column generation code structure and formatting
pkg/catalog/tuplesGen.go
• Improved function parameter formatting and variable declarations
• Enhanced code structure with better variable initialization patterns
• Refactored column generation logic for better readability
• Maintained existing functionality while improving code organization
apply_indices_fulltext.go
Simplify fulltext index tree expression creation
pkg/sql/plan/apply_indices_fulltext.go
• Simplified tree expression creation by removing explicit type parameters
• Updated tree.NewNumVal calls to use type inference
• Maintained existing functionality while improving code conciseness
6 files
engine.go
Improve database engine code formatting and structure
pkg/vm/engine/disttae/engine.go
• Improved function parameter formatting for database creation
operations
• Enhanced code readability with better line breaks and
parameter alignment
• Maintained existing functionality while
improving code structure
types.go
Improve engine types code formatting and readability
pkg/vm/engine/types.go
• Improved code formatting with better line breaks and parameter
alignment
• Enhanced readability of table definition conversion logic
• Maintained existing functionality while improving code structure
cache.go
Improve vector index cache code formatting
pkg/vectorindex/cache/cache.go
• Improved function parameter formatting for vector index cache search
operations
• Enhanced code readability with better parameter alignment
• Maintained existing functionality while improving code structure
compile2.go
Improve compile function parameter formatting
pkg/sql/compile/compile2.go
• Improved function parameter formatting for compile operations
• Enhanced code readability with better line breaks
• Maintained existing functionality while improving code structure
sql.go
Improve fulltext SQL function parameter formatting
pkg/fulltext/sql.go
• Improved function parameter formatting for pattern-to-SQL conversion
• Enhanced code readability with better parameter alignment
• Maintained existing functionality while improving code structure
txn.go
Improve transaction write batch parameter formatting
pkg/vm/engine/disttae/txn.go
• Improved function parameter formatting for transaction write batch
operations
• Enhanced code readability with better parameter alignment
• Maintained existing functionality while improving code structure
1 files
apple
Update test file content with whitespace block
pkg/vectorindex/hnsw/apple
• Replaced empty file content with a large block of whitespace
characters
• This appears to be a test file or placeholder with no
functional code
1 files