Skip to content

Conversation

XuPeng-SH
Copy link
Contributor

@XuPeng-SH XuPeng-SH commented Aug 19, 2025

User description

What type of PR is this?

  • API-change
  • BUG
  • Improvement
  • Documentation
  • Feature
  • Test and CI
  • Code Refactoring

Which issue(s) this PR fixes:

issue #22337

What this PR does / why we need it:

Avoid JSON marshal|unmarshal index params and table config for each search to improve QPS

image image

PR Type

Enhancement, Tests


Description

Performance optimization: Replaced JSON marshal/unmarshal operations with efficient binary-encoded index parameters and table configurations to improve fulltext and vector search QPS
New binary parameter system: Introduced comprehensive IndexParams structures for IVFFLAT, HNSW, and fulltext algorithms with type-safe enums and validation
Unified configuration interface: Created IndexTableCfgV1 with fixed-size binary layout for efficient storage and retrieval of index metadata
Code refactoring: Eliminated JSON-based parameter handling across DDL operations, search functions, and table creation workflows
Enhanced type safety: Added structured parameter conversion utilities and validation functions to replace string-based parameter manipulation
Comprehensive testing: Added extensive test coverage for new binary parameter system and updated existing tests to use new interfaces
Code formatting improvements: Applied consistent formatting and structure improvements across multiple files


Diagram Walkthrough

flowchart LR
  A["JSON Parameters"] --> B["Binary IndexParams"]
  C["String-based Config"] --> D["IndexTableCfgV1"]
  B --> E["DDL Operations"]
  B --> F["Search Functions"]
  D --> G["Vector Index Cache"]
  D --> H["Table Functions"]
  E --> I["Improved QPS"]
  F --> I
  G --> I
  H --> I
Loading

File Walkthrough

Relevant files
Enhancement
28 files
index_params.go
Add binary-encoded index parameters system for performance
optimization

pkg/catalog/index_params.go

• Added comprehensive binary-encoded index parameter types and
structures for fulltext, IVFFLAT, and HNSW algorithms
• Implemented
efficient binary serialization/deserialization methods to avoid JSON
marshal/unmarshal overhead
• Added conversion utilities between AST
trees, JSON strings, and binary IndexParams format
• Introduced
type-safe enums and validation for parser types, algorithm types, and
quantization types

+865/-0 
build_ddl.go
Integrate binary index parameters in DDL operations           

pkg/sql/plan/build_ddl.go

• Updated index creation logic to use new AstTreeToIndexParams
function instead of IndexParamsToJsonString
• Modified fulltext and
secondary index builders to store binary IndexParams instead of JSON
strings
• Replaced default IVFFLAT parameter generation with
DefaultIVFFLATV1Params() function
• Applied code formatting
improvements with better line breaks and error handling

+208/-86
index_table_cfg.go
Add binary-encoded index table configuration system           

pkg/vectorindex/index_table_cfg.go

• Created new binary-encoded configuration system for index table
metadata
• Implemented IndexTableCfgV1 with fixed-size binary layout
for efficient storage and retrieval
• Added specialized ExtraIVFCfgV1
for IVFFLAT-specific configuration parameters
• Provided conversion
utilities between JSON strings and binary configuration format

+619/-0 
ddl.go
Update DDL compilation to use binary index parameters       

pkg/sql/compile/ddl.go

• Updated ALTER TABLE index operations to use new binary IndexParams
system
• Replaced JSON-based parameter manipulation with type-safe
binary operations
• Enhanced error handling and logging with more
descriptive messages throughout table creation
• Improved code
formatting and structure for better readability

+140/-74
hnsw.go
Integrate HNSW operations with binary index parameters     

pkg/sql/plan/hnsw.go

• Updated HNSW build and search functions to use new
TryConvertToIndexParams utility
• Replaced direct parameter usage with
binary IndexParams conversion for consistency
• Maintained existing
functionality while integrating with new parameter system

+17/-2   
util.go
Eliminate JSON operations for index parameters in SQL compilation

pkg/sql/compile/util.go

• Removed JSON marshal/unmarshal operations for index parameters

Added proper parameter validation using catalog.MustIndexParams

Replaced direct JSON operations with structured parameter handling

Updated SQL generation to use parameter string methods

+99/-43 
ddl_index_algo.go
Refactor DDL index algorithm to avoid JSON operations       

pkg/sql/compile/ddl_index_algo.go

• Replaced JSON parameter parsing with catalog.MustIndexParams calls

Removed manual string-to-int conversions for index parameters

Updated IVF index handling to use structured parameter objects

Replaced JSON marshal operations with BuildIVFIndexTableCfgV1 function

+73/-68 
ivf_search.go
Refactor IVF search to eliminate JSON parameter operations

pkg/sql/colexec/table_function/ivf_search.go

• Replaced JSON parameter parsing with InitIVFCfgFromParam function

Updated to use IndexTableCfgV1 instead of JSON unmarshaling
• Changed
field access to method calls for table configuration
• Removed manual
parameter validation and conversion logic

+79/-52 
hnsw_search.go
Refactor HNSW search to eliminate JSON parameter operations

pkg/sql/colexec/table_function/hnsw_search.go

• Added InitHNSWCfgFromParam function for parameter initialization

Replaced JSON parameter parsing with structured parameter handling

Updated to use IndexTableCfgV1 interface instead of JSON operations

Removed manual string-to-int conversions for HNSW parameters

+71/-62 
ivf_create.go
Refactor IVF create to eliminate JSON parameter operations

pkg/sql/colexec/table_function/ivf_create.go

• Replaced JSON parameter parsing with InitIVFCfgFromParam function

Updated to use IndexTableCfgV1 interface with method calls
• Removed
manual parameter validation and string conversion logic
• Simplified
clustering function to use structured configuration

+41/-62 
fulltext_tokenize.go
Refactor fulltext tokenize to eliminate JSON parameter operations

pkg/sql/colexec/table_function/fulltext_tokenize.go

• Added InitFulltextCfgFromParam function for parameter initialization

• Replaced JSON parameter parsing with structured parameter handling

Improved error handling and code organization
• Simplified parameter
validation logic

+46/-35 
hnsw_create.go
Refactor HNSW create to eliminate JSON parameter operations

pkg/sql/colexec/table_function/hnsw_create.go

• Replaced JSON parameter parsing with InitHNSWCfgFromParam function

Updated to use IndexTableCfgV1 interface instead of JSON operations

Removed manual parameter validation and string conversion logic

Simplified configuration initialization process

+29/-67 
fulltext.go
Refactor fulltext plan to use structured parameters           

pkg/sql/plan/fulltext.go

• Updated fulltext parameter handling to use
catalog.TryConvertToIndexParams
• Replaced JSON parameter parsing with
structured parameter objects
• Modified SQL generation to use
parameter type methods
• Improved parameter validation and error
handling

+42/-18 
build.go
Update HNSW build to use IndexTableCfg interface                 

pkg/vectorindex/hnsw/build.go

• Updated function signatures to use IndexTableCfg interface
• Changed
field access from direct fields to method calls
• Modified SQL
generation to use configuration methods
• Updated constructor to
accept new interface type

+15/-10 
fulltext.go
Replace JSON parsing with IndexParams for fulltext configuration

pkg/sql/colexec/table_function/fulltext.go

• Replaced JSON marshal/unmarshal with direct catalog.IndexParams
parsing for fulltext index parameters
• Added InitFulltextCfgFromParam
function to parse fulltext configuration from binary parameters

Refactored error handling to use moerr.NewInvalidInputNoCtxf instead
of context-based errors
• Improved function parameter formatting and
variable extraction

+61/-14 
build_dml_util.go
Replace string-based index params with structured IndexParams parsing

pkg/sql/plan/build_dml_util.go

• Replaced catalog.IndexParamsStringToMap with
catalog.TryToIndexParams for index parameter parsing
• Added
validation for IVFFLAT algorithm parameters using
IVFFLATAlgo().IsValid()
• Improved function parameter formatting and
line breaks for better readability
• Enhanced error handling for
invalid algorithm parameters

+33/-10 
apply_indices_hnsw.go
Replace JSON config with structured HNSW index configuration

pkg/sql/plan/apply_indices_hnsw.go

• Replaced JSON-based table configuration with
vectorindex.BuildIndexTableCfgV1 function
• Updated HNSW parameter
parsing to use catalog.TryToIndexParams instead of string maps

Removed dependency on encoding/json package
• Simplified tree
expression creation by removing type parameters

+22/-18 
search.go
Update IVFFLAT search to use new table configuration interface

pkg/vectorindex/ivfflat/search.go

• Updated interface from IndexTableConfig to IndexTableCfg for table
configuration
• Modified SQL queries to use new configuration accessor
methods like DBName() and IndexTable()
• Updated function signatures
to use the new configuration interface
• Enhanced parameter handling
for IVFFLAT search operations

+22/-7   
search.go
Update HNSW search to use new table configuration interface

pkg/vectorindex/hnsw/search.go

• Updated interface from IndexTableConfig to IndexTableCfg for
consistency
• Modified SQL queries to use new configuration accessor
methods
• Updated function signatures and constructor calls to use new
interface
• Enhanced HNSW search configuration handling

+17/-6   
apply_indices_ivfflat.go
Replace JSON config with structured IVFFLAT index configuration

pkg/sql/plan/apply_indices_ivfflat.go

• Replaced JSON-based configuration with
vectorindex.BuildIVFIndexTableCfgV1 function
• Updated parameter
parsing to use catalog.TryToIndexParams instead of string maps

Removed dependency on encoding/json package
• Enhanced IVFFLAT
algorithm validation and parameter extraction

+23/-23 
ivfflat.go
Add structured parameter conversion for IVFFLAT operations

pkg/sql/plan/ivfflat.go

• Added parameter conversion using catalog.TryConvertToIndexParams for
IVFFLAT algorithms
• Enhanced parameter validation and error handling
for index creation and search
• Added proper algorithm name
specification for parameter conversion
• Improved function parameter
formatting

+23/-3   
types.go
Replace metric type maps with structured conversion function

pkg/vectorindex/metric/types.go

• Replaced map-based metric type conversion with GetIVFMetricType
function
• Added proper error handling for invalid metric types

Removed unused map variables and simplified metric type resolution

Enhanced type safety with explicit validation

+17/-21 
product_l2.go
Replace metric type map lookup with structured function call

pkg/sql/colexec/productl2/product_l2.go

• Replaced map-based metric type lookup with metric.GetIVFMetricType
function
• Enhanced error handling with proper validation of metric
types
• Improved variable declaration patterns and error message
clarity

+6/-7     
postdml.go
Add parameter string conversion for fulltext postDML operations

pkg/sql/colexec/postdml/postdml.go

• Added catalog.IndexParamsStrToJsonParamString conversion for
fulltext algorithm parameters
• Enhanced parameter handling for
fulltext index operations
• Improved SQL generation for fulltext
insert operations

+12/-3   
index_metadata.go
Add structured parameter conversion for index metadata operations

pkg/sql/colexec/index_metadata.go

• Added conversion of index algorithm parameters to JSON format using
catalog.MustIndexParams
• Enhanced index metadata batch building with
proper parameter serialization
• Improved error handling and parameter
validation for index operations

+9/-1     
build_show_util.go
Replace string-based parameter parsing with structured IndexParams

pkg/sql/plan/build_show_util.go

• Replaced catalog.IndexParamsStringToMap with
catalog.TryToIndexParams for parameter parsing
• Updated parser type
extraction to use structured parameter methods
• Enhanced parameter
list generation with ToStringList() method
• Improved error handling
for index parameter operations

+4/-6     
catalog.go
Add index parameter validation and conversion in catalog cache

pkg/vm/engine/disttae/cache/catalog.go

• Added parameter conversion and validation for index algorithms using
catalog.TryConvertToIndexParams
• Enhanced error handling with fatal
logging for invalid algorithm parameters
• Improved index parameter
consistency across table definitions

+20/-0   
sql.go
Update IVFFLAT SQL operations to use new configuration interface

pkg/vectorindex/ivfflat/sql.go

• Updated function signature to use IndexTableCfg interface instead of
IndexTableConfig
• Modified SQL queries to use new configuration
accessor methods
• Enhanced consistency with other vectorindex
components

+3/-3     
Tests
11 files
index_params_test.go
Add comprehensive test coverage for binary index parameters

pkg/catalog/index_params_test.go

• Added comprehensive test suite for new binary index parameter system

• Implemented tests for fulltext, IVFFLAT, and HNSW parameter types
and conversions
• Added validation tests for JSON-to-binary parameter
conversion functions
• Included edge case testing for invalid
parameters and error conditions

+386/-0 
cache_test.go
Update cache tests to use new IndexTableCfg interface       

pkg/vectorindex/cache/cache_test.go

• Updated type name from IndexTableConfig to IndexTableCfg in mock
structs
• Replaced direct struct initialization with
BuildIndexTableCfgV1 function calls
• Changed field access from direct
field access to method calls (e.g., tblcfg.IndexTable())

+107/-22
hnsw_create_test.go
Update HNSW create tests with parameter validation             

pkg/sql/colexec/table_function/hnsw_create_test.go

• Added parameter validation in test case creation using
catalog.IndexAlgoJsonParamStringToIndexParams
• Updated test
configuration to use BuildIndexTableCfgV1 function
• Modified test
failure scenarios to validate parameters during construction

+66/-31 
index_table_cfg_test.go
Add comprehensive tests for IndexTableCfgV1 interface       

pkg/vectorindex/index_table_cfg_test.go

• Added comprehensive test suite for new IndexTableCfgV1 interface

Tests cover ExtraIVFCfgV1, IndexTableCfgV1 construction, and JSON
serialization
• Includes validation of all configuration fields and
methods

+244/-0 
ivf_search_test.go
Update IVF search tests with parameter validation               

pkg/sql/colexec/table_function/ivf_search_test.go

• Added parameter validation in test construction using
catalog.IndexAlgoJsonParamStringToIndexParams
• Updated mock functions
to use IndexTableCfg interface
• Modified test configuration to use
BuildIVFIndexTableCfgV1 function
• Updated test failure scenarios to
validate parameters during construction

+49/-33 
hnsw_search_test.go
Update HNSW search tests with parameter validation             

pkg/sql/colexec/table_function/hnsw_search_test.go

• Added parameter validation in test construction using
catalog.IndexAlgoJsonParamStringToIndexParams
• Updated mock functions
to use IndexTableCfg interface
• Modified test configuration to use
BuildIndexTableCfgV1 function
• Updated test failure scenarios to
validate parameters during construction

+39/-30 
fulltext_tokenize_test.go
Update fulltext tokenize tests with parameter validation 

pkg/sql/colexec/table_function/fulltext_tokenize_test.go

• Added parameter validation in test construction using
catalog.IndexAlgoJsonParamStringToIndexParams
• Updated test
configuration to use structured parameter handling
• Added proper
error handling for parameter validation

+7/-1     
build_test.go
Update HNSW build tests to use new configuration builder 

pkg/vectorindex/hnsw/build_test.go

• Updated test configuration to use vectorindex.BuildIndexTableCfgV1
instead of struct literals
• Modified test setup to use new
configuration builder function
• Enhanced test parameter specification
for better consistency

+22/-10 
search_test.go
Update IVFFLAT search tests to use configuration builder 

pkg/vectorindex/ivfflat/search_test.go

• Updated test configuration to use vectorindex.BuildIndexTableCfgV1
builder function
• Replaced empty struct initialization with proper
configuration setup
• Enhanced test consistency across multiple test
functions

+33/-3   
search_test.go
Update HNSW search tests to use new configuration interface

pkg/vectorindex/hnsw/search_test.go

• Updated test configuration to use vectorindex.BuildIndexTableCfgV1
builder function
• Modified search function calls to use new
configuration accessor methods
• Enhanced test setup consistency and
parameter handling

+18/-2   
vector_hnsw.result
Update HNSW test results to include quantization parameter

test/distributed/cases/vector/vector_hnsw.result

• Updated test results to include quantization 'F32' parameter in HNSW
index definitions
• Enhanced test output consistency with new index
parameter format
• Maintained existing test functionality while
updating expected results

+15/-15 
Code refactoring
4 files
secondary_index_utils.go
Remove JSON utility functions for index parameters             

pkg/catalog/secondary_index_utils.go

• Removed multiple JSON-related utility functions for index parameters

• Eliminated IndexParamsToStringList, IndexParamsToJsonString, and
related functions
• Removed manual parameter validation and conversion
logic
• Simplified interface by removing JSON marshaling/unmarshaling
operations

+2/-199 
txn_database.go
Refactor database creation code structure and error handling

pkg/vm/engine/disttae/txn_database.go

• Refactored variable declarations and error handling patterns for
better readability
• Changed variable naming from m to mp for memory
pool consistency
• Improved function parameter formatting and early
return patterns
• Enhanced error handling with early returns instead
of nested conditions

+35/-24 
tuplesGen.go
Refactor column generation code structure and formatting 

pkg/catalog/tuplesGen.go

• Improved function parameter formatting and variable declarations

Enhanced code structure with better variable initialization patterns

Refactored column generation logic for better readability
• Maintained
existing functionality while improving code organization

+23/-17 
apply_indices_fulltext.go
Simplify fulltext index tree expression creation                 

pkg/sql/plan/apply_indices_fulltext.go

• Simplified tree expression creation by removing explicit type
parameters
• Updated tree.NewNumVal calls to use type inference

Maintained existing functionality while improving code conciseness

+5/-6     
Formatting
6 files
engine.go
Improve database engine code formatting and structure       

pkg/vm/engine/disttae/engine.go

• Improved function parameter formatting for database creation
operations
• Enhanced code readability with better line breaks and
parameter alignment
• Maintained existing functionality while
improving code structure

+24/-5   
types.go
Improve engine types code formatting and readability         

pkg/vm/engine/types.go

• Improved code formatting with better line breaks and parameter
alignment
• Enhanced readability of table definition conversion logic

• Maintained existing functionality while improving code structure

+19/-10 
cache.go
Improve vector index cache code formatting                             

pkg/vectorindex/cache/cache.go

• Improved function parameter formatting for vector index cache search
operations
• Enhanced code readability with better parameter alignment

• Maintained existing functionality while improving code structure

+7/-2     
compile2.go
Improve compile function parameter formatting                       

pkg/sql/compile/compile2.go

• Improved function parameter formatting for compile operations

Enhanced code readability with better line breaks
• Maintained
existing functionality while improving code structure

+2/-1     
sql.go
Improve fulltext SQL function parameter formatting             

pkg/fulltext/sql.go

• Improved function parameter formatting for pattern-to-SQL conversion

• Enhanced code readability with better parameter alignment

Maintained existing functionality while improving code structure

+7/-1     
txn.go
Improve transaction write batch parameter formatting         

pkg/vm/engine/disttae/txn.go

• Improved function parameter formatting for transaction write batch
operations
• Enhanced code readability with better parameter alignment

• Maintained existing functionality while improving code structure

+2/-1     
Miscellaneous
1 files
apple
Update test file content with whitespace block                     

pkg/vectorindex/hnsw/apple

• Replaced empty file content with a large block of whitespace
characters
• This appears to be a test file or placeholder with no
functional code

[link]   
Additional files
1 files
types.go +0/-38   

Copy link

You are nearing your monthly Qodo Merge usage quota. For more information, please visit here.

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
🧪 PR contains tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Validation Robustness

Several getters (e.g., IVFFLATList/Algo, HNSW fields) only check buffer length against field offsets, not full size; ensure bounds checks prevent partial buffers from decoding beyond slice length. Also confirm TryConvertToIndexParams correctly distinguishes JSON vs binary to avoid false positives.

func (params IndexParams) IVFFLATList() int64 {
	if len(params) < IndexParams_IVFFLATV1_ListOff {
		return 0
	}
	return types.DecodeFixed[int64](params[IndexParams_IVFFLATV1_ListOff:])
}

func (params IndexParams) IVFFLATAlgo() IndexParamAlgoType {
	if len(params) < IndexParams_IVFFLATV1_AlgoOff {
		return IndexParamAlgoType_Invalid
	}
	return IndexParamAlgoType(types.DecodeFixed[uint16](params[IndexParams_IVFFLATV1_AlgoOff:]))
}

func (params IndexParams) SetIVFFLATList(list int64) {
	copy(params[IndexParams_IVFFLATV1_ListOff:], types.EncodeFixed(list))
}

// ------------------------[HNSW V1 PARAMS] IndexParams------------------------
const (
	IndexParams_HNSWV1_MOff              = IndexParams_HeaderLen
	IndexParams_HNSWV1_MLen              = 8 // int64
	IndexParams_HNSWV1_EfConstructionOff = IndexParams_HNSWV1_MOff + IndexParams_HNSWV1_MLen
	IndexParams_HNSWV1_EfConstructionLen = 8 // int64
	IndexParams_HNSWV1_EfSearchOff       = IndexParams_HNSWV1_EfConstructionOff + IndexParams_HNSWV1_EfConstructionLen
	IndexParams_HNSWV1_EfSearchLen       = 8 // int64
	IndexParams_HNSWV1_QuantizationOff   = IndexParams_HNSWV1_EfSearchOff + IndexParams_HNSWV1_EfSearchLen
	IndexParams_HNSWV1_QuantizationLen   = 2 // uint16
	IndexParams_HNSWV1_AlgoOff           = IndexParams_HNSWV1_QuantizationOff + IndexParams_HNSWV1_QuantizationLen
	IndexParams_HNSWV1_AlgoLen           = 2 // uint16
	IndexParams_HNSWV1_Size              = IndexParams_HNSWV1_AlgoOff + IndexParams_HNSWV1_AlgoLen
)

func BuildIndexParamsHNSWV1(
	algo IndexParamAlgoType,
	m int64, // optional
	efConstruction int64, // optional
	efSearch int64, // optional
	quantization IndexParamQuantizationType, // optional
) IndexParams {
	buf := make([]byte, IndexParams_HNSWV1_Size)
	copy(buf, IndexParamMagicNumberBuf)
	copy(buf[IndexParams_TypeOff:], IndexParamTypeHNSWV1Buf)
	copy(buf[IndexParams_VersionOff:], types.EncodeFixed[uint16](1))
	copy(buf[IndexParams_HNSWV1_MOff:], types.EncodeFixed(m))
	copy(buf[IndexParams_HNSWV1_EfConstructionOff:], types.EncodeFixed(efConstruction))
	copy(buf[IndexParams_HNSWV1_EfSearchOff:], types.EncodeFixed(efSearch))
	copy(buf[IndexParams_HNSWV1_QuantizationOff:], types.EncodeFixed(uint16(quantization)))
	copy(buf[IndexParams_HNSWV1_AlgoOff:], types.EncodeFixed(uint16(algo)))
	return buf
}

func (params IndexParams) HNSWM() int64 {
	if len(params) < IndexParams_HNSWV1_MOff {
		return 0
	}
	return types.DecodeFixed[int64](params[IndexParams_HNSWV1_MOff:])
}

func (params IndexParams) HNSWEfConstruction() int64 {
	if len(params) < IndexParams_HNSWV1_EfConstructionOff {
		return 0
	}
	return types.DecodeFixed[int64](params[IndexParams_HNSWV1_EfConstructionOff:])
}

func (params IndexParams) HNSWEfSearch() int64 {
	if len(params) < IndexParams_HNSWV1_EfSearchOff {
		return 0
	}
	return types.DecodeFixed[int64](params[IndexParams_HNSWV1_EfSearchOff:])
}

func (params IndexParams) HNSWQuantization() IndexParamQuantizationType {
	if len(params) < IndexParams_HNSWV1_QuantizationOff {
		return IndexParamQuantizationType_Invalid
	}
	return IndexParamQuantizationType(types.DecodeFixed[uint16](params[IndexParams_HNSWV1_QuantizationOff:]))
}

func (params IndexParams) HNSWAlgo() IndexParamAlgoType {
	if len(params) < IndexParams_HNSWV1_AlgoOff {
		return IndexParamAlgoType_Invalid
	}
	return IndexParamAlgoType(types.DecodeFixed[uint16](params[IndexParams_HNSWV1_AlgoOff:]))
}
Logging Consistency

New log keys/messages introduced; verify they align with observability conventions and won’t break log parsing/alerts. Ensure sensitive data like SQL strings are safe to log where added.

// convert the plan's defs to the execution's defs
exeDefs, extra, err := engine.PlanDefsToExeDefs(qry.GetTableDef())
if err != nil {
	c.proc.Error(
		c.proc.Ctx,
		"create-table-get-exe-defs-failed",
		zap.String("databaseName", c.db),
		zap.String("tableName", qry.GetTableDef().GetName()),
		zap.Error(err),
	)
	return err
}

dbName := c.db
if qry.GetDatabase() != "" {
	dbName = qry.GetDatabase()
}
tblName := qry.GetTableDef().GetName()

if err = lockMoDatabase(c, dbName, lock.LockMode_Shared); err != nil {
	c.proc.Error(
		c.proc.Ctx,
		"lock-database-failed",
		zap.String("databaseName", c.db),
		zap.Error(err),
	)
	return err
}

dbSource, err := c.e.Database(c.proc.Ctx, dbName, c.proc.GetTxnOperator())
if err != nil {
	if dbName == "" {
		return moerr.NewNoDB(c.proc.Ctx)
	}
	return convertDBEOB(c.proc.Ctx, err, dbName)
}

exists, err := dbSource.RelationExists(c.proc.Ctx, tblName, nil)
if err != nil {
	c.proc.Error(
		c.proc.Ctx,
		"check-table-relation-exists-failed",
		zap.String("databaseName", c.db),
		zap.String("tableName", tblName),
		zap.Error(err),
	)
	return err
}
if exists {
	if qry.GetIfNotExists() {
		return nil
	}
	return moerr.NewTableAlreadyExists(c.proc.Ctx, tblName)
}

// check in EntireEngine.TempEngine, notice that TempEngine may not init
if c.e.HasTempEngine() {
	var tmpDBSource engine.Database
	if tmpDBSource, err = c.e.Database(
		c.proc.Ctx,
		defines.TEMPORARY_DBNAME,
		c.proc.GetTxnOperator(),
	); err == nil {
		exists, err := tmpDBSource.RelationExists(c.proc.Ctx, engine.GetTempTableName(dbName, tblName), nil)
		if err != nil {
			c.proc.Error(
				c.proc.Ctx,
				"temp-table-exists-check-failed",
				zap.String("db-name", dbName),
				zap.String("table-name", tblName),
				zap.Error(err),
			)
			return err
		}
		if exists {
			if qry.GetIfNotExists() {
				return nil
			}
			return moerr.NewTableAlreadyExists(c.proc.Ctx, fmt.Sprintf("temporary '%s'", tblName))
		}
	}
}

if err = lockMoTable(c, dbName, tblName, lock.LockMode_Exclusive); err != nil {
	c.proc.Error(
		c.proc.Ctx,
		"lock-main-table-failed",
		zap.String("databaseName", c.db),
		zap.String("tableName", qry.GetTableDef().GetName()),
		zap.Error(err),
	)
	return err
}

if len(qry.IndexTables) > 0 {
	for _, def := range qry.IndexTables {
		id, err := c.e.AllocateIDByKey(c.proc.Ctx, "")
		if err != nil {
			return err
		}
		def.TblId = id
		extra.IndexTables = append(extra.IndexTables, id)
	}
}

if err = dbSource.Create(
	context.WithValue(
		c.proc.Ctx,
		defines.SqlKey{}, c.sql,
	),
	tblName,
	append(exeCols, exeDefs...),
); err != nil {
	c.proc.Error(
		c.proc.Ctx,
		"create-main-table-failed",
		zap.String("databaseName", c.db),
		zap.String("tableName", qry.GetTableDef().GetName()),
		zap.Error(err),
	)
	return err
}

//update mo_foreign_keys
for _, sql := range qry.UpdateFkSqls {
	err = c.runSql(sql)
	if err != nil {
		return err
	}
}

// handle fk that refers to others tables
fkDbs := qry.GetFkDbs()
if len(fkDbs) > 0 {
	fkTables := qry.GetFkTables()
	//get the relation of created table above again.
	//due to the colId may be changed.
	newRelation, err := dbSource.Relation(c.proc.Ctx, tblName, nil)
	if err != nil {
		c.proc.Info(
			c.proc.Ctx,
			"createTable",
			zap.String("databaseName", c.db),
			zap.String("tableName", qry.GetTableDef().GetName()),
			zap.Error(err),
		)
		return err
	}
	tblId := newRelation.GetTableID(c.proc.Ctx)

	newTableDef, err := newRelation.TableDefs(c.proc.Ctx)
	if err != nil {
		c.proc.Info(
			c.proc.Ctx,
			"create-table-get-table-def-failed",
			zap.String("databaseName", c.db),
			zap.String("tableName", qry.GetTableDef().GetName()),
			zap.Error(err),
		)
		return err
	}

	oldCt := GetConstraintDefFromTableDefs(newTableDef)
	//get the columnId of the column from newTableDef
	var colNameToId = make(map[string]uint64)
	for _, def := range newTableDef {
		if attr, ok := def.(*engine.AttributeDef); ok {
			colNameToId[strings.ToLower(attr.Attr.Name)] = attr.Attr.ID
		}
	}
	//old colId -> colName
	colId2Name := make(map[uint64]string)
	for _, col := range planCols {
		colId2Name[col.ColId] = col.Name
	}
	dedupFkName := make(plan2.UnorderedSet[string])
	//1. update fk info in child table.
	//column ids of column names in child table have changed after
	//the table is created by engine.Database.Create.
	//refresh column ids of column names in child table.
	newFkeys := make([]*plan.ForeignKeyDef, len(qry.GetTableDef().Fkeys))
	for i, fkey := range qry.GetTableDef().Fkeys {
		if dedupFkName.Find(fkey.Name) {
			return moerr.NewInternalErrorf(c.proc.Ctx, "deduplicate fk name %s", fkey.Name)
		}
		dedupFkName.Insert(fkey.Name)
		newDef := &plan.ForeignKeyDef{
			Name:        fkey.Name,
			Cols:        make([]uint64, len(fkey.Cols)),
			ForeignTbl:  fkey.ForeignTbl,
			ForeignCols: make([]uint64, len(fkey.ForeignCols)),
			OnDelete:    fkey.OnDelete,
			OnUpdate:    fkey.OnUpdate,
		}
		copy(newDef.ForeignCols, fkey.ForeignCols)

		//if it is fk self, the parent table is same as the child table.
		//refresh the ForeignCols also.
		if fkey.ForeignTbl == 0 {
			for j, colId := range fkey.ForeignCols {
				//old colId -> colName
				colName := colId2Name[colId]
				//colName -> new colId
				newDef.ForeignCols[j] = colNameToId[colName]
			}
		}

		//refresh child table column id
		for idx, colName := range qry.GetFkCols()[i].Cols {
			newDef.Cols[idx] = colNameToId[colName]
		}
		newFkeys[i] = newDef
	}
	// remove old fk settings
	newCt, err := MakeNewCreateConstraint(oldCt, &engine.ForeignKeyDef{
		Fkeys: newFkeys,
	})
	if err != nil {
		c.proc.Info(c.proc.Ctx, "createTable",
			zap.String("databaseName", c.db),
			zap.String("tableName", qry.GetTableDef().GetName()),
			zap.Error(err),
		)
		return err
	}
	err = newRelation.UpdateConstraint(c.proc.Ctx, newCt)
	if err != nil {
		c.proc.Info(c.proc.Ctx, "createTable",
			zap.String("databaseName", c.db),
			zap.String("tableName", qry.GetTableDef().GetName()),
			zap.Error(err),
		)
		return err
	}

	//2. need to append TableId to parent's TableDef.RefChildTbls
	for i, fkTableName := range fkTables {
		fkDbName := fkDbs[i]
		fkey := qry.GetTableDef().Fkeys[i]
		if fkey.ForeignTbl == 0 {
			//fk self refer
			//add current table to parent's children table
			err = AddChildTblIdToParentTable(c.proc.Ctx, newRelation, 0)
			if err != nil {
				c.proc.Info(c.proc.Ctx, "createTable",
					zap.String("databaseName", c.db),
					zap.String("tableName", qry.GetTableDef().GetName()),
					zap.Error(err),
				)
				return err
			}
			continue
		}
		fkDbSource, err := c.e.Database(c.proc.Ctx, fkDbName, c.proc.GetTxnOperator())
		if err != nil {
			c.proc.Info(c.proc.Ctx, "createTable",
				zap.String("databaseName", c.db),
				zap.String("tableName", qry.GetTableDef().GetName()),
				zap.Error(err),
			)
			return err
		}
		fkRelation, err := fkDbSource.Relation(c.proc.Ctx, fkTableName, nil)
		if err != nil {
			c.proc.Info(c.proc.Ctx, "createTable",
				zap.String("databaseName", c.db),
				zap.String("tableName", qry.GetTableDef().GetName()),
				zap.Error(err),
			)
			return err
		}
		//add current table to parent's children table
		err = AddChildTblIdToParentTable(c.proc.Ctx, fkRelation, tblId)
		if err != nil {
			c.proc.Info(c.proc.Ctx, "createTable",
				zap.String("databaseName", c.db),
				zap.String("tableName", qry.GetTableDef().GetName()),
				zap.Error(err),
			)
			return err
		}
	}
}

// handle fk forward reference
fkRefersToMe := qry.GetFksReferToMe()
if len(fkRefersToMe) > 0 {
	//1. get the relation of created table above again.
	//get the relation of created table above again.
	//due to the colId may be changed.
	newRelation, err := dbSource.Relation(c.proc.Ctx, tblName, nil)
	if err != nil {
		c.proc.Info(
			c.proc.Ctx,
			"create-table-get-relation-failed",
			zap.String("databaseName", c.db),
			zap.String("tableName", qry.GetTableDef().GetName()),
			zap.Error(err),
		)
		return err
	}
	tblId := newRelation.GetTableID(c.proc.Ctx)

	newTableDef, err := newRelation.TableDefs(c.proc.Ctx)
	if err != nil {
		c.proc.Info(
			c.proc.Ctx,
			"create-table-get-table-def-failed",
			zap.String("databaseName", c.db),
			zap.String("tableName", qry.GetTableDef().GetName()),
			zap.Error(err),
		)
		return err
	}
	//get the columnId of the column from newTableDef
	var colNameToId = make(map[string]uint64)
	for _, def := range newTableDef {
		if attr, ok := def.(*engine.AttributeDef); ok {
			colNameToId[strings.ToLower(attr.Attr.Name)] = attr.Attr.ID
		}
	}
	//1.1 update the column id of the column names in this table.
	//2. update fk info in the child table.
	for _, info := range fkRefersToMe {
		//update foreignCols in fk
		newDef := &plan.ForeignKeyDef{
			Name:        info.Def.Name,
			Cols:        make([]uint64, len(info.Def.Cols)),
			ForeignTbl:  tblId,
			ForeignCols: make([]uint64, len(info.Def.ForeignCols)),
			OnDelete:    info.Def.OnDelete,
			OnUpdate:    info.Def.OnUpdate,
		}
		//child table column ids of the child table
		copy(newDef.Cols, info.Def.Cols)
		//parent table column ids of the parent table
		for j, colReferred := range info.ColsReferred.Cols {
			//colName -> new colId
			if id, has := colNameToId[colReferred]; has {
				newDef.ForeignCols[j] = id
			} else {
				err := moerr.NewInternalErrorf(
					c.proc.Ctx, "no-column-%s", colReferred,
				)
				c.proc.Info(
					c.proc.Ctx,
					"create-table-no-column-failed",
					zap.String("databaseName", c.db),
					zap.String("tableName", qry.GetTableDef().GetName()),
					zap.Error(err),
				)
				return err
			}
		}

		// add the fk def into the child table
		childDb, err := c.e.Database(c.proc.Ctx, info.Db, c.proc.GetTxnOperator())
		if err != nil {
			c.proc.Info(
				c.proc.Ctx,
				"create-table-get-child-db-failed",
				zap.String("databaseName", c.db),
				zap.String("tableName", qry.GetTableDef().GetName()),
				zap.Error(err),
			)
			return err
		}
		childTable, err := childDb.Relation(c.proc.Ctx, info.Table, nil)
		if err != nil {
			c.proc.Info(
				c.proc.Ctx,
				"create-table-get-child-table-failed",
				zap.String("databaseName", c.db),
				zap.String("tableName", qry.GetTableDef().GetName()),
				zap.Error(err),
			)
			return err
		}
		if err = AddFkeyToRelation(c.proc.Ctx, childTable, newDef); err != nil {
			c.proc.Error(
				c.proc.Ctx,
				"create-table-add-fk-to-child-table-failed",
				zap.String("databaseName", c.db),
				zap.String("tableName", qry.GetTableDef().GetName()),
				zap.Error(err),
			)
			return err
		}
		// add the child table id -- tblId into the current table -- refChildDef
		if err = AddChildTblIdToParentTable(
			c.proc.Ctx, newRelation, childTable.GetTableID(c.proc.Ctx),
		); err != nil {
			c.proc.Error(
				c.proc.Ctx,
				"create-table-add-child-table-id-to-parent-table-failed",
				zap.String("databaseName", c.db),
				zap.String("tableName", qry.GetTableDef().GetName()),
				zap.Error(err),
			)
			return err
		}
	}
}

// build index table
main, err := dbSource.Relation(c.proc.Ctx, tblName, nil)
if err != nil {
	c.proc.Error(
		c.proc.Ctx,
		"create-table-get-main-table-failed",
		zap.String("databaseName", c.db),
		zap.String("tableName", qry.GetTableDef().GetName()),
		zap.Error(err),
	)
	return err
}

var indexExtra *api.SchemaExtra
for i, def := range qry.IndexTables {
	planCols = def.GetCols()
	exeCols = engine.PlanColsToExeCols(planCols)
	if exeDefs, indexExtra, err = engine.PlanDefsToExeDefs(def); err != nil {
		c.proc.Error(
			c.proc.Ctx,
			"create-table-get-exe-defs-failed",
			zap.String("databaseName", c.db),
			zap.String("tableName", qry.GetTableDef().GetName()),
			zap.Error(err),
		)
		return err
	}

	exists, err := dbSource.RelationExists(c.proc.Ctx, def.Name, nil)
	if err != nil {
		c.proc.Error(
			c.proc.Ctx,
			"create-table-check-index-relation-exists-failed",
			zap.String("databaseName", c.db),
			zap.String("tableName", def.GetName()),
			zap.Error(err),
		)
		return err
	}
	if exists {
		return moerr.NewTableAlreadyExists(c.proc.Ctx, def.Name)
	}

	def.TblId = extra.IndexTables[i]
	indexExtra.FeatureFlag |= features.IndexTable
	indexExtra.ParentTableID = main.GetTableID(c.proc.Ctx)

	if err = dbSource.Create(
		context.WithValue(c.proc.Ctx, defines.TableIDKey{}, def.TblId),
		def.Name,
		append(exeCols, exeDefs...),
	); err != nil {
		c.proc.Error(
			c.proc.Ctx,
			"create-table-create-index-table-failed",
			zap.String("databaseName", c.db),
			zap.String("tableName", qry.GetTableDef().GetName()),
			zap.Error(err),
		)
		return err
	}

	if err = maybeCreateAutoIncrement(
		c.proc.Ctx,
		c.proc.GetService(),
		dbSource,
		def,
		c.proc.GetTxnOperator(),
		nil,
	); err != nil {
		c.proc.Error(
			c.proc.Ctx,
			"create-table-maybe-create-auto-increment-failed",
			zap.String("databaseName", c.db),
			zap.String("tableName", qry.GetTableDef().GetName()),
			zap.String("index tableName", def.Name),
			zap.Error(err),
		)
		return err
	}

	var initSQL string
	switch def.TableType {
	case catalog.SystemSI_IVFFLAT_TblType_Metadata:
		initSQL = fmt.Sprintf(
			"INSERT INTO `%s`.`%s` (`%s`, `%s`) VALUES('version', '0');",
			qry.Database,
			def.Name,
			catalog.SystemSI_IVFFLAT_TblCol_Metadata_key,
			catalog.SystemSI_IVFFLAT_TblCol_Metadata_val,
		)

	case catalog.SystemSI_IVFFLAT_TblType_Centroids:
		initSQL = fmt.Sprintf(
			"INSERT INTO `%s`.`%s` (`%s`, `%s`, `%s`) VALUES(0,1,NULL);",
			qry.Database,
			def.Name,
			catalog.SystemSI_IVFFLAT_TblCol_Centroids_version,
			catalog.SystemSI_IVFFLAT_TblCol_Centroids_id,
			catalog.SystemSI_IVFFLAT_TblCol_Centroids_centroid,
		)
	}
	if err = c.runSql(initSQL); err != nil {
		c.proc.Error(
			c.proc.Ctx,
			"create-index-table-for-execute-initSQL",
			zap.String("databaseName", c.db),
			zap.String("tableName", qry.GetTableDef().GetName()),
			zap.String("index tableName", def.Name),
			zap.String("initSQL", initSQL),
			zap.Error(err),
		)
		return err
	}
}

if checkIndexInitializable(dbName, tblName) {
	newRelation, err := dbSource.Relation(c.proc.Ctx, tblName, nil)
	if err != nil {
		c.proc.Error(
			c.proc.Ctx,
			"check-indexes-initializable",
			zap.String("databaseName", c.db),
			zap.String("tableName", qry.GetTableDef().GetName()),
			zap.Error(err),
		)
		return err
	}
	if err = s.checkTableWithValidIndexes(c, newRelation); err != nil {
		c.proc.Error(
			c.proc.Ctx,
			"check-table-with-valid-indexes",
			zap.String("databaseName", c.db),
			zap.String("tableName", qry.GetTableDef().GetName()),
			zap.Error(err),
		)
		return err
	}

	var insertSQL string
	if insertSQL, err = makeInsertMultiIndexSQL(
		c.e, c.proc.Ctx, c.proc, dbSource, newRelation,
	); err != nil {
		c.proc.Error(
			c.proc.Ctx,
			"make-insert-multi-index-sql",
			zap.String("databaseName", c.db),
			zap.String("tableName", qry.GetTableDef().GetName()),
			zap.Error(err),
		)
		return err
	}
	if err = c.runSql(insertSQL); err != nil {
		c.proc.Error(
			c.proc.Ctx,
			"run-insert-multi-index-sql",
			zap.String("insertSQL", insertSQL),
			zap.String("dbName0", dbName),
			zap.String("tblName0", tblName),
			zap.String("databaseName", c.db),
			zap.String("tableName", qry.GetTableDef().GetName()),
			zap.Error(err),
		)
		return err
	}
Index Param Serialization

When inserting into mo_indexes, algorithm_params now derive from binary params via ToJsonParamString; confirm round-trip compatibility with consumers expecting JSON and that empty/bad params do not produce ambiguous empty strings.

					fmt.Fprintf(buffer, "'%s', ", algorithm_table_type)

					//8. algorithm_params
					if len(indexDef.IndexAlgoParams) > 0 {
						params := catalog.MustIndexParams(indexDef.IndexAlgoParams)
						fmt.Fprintf(buffer, "'%s', ", params.ToJsonParamString())
					} else {
						fmt.Fprintf(buffer, "'%s', ", EMPTY_STRING)
					}

					// 9. index visible
					fmt.Fprintf(buffer, "%d, ", INDEX_VISIBLE_YES)

					// 10. index vec_hidden
					fmt.Fprintf(buffer, "%d, ", INDEX_HIDDEN_NO)

					// 11. index vec_comment
					fmt.Fprintf(buffer, "'%s', ", indexDef.Comment)

					// 12. index vec_column_name
					fmt.Fprintf(buffer, "'%s', ", getOriginName(part))

					// 13. index vec_ordinal_position
					fmt.Fprintf(buffer, "%d, ", i+1)

					// 14. index vec_options
					if indexDef.Option != nil {
						if indexDef.Option.ParserName != "" {
							fmt.Fprintf(
								buffer,
								"'parser=%s,ngram_token_size=%d', ",
								indexDef.Option.ParserName,
								indexDef.Option.NgramTokenSize,
							)
						}
					} else {
						fmt.Fprintf(buffer, "%s, ", NULL_VALUE)
					}

					// 15. index vec_index_table
					if indexDef.TableExist {
						fmt.Fprintf(buffer, "'%s')", indexDef.IndexTableName)
					} else {
						fmt.Fprintf(buffer, "%s)", NULL_VALUE)
					}
				}
			}
		case *engine.PrimaryKeyDef:
			ctx, cancelFunc := context.WithTimeoutCause(
				proc.Ctx, time.Second*30, moerr.CauseGenInsertMOIndexesSql2,
			)
			index_id, err := eg.AllocateIDByKey(ctx, ALLOCID_INDEX_KEY)
			cancelFunc()
			if err != nil {
				return "", moerr.AttachCause(ctx, err)
			}
			if def.Pkey.PkeyColName != catalog.FakePrimaryKeyColName {
				for i, colName := range def.Pkey.Names {
					//1. index id
					if isFirst {
						fmt.Fprintf(buffer, "(%d, ", index_id)
						isFirst = false
					} else {
						fmt.Fprintf(buffer, ", (%d, ", index_id)
					}

					//2. table_id
					fmt.Fprintf(buffer, "%d, ", tableId)

					// 3. databaseId
					fmt.Fprintf(buffer, "%s, ", databaseId)

					// 4.index.IndexName
					fmt.Fprintf(buffer, "'%s', ", "PRIMARY")

					// 5.index_type
					fmt.Fprintf(buffer, "'%s', ", INDEX_TYPE_PRIMARY)

					//6. algorithm
					fmt.Fprintf(buffer, "'%s', ", EMPTY_STRING)

					//7. algorithm_table_type
					fmt.Fprintf(buffer, "'%s', ", EMPTY_STRING)

					//8. algorithm_params
					fmt.Fprintf(buffer, "'%s', ", EMPTY_STRING)

					//9. index visible
					fmt.Fprintf(buffer, "%d, ", INDEX_VISIBLE_YES)

					// 10. index vec_hidden
					fmt.Fprintf(buffer, "%d, ", INDEX_HIDDEN_NO)

					// 11. index vec_comment
					fmt.Fprintf(buffer, "'%s', ", EMPTY_STRING)

					// 12. index vec_column_name
					fmt.Fprintf(buffer, "'%s', ", getOriginName(colName))

					// 13. index vec_ordinal_position
					fmt.Fprintf(buffer, "%d, ", i+1)

					// 14. index vec_options
					fmt.Fprintf(buffer, "%s, ", NULL_VALUE)

					// 15. index vec_index_table
					fmt.Fprintf(buffer, "%s)", NULL_VALUE)
				}
			}
		}
	}
	buffer.WriteString(";")
	return buffer.String(), nil
}

// makeInsertSingleIndexSQL: make index metadata information sql for a single index object
func makeInsertSingleIndexSQL(eg engine.Engine, proc *process.Process, databaseId string, tableId uint64, idxdef *plan.IndexDef, tableDef *plan.TableDef) (string, error) {
	if idxdef == nil {
		return "", nil
	}
	ct := &engine.ConstraintDef{
		Cts: []engine.Constraint{
			&engine.IndexDef{
				Indexes: []*plan.IndexDef{idxdef},
			},
		},
	}
	insertMoIndexesSql, err := genInsertMOIndexesSql(eg, proc, databaseId, tableId, ct, tableDef)
	if err != nil {
		return "", err
	}
	return insertMoIndexesSql, nil
}

// makeInsertMultiIndexSQL :Synchronize the index metadata information of the table to the index metadata table
func makeInsertMultiIndexSQL(
	eng engine.Engine,
	ctx context.Context,
	proc *process.Process,
	dbSource engine.Database,
	relation engine.Relation,
) (string, error) {
	if dbSource == nil || relation == nil {
		return "", nil
	}
	databaseId := dbSource.GetDatabaseId(ctx)
	tableId := relation.GetTableID(ctx)
	tableDef := relation.GetTableDef(ctx)

	ct, err := GetConstraintDef(ctx, relation)
	if err != nil {
		return "", err
	}
	if ct == nil {
		return "", nil
	}

	hasIndex := false
	for _, constraint := range ct.Cts {
		if idxdef, ok := constraint.(*engine.IndexDef); ok && len(idxdef.Indexes) > 0 {
			hasIndex = true
			break
		}
		if pkdef, ok := constraint.(*engine.PrimaryKeyDef); ok {
			if pkdef.Pkey.PkeyColName != catalog.FakePrimaryKeyColName {
				hasIndex = true
				break
			}
		}
	}
	if !hasIndex {
		return "", nil
	}

	return genInsertMOIndexesSql(
		eng, proc, databaseId, tableId, ct, tableDef,
	)
}

Copy link

You are nearing your monthly Qodo Merge usage quota. For more information, please visit here.

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Correct buffer bounds checking

Fix length checks to compare against the end offset inclusive of field length.
Using start offset alone can pass for undersized buffers and trigger
out-of-bounds reads.

pkg/catalog/index_params.go [300-305]

 func (params IndexParams) IVFFLATList() int64 {
-	if len(params) < IndexParams_IVFFLATV1_ListOff {
+	end := IndexParams_IVFFLATV1_ListOff + IndexParams_IVFFLATV1_ListLen
+	if len(params) < end {
 		return 0
 	}
 	return types.DecodeFixed[int64](params[IndexParams_IVFFLATV1_ListOff:])
 }
  • Apply / Chat
Suggestion importance[1-10]: 9

__

Why: The suggestion correctly points out a potential out-of-bounds read due to an incorrect length check. The proposed fix of checking against the end of the field (offset + length) is the standard and safe way to prevent such panics.

High
Prevent out-of-bounds in accessors

Apply proper end-of-field bounds checks for all accessors (e.g., HNSWAlgo,
HNSWQuantization, HNSWM, etc.). Current checks use start offsets and can read
past buffer end on malformed inputs.

pkg/catalog/index_params.go [380-385]

 func (params IndexParams) HNSWAlgo() IndexParamAlgoType {
-	if len(params) < IndexParams_HNSWV1_AlgoOff {
+	end := IndexParams_HNSWV1_AlgoOff + IndexParams_HNSWV1_AlgoLen
+	if len(params) < end {
 		return IndexParamAlgoType_Invalid
 	}
 	return IndexParamAlgoType(types.DecodeFixed[uint16](params[IndexParams_HNSWV1_AlgoOff:]))
 }
  • Apply / Chat
Suggestion importance[1-10]: 9

__

Why: The suggestion correctly identifies a potential out-of-bounds read due to an incorrect length check, which is a critical bug. The proposed fix of checking against the end of the field (offset + length) is correct and prevents panics on malformed input.

High
Replace unsafe panic with safe parse

Avoid calling MustIndexParams in non-test code as it can panic on malformed
input. Use a safe conversion that returns an error to keep SHOW CREATE TABLE
robust against bad stored params. Propagate the error instead of crashing the
server.

pkg/sql/plan/build_show_util.go [242-246]

 if indexdef.IndexAlgoParams != "" {
-	paramList, err := catalog.MustIndexParams(indexdef.IndexAlgoParams).ToStringList()
+	params, err := catalog.TryToIndexParams(indexdef.IndexAlgoParams)
+	if err != nil {
+		return "", nil, err
+	}
+	paramList, err := params.ToStringList()
 	if err != nil {
 		return "", nil, err
 	}
  • Apply / Chat
Suggestion importance[1-10]: 9

__

Why: The suggestion correctly identifies that MustIndexParams can cause a panic if the stored parameters are malformed, which would crash the server. Replacing it with TryToIndexParams and proper error handling is a critical robustness improvement.

High
Prevent panic on bad index params

Avoid MustIndexParams here to prevent panic on invalid index params during
metadata write. Use a fallible parser and return the error so DDL can fail
gracefully instead of crashing the process.

pkg/sql/colexec/index_metadata.go [319-330]

 if len(index.IndexAlgoParams) == 0 {
 	err = vector.AppendBytes(vec_algo_params, []byte(""), true, proc.Mp())
 } else {
-	params := catalog.MustIndexParams(index.IndexAlgoParams)
+	params, perr := catalog.TryToIndexParams(index.IndexAlgoParams)
+	if perr != nil {
+		return nil, perr
+	}
 	rawParams := params.ToJsonParamString()
 	err = vector.AppendBytes(
 		vec_algo_params,
 		util.UnsafeStringToBytes(rawParams),
 		false,
 		proc.Mp(),
 	)
 }
  • Apply / Chat
Suggestion importance[1-10]: 9

__

Why: This suggestion correctly points out the use of MustIndexParams in a DDL operation path, which could panic and crash the server on malformed input. The proposed change to use TryToIndexParams and return an error is a critical fix for server stability.

High
Guard against empty JSON params

Validate that newIndexParams.ToJsonParamString() never returns an empty string
before persisting to mo_catalog.mo_indexes. If it does, fall back to
alterIndex.IndexAlgoParams or raise an error to avoid writing invalid params and
corrupting metadata.

pkg/sql/compile/ddl.go [697-706]

-updateSql := fmt.Sprintf(
-							updateMoIndexesAlgoParams,
-							newIndexParams.ToJsonParamString(),
-							oTableDef.TblId,
-							alterIndex.IndexName,
-						)
-						// PXU TODO: Test
-						if err = c.runSql(updateSql); err != nil {
-							return err
-						}
+jsonParams := newIndexParams.ToJsonParamString()
+if jsonParams == "" {
+	return moerr.NewInternalErrorNoCtx("failed to serialize new index params to JSON")
+}
+updateSql := fmt.Sprintf(updateMoIndexesAlgoParams, jsonParams, oTableDef.TblId, alterIndex.IndexName)
+if err = c.runSql(updateSql); err != nil {
+	return err
+}
  • Apply / Chat
Suggestion importance[1-10]: 8

__

Why: The suggestion correctly identifies that newIndexParams.ToJsonParamString() can return an empty string, which could lead to persisting invalid data. Adding a check improves the robustness and prevents potential metadata corruption.

Medium
Validate index table config parsing

Casting arbitrary string to IndexTableCfgV1 without validation risks panics when
accessed. Use the provided safe conversion that validates magic/version and
structure to prevent corrupt config usage. Return an error on invalid JSON or
mismatched version.

pkg/sql/colexec/table_function/ivf_search.go [196-201]

 cfgstr := cfgVec.GetStringAt(0)
 if len(cfgstr) == 0 {
-  err = moerr.NewInternalError(proc.Ctx, "IndexTableConfig is empty")
-  return
+  return moerr.NewInternalError(proc.Ctx, "IndexTableConfig is empty")
 }
-u.tblcfg = vectorindex.IndexTableCfgV1(cfgstr)
+tblcfg, err := vectorindex.TryConvertIndexTableCfgV1(cfgstr, true)
+if err != nil {
+  return err
+}
+u.tblcfg = tblcfg
  • Apply / Chat
Suggestion importance[1-10]: 8

__

Why: This suggestion correctly identifies an unsafe string-to-byte-slice cast that could lead to a panic, and proposes using a safe conversion function that is used elsewhere in the PR, improving robustness.

Medium
Avoid panic on invalid index params

MustIndexParams will panic on invalid IndexAlgoParams, crashing metadata writes.
Replace with a fallible conversion and return a wrapped error so DDL fails
gracefully instead of panicking. Only emit JSON when conversion succeeds.

pkg/sql/compile/util.go [259-264]

 if len(indexDef.IndexAlgoParams) > 0 {
-  params := catalog.MustIndexParams(indexDef.IndexAlgoParams)
+  params, err := catalog.TryConvertToIndexParams(catalog.IndexParamAlgoName(indexDef.IndexAlgoName), indexDef.IndexAlgoParams)
+  if err != nil {
+    return "", moerr.NewInvalidInputf(proc.Ctx, "invalid index algo params: %v", err)
+  }
   fmt.Fprintf(buffer, "'%s', ", params.ToJsonParamString())
 } else {
   fmt.Fprintf(buffer, "'%s', ", EMPTY_STRING)
 }
  • Apply / Chat
Suggestion importance[1-10]: 8

__

Why: The suggestion correctly identifies that using MustIndexParams can lead to a panic with invalid input, and proposes a safer alternative that returns an error, which is a significant improvement for code robustness.

Medium
High-level
Ensure backward compatibility and migration

The PR changes the on-disk/in-catalog representation of index params and index
table config from JSON strings to custom binary blobs, and now writes binary
into plan/index metadata while reading may still expect JSON in external
consumers. Validate and document a safe migration path (online
upgrade/rollback), including version gating, dual-write/dual-read shims, and
compatibility for existing catalogs and cached entries. Without this, clusters
upgrading or tools relying on mo_indexes could break or show inconsistent
behavior.

Examples:

pkg/catalog/index_params.go [815-825]
func TryConvertToIndexParams(
	algo string,
	paramStr string,
) (params IndexParams, err error) {
	params = IndexParams(util.UnsafeStringToBytes(paramStr))
	if params.IsEmpty() || params.IsValid() {
		return
	}
	params, err = IndexAlgoJsonParamStringToIndexParams(algo, paramStr)
	return

 ... (clipped 1 lines)
pkg/vm/engine/disttae/cache/catalog.go [834-852]
	for _, index := range indexes {
		if index.IndexAlgo != "" {
			params, err := catalog.TryConvertToIndexParams(
				index.IndexAlgo,
				index.IndexAlgoParams,
			)
			if err != nil {
				logutil.Fatal(
					"invalid-algo-param",
					zap.String("table", tblItem.Name),

 ... (clipped 9 lines)

Solution Walkthrough:

Before:

// Simplified logic assuming only one format (e.g., JSON)
func process_index_params(json_params string) {
    var params map[string]string
    json.Unmarshal([]byte(json_params), &params)
    // ... logic using map
}

After:

// Logic to handle both binary and JSON formats
func process_index_params(param_string string) {
    // Try parsing as new binary format first
    binary_params := IndexParams(param_string)
    if binary_params.IsValid() {
        // ... logic using binary_params accessor methods
        return
    }

    // Fallback to parsing as old JSON format
    var params map[string]string
    json.Unmarshal([]byte(param_string), &params)
    // ... logic using map
}
Suggestion importance[1-10]: 8

__

Why: The suggestion raises a critical architectural concern about backward compatibility and data migration, which is vital when changing data formats and is not explicitly addressed in the PR description.

Medium
General
Add safe map lookup guard

Guard the map lookup on metric.DistFuncOpTypes to avoid a false positive when
the function name is not present. Without the ok check, an empty string
comparison may incorrectly pass for invalid inputs.

pkg/sql/plan/apply_indices_hnsw.go [57-68]

 params, err := catalog.TryToIndexParams(idxdef.IndexAlgoParams)
 if err != nil {
 	return false
 }
 if !params.HNSWAlgo().IsValid() {
 	return false
 }
-
 optype := params.HNSWAlgo().String()
-if optype != metric.DistFuncOpTypes[distFnExpr.Func.ObjName] {
+distFn, ok := metric.DistFuncOpTypes[distFnExpr.Func.ObjName]
+if !ok {
+	return false
+}
+if optype != distFn {
 	return false
 }
  • Apply / Chat
Suggestion importance[1-10]: 8

__

Why: The suggestion correctly identifies an unsafe map lookup on metric.DistFuncOpTypes. Accessing a non-existent key would return a zero value (empty string), which could lead to incorrect behavior. Adding a check for key existence makes the logic more robust and correct.

Medium
  • More

@XuPeng-SH XuPeng-SH requested a review from cpegeric August 19, 2025 10:34
Copy link
Contributor

@cpegeric cpegeric left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

think carefully. This hardcoded configuration is no way back method...

Copy link
Contributor

@fengttt fengttt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this PR is about the json parser is too slow. The official go json parser is indeed very slow. But there are many json parsers that is much much faster.

There is simdjson, and sonic from bytedance. Actually I think we already used segmentio.

Do we know if say, segmentio or sonic will be good enough?

@XuPeng-SH
Copy link
Contributor Author

I assume this PR is about the json parser is too slow. The official go json parser is indeed very slow. But there are many json parsers that is much much faster.

There is simdjson, and sonic from bytedance. Actually I think we already used segmentio.

Do we know if say, segmentio or sonic will be good enough?

Tried sonic, performance was twice as fast as Go's json, but still slow. Tried simdjson, but it was quite troublesome—maybe I didn't use it correctly—performance wasn't great

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/enhancement Review effort 4/5 size/XXL Denotes a PR that changes 2000+ lines
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants