Skip to content

24.8 Backport of #79969 - Fix match(col, '^…') index analysis with escaped metacharacters to avoid wrong results and crashes #850

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: customizations/24.8.14
Choose a base branch
from

Conversation

arthurpassos
Copy link
Collaborator

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Previously, hitting an unsupported “\x” escape in ^… patterns would do pos = end but still run a stray ++pos, so the code walked past the regex buffer, pulled in garbage bytes (the “�i” etc), and could OOM.

Now that stray increment is removed and all single-char escapes are handled correctly.

Backports are needed.

See example of the problem in fiddle: https://fiddle.clickhouse.com/ef60eb3a-8613-4127-b5b4-bb62fc274748

Another symptom is a very quick memory allocations (up to OOM) with a traces like below:

std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>::push_back(char)
DB::KeyCondition::$_20::__invoke(DB::KeyCondition::RPNElement&, DB::Field const&)
DB::KeyCondition::extractAtomFromTree(DB::RPNBuilderTreeNode const&, DB::KeyCondition::RPNElement&)
DB::RPNBuilder<DB::KeyCondition::RPNElement>::traverseTree(DB::RPNBuilderTreeNode const&)
DB::RPNBuilder<DB::KeyCondition::RPNElement>::traverseTree(DB::RPNBuilderTreeNode const&)
DB::KeyCondition::KeyCondition(DB::ActionsDAG const*, std::__1::shared_ptr<DB::Context const>, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>> const&, std::__1::shared_ptr<DB::ExpressionActions> const&, bool)
DB::buildIndexes(std::__1::optional<DB::ReadFromMergeTree::Indexes>&, DB::ActionsDAG const*, DB::MergeTreeData const&, std::__1::vector<std::__1::shared_ptr<DB::IMergeTreeDataPart const>, std::__1::allocator<std::__1::shared_ptr<DB::IMergeTreeDataPart const>>> const&, std::__1::shared_ptr<DB::MergeTreeData::IMutationsSnapshot const> const&, std::__1::shared_ptr<DB::Context const> const&, DB::SelectQueryInfo const&, std::__1::shared_ptr<DB::StorageInMemoryMetadata const> const&, std::__1::shared_ptr<Poco::Logger> const&)
DB::ReadFromMergeTree::applyFilters(DB::ActionDAGNodes)
DB::QueryPlanOptimizations::optimizePrimaryKeyConditionAndLimit(std::__1::vector<DB::QueryPlanOptimizations::Frame, std::__1::allocator<DB::QueryPlanOptimizations::Frame>> const&)
DB::QueryPlanOptimizations::optimizeTreeSecondPass(DB::QueryPlanOptimizationSettings const&, DB::QueryPlan::Node&, std::__1::list<DB::QueryPlan::Node, std::__1::allocator<DB::QueryPlan::Node>>&)
DB::QueryPlan::optimize(DB::QueryPlanOptimizationSettings const&)
DB::QueryPlan::buildQueryPipeline(DB::QueryPlanOptimizationSettings const&, DB::BuildQueryPipelineSettings const&)
DB::InterpreterSelectQueryAnalyzer::buildQueryPipeline()
DB::InterpreterSelectQueryAnalyzer::execute()
DB::executeQueryImpl(char const*, char const*, std::__1::shared_ptr<DB::Context>, DB::QueryFlags, DB::QueryProcessingStage::Enum, DB::ReadBuffer*)
DB::executeQuery(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::shared_ptr<DB::Context>, DB::QueryFlags, DB::QueryProcessingStage::Enum)
DB::TCPHandler::runImpl()
DB::TCPHandler::run()
Poco::Net::TCPServerConnection::start()
Poco::Net::TCPServerDispatcher::run()
Poco::PooledThread::run()
Poco::ThreadImpl::runnableEntry(void*)

Documentation entry for user-facing changes

...

Exclude tests:

  • Fast test
  • Integration Tests
  • Stateless tests
  • Stateful tests
  • Performance tests
  • All with ASAN
  • All with TSAN
  • All with MSAN
  • All with UBSAN
  • All with Coverage
  • All with Aarch64
  • All Regression
  • Disable CI Cache

filimonov and others added 3 commits June 10, 2025 13:13
- Change loop to `while (pos < end)` to guard against overruns
- Support all single‐char escapes: `\]`, `\}`, `\-`, `\\` alongside existing metachars
- Move `++pos` into the supported‐escape case only, removing stray increments
- Prevent garbage reads past `end` and ensure correct fixed‐prefix for escaped metacharacters
Co-authored-by: Pervakov Grigorii <pervakov.grigory@gmail.com>
@altinity-robot
Copy link
Collaborator

altinity-robot commented Jun 10, 2025

This is an automated comment for commit 8aee8d5 with description of existing statuses. It's updated for the latest CI running

❌ Click here to open a full report in a separate page

Check nameDescriptionStatus
AST fuzzerRuns randomly generated queries to catch program errors. The build type is optionally given in parenthesis. If it fails, ask a maintainer for help❌ failure
Integration testsThe integration tests report. In parenthesis the package type is given, and in square brackets are the optional part/total tests❌ error
Regression aarch64 LDAP authenticationThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS❌ failure
Regression aarch64 LDAP external_user_directoryThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS❌ failure
Sign aarch64There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS❌ error
Sign releaseThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS❌ error
Stateful testsRuns stateful functional tests for ClickHouse binaries built in various configurations -- release, debug, with sanitizers, etc❌ error
Stateless testsRuns stateless functional tests for ClickHouse binaries built in various configurations -- release, debug, with sanitizers, etc❌ failure
Successful checks
Check nameDescriptionStatus
BuildsThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Compatibility checkChecks that clickhouse binary runs on distributions with old libc versions. If it fails, ask a maintainer for help✅ success
Docker keeper imageThe check to build and optionally push the mentioned image to docker hub✅ success
Docker server imageThe check to build and optionally push the mentioned image to docker hub✅ success
Grype Scan altinityinfra/clickhouse-keeper:850-24.8.14.10045.altinitytestThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Grype Scan altinityinfra/clickhouse-server:850-24.8.14.10045.altinitytest-alpineThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Grype Scan altinityinfra/clickhouse-server:850-24.8.14.10045.altinitytestThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Install packagesChecks that the built packages are installable in a clear environment✅ success
Ready for releaseThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression aarch64 Alter move partitionThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression aarch64 Benchmark aws_s3There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression aarch64 Benchmark gcsThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression aarch64 Benchmark minioThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression aarch64 Clickhouse Keeper SSLThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression aarch64 LDAP role_mappingThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression aarch64 Parquet aws_s3There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression aarch64 Parquet minioThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression aarch64 ParquetThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression aarch64 S3 gcsThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression aarch64 Tiered Storage minioThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression aarch64 Tiered Storage s3amazonThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression aarch64 Tiered Storage s3gcsThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression aarch64 aes_encryptionThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression aarch64 atomic_insertThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression aarch64 base_58There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression aarch64 clickhouse_keeperThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression aarch64 data_typesThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression aarch64 datetime64_extended_rangeThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression aarch64 disk_level_encryptionThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression aarch64 dnsThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression aarch64 enginesThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression aarch64 exampleThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression aarch64 extended_precision_data_typesThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression aarch64 kafkaThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression aarch64 kerberosThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression aarch64 key_valueThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression aarch64 memoryThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression aarch64 part_moves_between_shardsThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression aarch64 rbacThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression aarch64 selectsThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression aarch64 session_timezoneThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression aarch64 tiered_storageThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression aarch64 window_functionsThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release Alter move partitionThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release Alter replace partitionThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release Benchmark aws_s3There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release Benchmark gcsThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release Benchmark minioThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release Clickhouse Keeper SSLThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release LDAP authenticationThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release LDAP external_user_directoryThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release LDAP role_mappingThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release Parquet aws_s3There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release Parquet minioThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release ParquetThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release S3 gcsThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release Tiered Storage minioThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release Tiered Storage s3amazonThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release Tiered Storage s3gcsThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release aes_encryptionThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release atomic_insertThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release base_58There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release data_typesThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release datetime64_extended_rangeThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release disk_level_encryptionThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release dnsThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release enginesThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release exampleThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release extended_precision_data_typesThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release kafkaThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release kerberosThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release key_valueThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release lightweight_deleteThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release memoryThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release part_moves_between_shardsThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release rbacThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release selectsThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release session_timezoneThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release tiered_storageThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Regression release window_functionsThere's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS✅ success
Stress testRuns stateless functional tests concurrently from several clients to detect concurrency-related errors✅ success

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants