Skip to content

Feature Limitation: Correlated subquery depth limitation (only 1 level supported) #23110

@dengn

Description

@dengn

Issue: Correlated Subquery Depth Limitation in Boundary Cases

Description

MatrixOne has a strict limitation on the depth of correlated subqueries (IN and EXISTS). When a correlated subquery exceeds 1 level of depth, MatrixOne returns an error indicating that deeper levels will be supported in future versions. This limitation affects various query patterns, especially in boundary test cases.

Error Message

ERROR 20102 (HY000): correlated columns in EXISTS subquery deeper than 1 level will be supported in future version is not yet implemented
ERROR 20102 (HY000): correlated columns in IN subquery deeper than 1 level will be supported in future version is not yet implemented

Affected Cases

This issue affects 2 test cases from our boundary testing on the industry_radar_test database.

Related Table DDL

Table: company

CREATE TABLE `company` (
  `id` int NOT NULL,
  `full_name` varchar(200) DEFAULT NULL,
  `short_name` varchar(128) DEFAULT NULL,
  `update_time` datetime NOT NULL DEFAULT current_timestamp() ON UPDATE current_timestamp() COMMENT '最后修改时间',
  `address` varchar(255) DEFAULT NULL,
  `board_id` int DEFAULT NULL,
  `business_scope` text DEFAULT NULL,
  `city` varchar(32) DEFAULT NULL,
  `company_profile` text DEFAULT NULL,
  `company_type_tags` varchar(255) DEFAULT NULL,
  `country` varchar(20) DEFAULT NULL,
  `directory_label` varchar(512) DEFAULT NULL,
  `district` varchar(32) DEFAULT NULL,
  `email` varchar(255) DEFAULT NULL,
  `english_name` varchar(255) DEFAULT NULL,
  `enterprises_type` varchar(50) DEFAULT NULL,
  `founded_time` datetime(6) DEFAULT NULL,
  `industry` varchar(32) DEFAULT NULL,
  `industry_paqu` varchar(1000) DEFAULT NULL,
  `insured_num` varchar(32) DEFAULT NULL,
  `product_label` varchar(512) DEFAULT NULL,
  `province` varchar(32) DEFAULT NULL,
  `registered_capital` varchar(50) DEFAULT NULL,
  `representative` varchar(32) DEFAULT NULL,
  `taxpayer_num` varchar(50) DEFAULT NULL,
  `telephone` varchar(255) DEFAULT NULL,
  `website` varchar(500) DEFAULT NULL,
  `industry_web_label` varchar(512) DEFAULT NULL COMMENT '国标行业标签',
  `industry_domain_label` varchar(512) DEFAULT NULL COMMENT '素问产业链标签',
  `scale` varchar(255) DEFAULT NULL COMMENT '规模标签',
  `policy_type` varchar(512) DEFAULT NULL COMMENT '政策类型标签',
  PRIMARY KEY (`id`),
  UNIQUE KEY `company_full_name_idx` (`full_name`),
  UNIQUE KEY `company_short_name_idx` (`short_name`),
  KEY `company_board_id_idx` (`board_id`),
  KEY `company_city_idx` (`city`),
  KEY `company_province_idx` (`province`)
)

Failing SQL Statements

Example 1: EXISTS Subquery with 2-Level Depth

SELECT * FROM company c1 WHERE EXISTS (
    SELECT 1 FROM company c2 WHERE EXISTS (
        SELECT 1 FROM company c3 WHERE c3.id = c2.id AND c2.id = c1.id
    )
) LIMIT 10;

Error: ERROR 20102 (HY000): correlated columns in EXISTS subquery deeper than 1 level will be supported in future version is not yet implemented

Issue: The EXISTS subquery has 2 levels of correlation:

  • Level 1: c2.id = c1.id (c2 correlated to c1)
  • Level 2: c3.id = c2.id (c3 correlated to c2, which is already correlated to c1)

Example 2: IN Subquery with 2-Level Depth

SELECT * FROM company c1 WHERE c1.id IN (
    SELECT c2.id FROM company c2 WHERE c2.id IN (
        SELECT c3.id FROM company c3 WHERE c3.id = c2.id AND c2.id = c1.id
    )
) LIMIT 10;

Error: ERROR 20102 (HY000): correlated columns in IN subquery deeper than 1 level will be supported in future version is not yet implemented

Issue: Similar to EXISTS, the IN subquery has 2 levels of correlation.

Expected Behavior

In MySQL, correlated subqueries with multiple levels of depth are fully supported. The query planner should be able to handle nested correlations and execute them correctly.

Example MySQL behavior:

-- MySQL: Fully supported
SELECT * FROM table1 t1 WHERE EXISTS (
    SELECT 1 FROM table2 t2 WHERE EXISTS (
        SELECT 1 FROM table3 t3 WHERE t3.id = t2.id AND t2.id = t1.id
    )
);

Impact

  • MySQL Compatibility: Multi-level correlated subqueries are a standard MySQL feature
  • Migration: Applications using deep correlated subqueries will fail when migrating to MatrixOne
  • Query Flexibility: Users cannot write complex queries with nested correlations
  • Analytical Queries: Complex analytical queries requiring multi-level correlations are not possible

Use Cases

Multi-level correlated subqueries are commonly used for:

  1. Complex Filtering: Filtering based on conditions that require multiple levels of correlation
  2. Hierarchical Queries: Querying hierarchical data with multiple levels of relationships
  3. Analytical Queries: Complex analytical queries that require nested correlations
  4. Data Validation: Validating data across multiple related tables

Current Limitation

MatrixOne currently supports only 1 level of correlation depth:

  • ✅ Supported: SELECT * FROM t1 WHERE EXISTS (SELECT 1 FROM t2 WHERE t2.id = t1.id)
  • ❌ Not Supported: SELECT * FROM t1 WHERE EXISTS (SELECT 1 FROM t2 WHERE EXISTS (SELECT 1 FROM t3 WHERE t3.id = t2.id AND t2.id = t1.id))

Test Context

This issue was discovered during boundary testing using the industry_radar_test database. The test cases attempted to use 2-level deep correlated subqueries, which should be valid MySQL syntax.

Related Issues

This is related to but distinct from:

This issue specifically focuses on the depth limitation of correlated subqueries in WHERE clauses (IN and EXISTS).

Suggested Fix

  1. Increase Depth Limit: Support at least 2-3 levels of correlation depth (ideally unlimited, limited by system resources)
  2. Query Planner: Update the query planner to handle multi-level correlations
  3. Execution Engine: Implement execution engine support for nested correlated subqueries
  4. Performance Optimization: Optimize execution of deep correlated subqueries
  5. Error Messages: If depth limits are necessary, provide clear error messages indicating:
    • Current depth limit
    • Requested depth
    • Suggestions for rewriting the query
  6. Documentation: Document the correlation depth limit and provide examples of supported patterns

Workaround

As a temporary workaround, users can:

  1. Flatten Queries: Rewrite multi-level correlations as JOINs or CTEs
  2. Application-Level Processing: Use application-level logic to handle multi-level correlations
  3. Temporary Tables: Use temporary tables to break down the correlation levels

However, these workarounds:

  • May require significant query rewriting
  • May be less efficient
  • May not be equivalent in all cases

Related MySQL Documentation

Metadata

Metadata

Assignees

Labels

kind/featurepriority/p1Medium priority feature that should be implemented in this version
No fields configured for Feature.

Projects

No projects

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions