Skip to content

Switch to non-recursive on heap virtual stack when building logical plan from SQL expression#6360

Merged
alamb merged 5 commits intoapache:mainfrom
aprimadi:parse-expr-nonrecursive
May 18, 2023
Merged

Switch to non-recursive on heap virtual stack when building logical plan from SQL expression#6360
alamb merged 5 commits intoapache:mainfrom
aprimadi:parse-expr-nonrecursive

Conversation

@aprimadi
Copy link
Contributor

@aprimadi aprimadi commented May 16, 2023

Which issue does this PR close?

Closes #6040
Closes #1444
Closes #1434

Rationale for this change

Not sure why anyone would want more than 256 expressions but I thought I would write the iterative version just for fun. Now users can write all the expressions they want (or at least until the generated Expr can no longer fit in the stack).

Kudos to @alamb and @houqp for suggesting this iterative version.

What changes are included in this PR?

  • Iterative version of expr logical plan builder
  • Add tests verifying the number of expressions that can be handled is now larger

Are these changes tested?

Yes. Tested by existing tests. Also added a few stack overflow tests.

Are there any user-facing changes?

No

@github-actions github-actions bot added the sql SQL Planner label May 16, 2023
@aprimadi aprimadi marked this pull request as ready for review May 16, 2023 12:27
@aprimadi aprimadi changed the title Switch to non-recursive on heap virtual stack to parse expr Switch to non-recursive on heap virtual stack when building logical plan from SQL expression May 16, 2023
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much @aprimadi . This is really neat.

I guess the next level would be to thread the stack machine down to parsing all expressions (not just binary ops) which while would make the code more uniform I am not sure it would make it easier to understand

Not sure why anyone would want more than 256 expressions

LOL I think this kind of expression is generated by other programs / tools like BI frontends

};
}

test_stack_overflow!(64);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I verified that without the code in this PR, the tests fail as expected:

running 14 tests
test expr::arrow_cast::test::test_parse_data_type_whitespace_tolerance ... ok�(B
test expr::identifier::test::test_form_identifier ... ok�(B
test expr::identifier::test::test_generate_schema_search_terms ... ok�(B
test expr::arrow_cast::test::parse_data_type_errors ... ok�(B
test expr::arrow_cast::test::test_parse_data_type ... ok�(B
test parser::tests::create_external_table ... ok�(B
test expr::tests::test_stack_overflow_64 ... ok�(B
test expr::tests::test_stack_overflow_128 ... ok�(B
test expr::tests::test_stack_overflow_256 ... ok�(B
test expr::tests::test_stack_overflow_512 ... ok�(B

thread 'expr::tests::test_stack_overflow_1024' has overflowed its stack
fatal runtime error: stack overflow
error: test failed, to rerun pass `-p datafusion-sql --lib`

Caused by:
  process didn't exit successfully: `/Users/alamb/Software/target-df2/debug/deps/datafusion_sql-8a77caa8d397a04b` (signal: 6, SIGABRT: process abort signal)

aprimadi and others added 2 commits May 18, 2023 10:27
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
@alamb
Copy link
Contributor

alamb commented May 18, 2023

Thanks again @aprimadi

@alamb alamb merged commit 7c6b41a into apache:main May 18, 2023
@aprimadi
Copy link
Contributor Author

Thank you @alamb for the review

@aprimadi aprimadi deleted the parse-expr-nonrecursive branch May 19, 2023 05:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

sql SQL Planner

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Stack overflow while generating logical plan from statement Avoid stack overflow in Datafusion Query with 100 OR conditions overflows stack

2 participants