Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add stacker and recursive #13310

Merged
merged 8 commits into from
Nov 11, 2024
Merged

Conversation

peter-toth
Copy link
Contributor

@peter-toth peter-toth commented Nov 8, 2024

Which issue does this PR close?

Closes #9373.

Rationale for this change

This PR utilizes stacker and recursive crates to fix all recursion related issues that came up with the example given in #9373.

stacker seems like a very mature stack growth library used in many projects. It is owned by the rust-lang team and even the Rust compiler itself uses it: https://github.com/rust-lang/rust/blob/master/compiler/rustc_data_structures/src/stack.rs#L14-L22.

recursive is just a convenience layer over stacker to allow using it even simpler, seems to be created by the Polars guys.

What changes are included in this PR?

This PR adds stacker and recursive as new dependencies and mark recursive SQL to logical plan conversion functions, TreeNode recursive APIs and some recursive custom optimizer rules with #[recursive].

Are these changes tested?

Yes, with existing UTs and with the example provided in #9373.

Are there any user-facing changes?

No.

@github-actions github-actions bot added sql SQL Planner logical-expr Logical plan and expressions optimizer Optimizer rules common Related to common crate labels Nov 8, 2024
@blaginin
Copy link
Contributor

blaginin commented Nov 8, 2024

Do you also want to add a test with some large tree?

@peter-toth
Copy link
Contributor Author

Cherry-picked the test from your #13177 if you don't mind: c42668b

@@ -59,7 +59,12 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
self.select_into(plan, select_into)
}
other => {
// TODO: check why set_expr_to_plan or the functions it calls need bigger
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@peter-toth lets create a ticket instead of todo so someone from community can pick this up

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK think this is something we want to resolve within this PR, if this approach is going to be taken

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather create a separate ticket for this investigation as the stack size requirement of the call tree below this point is not because of this PR.

Actually, the stak size requirement might be completely normal. Or it can also be a sign of an oportunity for improvement.

I will be offline for a few days, but will raise a ticket once I'm back.

Copy link
Contributor Author

@peter-toth peter-toth Nov 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following is the output of lldb's thread backtrace when stack overflow happens without increasing the default min stack size (128KB). I logged the stack pointer and added an sp diff colum before the frames to track how the pointer changes from frame to frame.

Since we annotated set_expr_to_plan with [#recursive] the last check for the remaining stack size >128KB is at frame #45. It seems the frames above that point can easily occupy more than 128KB in debug builds so I think I will just remove the todo.

sp diff	| * thread #1, name = 'main', queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=2, address=0x16f607f98)
    176	| * frame#0: sp=0x000000016f607f90 fp=0x000000016f608030 pc=0x000000010261d8d8 datafusion-cli`core::slice::raw::from_raw_parts::precondition_check::h057607e67305928b(data=<unavailable>, size=0, align=0, len=0) + 12 at ub_checks.rs:68
      0	| frame#1: sp=0x000000016f608040 fp=0x000000016f608120 pc=0x000000010254db90 datafusion-cli`alloc::string::String::push_str::haafe1885bd37d5ca [inlined] core::slice::raw::from_raw_parts::h165a5a78babcf312(data="1\xb0\x81`o\U00000001", len=1) + 12 at ub_checks.rs:77
      0	| frame#2: sp=0x000000016f608040 fp=0x000000016f608120 pc=0x000000010254db84 datafusion-cli`alloc::string::String::push_str::haafe1885bd37d5ca [inlined] core::slice::iter::Iter$LT$T$GT$::make_slice::haadde91dd12d62c0(self=0x000000016f608058) + 72 at macros.rs:92
      0	| frame#3: sp=0x000000016f608040 fp=0x000000016f608120 pc=0x000000010254db3c datafusion-cli`alloc::string::String::push_str::haafe1885bd37d5ca [inlined] core::slice::iter::Iter$LT$T$GT$::as_slice::h038880e4058bfb84(self=0x000000016f608058) at iter.rs:130
      0	| frame#4: sp=0x000000016f608040 fp=0x000000016f608120 pc=0x000000010254db3c datafusion-cli`alloc::string::String::push_str::haafe1885bd37d5ca [inlined] _$LT$alloc..vec..Vec$LT$T$C$A$GT$$u20$as$u20$alloc..vec..spec_extend..SpecExtend$LT$$RF$T$C$core..slice..iter..Iter$LT$T$GT$$GT$$GT$::spec_extend::h1c2c12b2d75ecea7(self=0x000000016f611cf0, iterator=Iter<u8> @ 0x000000016f608058) + 16 at spec_extend.rs:54
      0	| frame#5: sp=0x000000016f608040 fp=0x000000016f608120 pc=0x000000010254db2c datafusion-cli`alloc::string::String::push_str::haafe1885bd37d5ca [inlined] alloc::vec::Vec$LT$T$C$A$GT$::extend_from_slice::h40d95025d401d543(self=0x000000016f611cf0, other=(data_ptr = "1\xb0\x81`o\U00000001", length = 1)) + 40 at mod.rs:2606
    240	| frame#6: sp=0x000000016f608040 fp=0x000000016f608120 pc=0x000000010254db04 datafusion-cli`alloc::string::String::push_str::haafe1885bd37d5ca(self=0x000000016f611cf0, string=(data_ptr = "1\xb0\x81`o\U00000001", length = 1)) + 44 at string.rs:1064
     48	| frame#7: sp=0x000000016f608130 fp=0x000000016f608150 pc=0x000000010254d8c0 datafusion-cli`_$LT$alloc..string..String$u20$as$u20$core..fmt..Write$GT$::write_str::h7c4071c716ca3e29(self=0x000000016f611cf0, s=(data_ptr = "1\xb0\x81`o\U00000001", length = 1)) + 36 at string.rs:2964
      0	| frame#8: sp=0x000000016f608160 fp=0x000000016f608190 pc=0x0000000104cb3784 datafusion-cli`core::fmt::num::imp::_$LT$impl$u20$core..fmt..Display$u20$for$u20$i64$GT$::fmt::h9da92bf6deec5644 [inlined] core::fmt::num::imp::fmt_u64::hc248a1036ea7ac54 + 320 at num.rs:275 [opt]
     64	| frame#9: sp=0x000000016f608160 fp=0x000000016f608190 pc=0x0000000104cb3660 datafusion-cli`core::fmt::num::imp::_$LT$impl$u20$core..fmt..Display$u20$for$u20$i64$GT$::fmt::h9da92bf6deec5644 + 28 at num.rs:321 [opt]
     32	| frame#10: sp=0x000000016f6081a0 fp=0x000000016f6081b0 pc=0x0000000104a178f0 datafusion-cli`_$LT$$RF$T$u20$as$u20$core..fmt..Display$GT$::fmt::h830a9036cd96f312(self=0x000000016f608c88, f=0x000000016f6081c0) + 36 at mod.rs:2382
      0	| frame#11: sp=0x000000016f6081c0 fp=0x000000016f608230 pc=0x0000000104cab294 datafusion-cli`core::fmt::write::ha36a8060c13608ea [inlined] core::fmt::rt::Argument::fmt::h54965d9b1ee82264 + 400 at rt.rs:177 [opt]
    128	| frame#12: sp=0x000000016f6081c0 fp=0x000000016f608230 pc=0x0000000104cab288 datafusion-cli`core::fmt::write::ha36a8060c13608ea + 388 at mod.rs:1178 [opt]
    240	| frame#13: sp=0x000000016f608240 fp=0x000000016f608320 pc=0x0000000102835a7c datafusion-cli`core::fmt::Formatter::write_fmt::h187d7d278b868506(self=0x000000016f60a910, fmt=Arguments @ 0x000000016f608c90) + 356 at mod.rs:1658
   9664	| frame#14: sp=0x000000016f608330 fp=0x000000016f60a8e0 pc=0x00000001028af358 datafusion-cli`_$LT$datafusion_common..scalar..ScalarValue$u20$as$u20$core..fmt..Display$GT$::fmt::h4bd1df453e8d7ae1(self=0x000003a1a32c0410, f=0x000000016f60a910) + 4360 at mod.rs:3498
     32	| frame#15: sp=0x000000016f60a8f0 fp=0x000000016f60a900 pc=0x00000001027d3750 datafusion-cli`_$LT$$RF$T$u20$as$u20$core..fmt..Display$GT$::fmt::h205dc9bc7954f62a(self=0x000000016f60ad20, f=0x000000016f60a910) + 36 at mod.rs:2382
      0	| frame#16: sp=0x000000016f60a910 fp=0x000000016f60a980 pc=0x0000000104cab294 datafusion-cli`core::fmt::write::ha36a8060c13608ea [inlined] core::fmt::rt::Argument::fmt::h54965d9b1ee82264 + 400 at rt.rs:177 [opt]
    128	| frame#17: sp=0x000000016f60a910 fp=0x000000016f60a980 pc=0x0000000104cab288 datafusion-cli`core::fmt::write::ha36a8060c13608ea + 388 at mod.rs:1178 [opt]
    240	| frame#18: sp=0x000000016f60a990 fp=0x000000016f60aa70 pc=0x0000000102835a7c datafusion-cli`core::fmt::Formatter::write_fmt::h187d7d278b868506(self=0x000000016f60cb00, fmt=Arguments @ 0x000000016f60b000) + 356 at mod.rs:1658
   8288	| frame#19: sp=0x000000016f60aa80 fp=0x000000016f60cad0 pc=0x00000001028b1be8 datafusion-cli`_$LT$datafusion_common..scalar..ScalarValue$u20$as$u20$core..fmt..Debug$GT$::fmt::ha681602e5522374d(self=0x000003a1a32c0410, f=0x000000016f60cb00) + 1268 at mod.rs:3694
     32	| frame#20: sp=0x000000016f60cae0 fp=0x000000016f60caf0 pc=0x00000001027d36f0 datafusion-cli`_$LT$$RF$T$u20$as$u20$core..fmt..Debug$GT$::fmt::h3b2bdc95473e84a2(self=0x000000016f60d238, f=0x000000016f60cb00) + 36 at mod.rs:2382
      0	| frame#21: sp=0x000000016f60cb00 fp=0x000000016f60cb70 pc=0x0000000104cab294 datafusion-cli`core::fmt::write::ha36a8060c13608ea [inlined] core::fmt::rt::Argument::fmt::h54965d9b1ee82264 + 400 at rt.rs:177 [opt]
    128	| frame#22: sp=0x000000016f60cb00 fp=0x000000016f60cb70 pc=0x0000000104cab288 datafusion-cli`core::fmt::write::ha36a8060c13608ea + 388 at mod.rs:1178 [opt]
    240	| frame#23: sp=0x000000016f60cb80 fp=0x000000016f60cc60 pc=0x00000001024c6984 datafusion-cli`core::fmt::Formatter::write_fmt::h45a14b248602b930(self=0x000000016f60f440, fmt=Arguments @ 0x000000016f60d240) + 356 at mod.rs:1658
  10160	| frame#24: sp=0x000000016f60cc70 fp=0x000000016f60f410 pc=0x00000001025defd4 datafusion-cli`_$LT$datafusion_expr..expr..Expr$u20$as$u20$core..fmt..Display$GT$::fmt::h38fc982a50103db7(self=0x000003a1a32c0400, f=0x000000016f60f440) + 796 at expr.rs:2162
     32	| frame#25: sp=0x000000016f60f420 fp=0x000000016f60f430 pc=0x00000001024e6ee0 datafusion-cli`_$LT$$RF$T$u20$as$u20$core..fmt..Display$GT$::fmt::h2a1140c1608d8fd7(self=0x000000016f611df0, f=0x000000016f60f440) + 36 at mod.rs:2382
      0	| frame#26: sp=0x000000016f60f440 fp=0x000000016f60f4b0 pc=0x0000000104cab294 datafusion-cli`core::fmt::write::ha36a8060c13608ea [inlined] core::fmt::rt::Argument::fmt::h54965d9b1ee82264 + 400 at rt.rs:177 [opt]
    128	| frame#27: sp=0x000000016f60f440 fp=0x000000016f60f4b0 pc=0x0000000104cab288 datafusion-cli`core::fmt::write::ha36a8060c13608ea + 388 at mod.rs:1178 [opt]
    240	| frame#28: sp=0x000000016f60f4c0 fp=0x000000016f60f5a0 pc=0x00000001024c6984 datafusion-cli`core::fmt::Formatter::write_fmt::h45a14b248602b930(self=0x000000016f611d10, fmt=Arguments @ 0x000000016f60f970) + 356 at mod.rs:1658
  10032	| frame#29: sp=0x000000016f60f5b0 fp=0x000000016f611cd0 pc=0x00000001025db9cc datafusion-cli`_$LT$datafusion_expr..expr..SchemaDisplay$u20$as$u20$core..fmt..Display$GT$::fmt::h8c94e22a27138c7a(self=0x000000016f611df0, f=0x000000016f611d10) + 476 at expr.rs:1865
    192	| frame#30: sp=0x000000016f611ce0 fp=0x000000016f611d90 pc=0x00000001024e70d4 datafusion-cli`_$LT$T$u20$as$u20$alloc..string..ToString$GT$::to_string::h44a0f711b8fbc503(self=0x000000016f611df0) + 164 at string.rs:2565
   1088	| frame#31: sp=0x000000016f611da0 fp=0x000000016f6121d0 pc=0x000000010268f464 datafusion-cli`datafusion_expr::logical_plan::builder::validate_unique_names::_$u7b$$u7b$closure$u7d$$u7d$::h951842ef63760bfd((null)=(__0 = 0, __1 = 0x000003a1a32c0400)) + 96 at builder.rs:1491
     48	| frame#32: sp=0x000000016f6121e0 fp=0x000000016f612200 pc=0x0000000102696dc4 datafusion-cli`core::iter::traits::iterator::Iterator::try_for_each::call::_$u7b$$u7b$closure$u7d$$u7d$::heda250ffac751536((null)=<unavailable>, x=(__0 = 0, __1 = 0x000003a1a32c0400)) + 36 at iterator.rs:2464
    176	| frame#33: sp=0x000000016f612210 fp=0x000000016f6122b0 pc=0x0000000102670a90 datafusion-cli`_$LT$core..iter..adapters..enumerate..Enumerate$LT$I$GT$$u20$as$u20$core..iter..traits..iterator..Iterator$GT$::try_fold::enumerate::_$u7b$$u7b$closure$u7d$$u7d$::hf5d8a6bac2a3a904(acc=<unavailable>, item=0x000003a1a32c0400) + 56 at enumerate.rs:86
    416	| frame#34: sp=0x000000016f6122c0 fp=0x000000016f612450 pc=0x0000000102675470 datafusion-cli`core::iter::traits::iterator::Iterator::try_fold::h4e5a10f8b59e3a99(self=0x000000016f612550, init=<unavailable>, f={closure_env#0}<&datafusion_expr::expr::Expr, (), core::result::Result<(), datafusion_common::error::DataFusionError>, core::iter::traits::iterator::Iterator::try_for_each::call::{closure_env#0}<(usize, &datafusion_expr::expr::Expr), core::result::Result<(), datafusion_common::error::DataFusionError>, datafusion_expr::logical_plan::builder::validate_unique_names::{closure_env#0}<core::slice::iter::Iter<datafusion_expr::expr::Expr>>>> @ 0x000000016f612460) + 180 at iterator.rs:2405
     80	| frame#35: sp=0x000000016f612460 fp=0x000000016f6124a0 pc=0x000000010267048c datafusion-cli`_$LT$core..iter..adapters..enumerate..Enumerate$LT$I$GT$$u20$as$u20$core..iter..traits..iterator..Iterator$GT$::try_fold::hecc3581db0e4cd4e(self=0x000000016f612550, init=<unavailable>, fold={closure_env#0}<(usize, &datafusion_expr::expr::Expr), core::result::Result<(), datafusion_common::error::DataFusionError>, datafusion_expr::logical_plan::builder::validate_unique_names::{closure_env#0}<core::slice::iter::Iter<datafusion_expr::expr::Expr>>> @ 0x000000016f612488) + 72 at enumerate.rs:92
     48	| frame#36: sp=0x000000016f6124b0 fp=0x000000016f6124d0 pc=0x0000000102670c40 datafusion-cli`core::iter::traits::iterator::Iterator::try_for_each::he1035b27454235b1(self=0x000000016f612550, f={closure_env#0}<core::slice::iter::Iter<datafusion_expr::expr::Expr>> @ 0x000000016f6124c0) + 40 at iterator.rs:2467
    192	| frame#37: sp=0x000000016f6124e0 fp=0x000000016f612590 pc=0x000000010268f3a4 datafusion-cli`datafusion_expr::logical_plan::builder::validate_unique_names::hc158913cb1e3cacd(node_name=(data_ptr = "Projections./Users/ptoth/git/apache/datafusion/datafusion/expr/src/expr_fn.rs", length = 11), expressions=Iter<datafusion_expr::expr::Expr> @ 0x000000016f612570) + 188 at builder.rs:1490
   3936	| frame#38: sp=0x000000016f6125a0 fp=0x000000016f6134f0 pc=0x000000010268ff38 datafusion-cli`datafusion_expr::logical_plan::builder::project::h8b9a4e9e7f2e265c(plan=LogicalPlan @ 0x000000016f6136c0, expr=Vec<datafusion_expr::expr::Expr, alloc::alloc::Global> @ 0x000000016f613870) + 452 at builder.rs:1559
    976	| frame#39: sp=0x000000016f613500 fp=0x000000016f6138c0 pc=0x000000010268bd84 datafusion-cli`datafusion_expr::logical_plan::builder::LogicalPlanBuilder::project::h5c044b7ca35b2600(self=LogicalPlanBuilder @ 0x000000016f613898, expr=Vec<datafusion_expr::expr::Expr, alloc::alloc::Global> @ 0x000000016f613cb0) + 124 at builder.rs:473
   1184	| frame#40: sp=0x000000016f6138d0 fp=0x000000016f613d60 pc=0x0000000100b2ad88 datafusion-cli`datafusion_sql::select::_$LT$impl$u20$datafusion_sql..planner..SqlToRel$LT$S$GT$$GT$::project::h833e06409e028d78(self=0x000000016fd8ef58, input=LogicalPlan @ 0x000000016f616420, expr=Vec<datafusion_expr::expr::Expr, alloc::alloc::Global> @ 0x000000016f6165d0) + 356 at select.rs:712
  36768	| frame#41: sp=0x000000016f613d70 fp=0x000000016f61cd00 pc=0x0000000100b22b40 datafusion-cli`datafusion_sql::select::_$LT$impl$u20$datafusion_sql..planner..SqlToRel$LT$S$GT$$GT$::select_to_plan::hdd3ab0fc4bc777b4(self=0x000000016fd8ef58, select=Select @ 0x000000016f61e458, order_by=Vec<sqlparser::ast::query::OrderByExpr, alloc::alloc::Global> @ 0x000000016f61ec80, planner_context=0x000000016f61ce60) + 3708 at select.rs:92
  21392	| frame#42: sp=0x000000016f61cd10 fp=0x000000016f622090 pc=0x0000000100b1f6b8 datafusion-cli`datafusion_sql::query::_$LT$impl$u20$datafusion_sql..planner..SqlToRel$LT$S$GT$$GT$::query_to_plan::h7c93dc84a74431bf(self=0x000000016fd8ef58, query=Query @ 0x000000016f625748, outer_planner_context=0x000000016fd5ec70) + 1104 at query.rs:55
  15808	| frame#43: sp=0x000000016f6220a0 fp=0x000000016f625e50 pc=0x0000000100b384b0 datafusion-cli`datafusion_sql::set_expr::_$LT$impl$u20$datafusion_sql..planner..SqlToRel$LT$S$GT$$GT$::set_expr_to_plan::_$u7b$$u7b$closure$u7d$$u7d$::h7c1764edb0b43dd1 + 892 at set_expr.rs:44
      0	| frame#44: sp=0x000000016f625e60 fp=0x000000016f6288f0 pc=0x0000000100b34954 datafusion-cli`datafusion_sql::set_expr::_$LT$impl$u20$datafusion_sql..planner..SqlToRel$LT$S$GT$$GT$::set_expr_to_plan::h807844852e1e805e + 232 at lib.rs:55
  
since set_expr_to_plan is recursive, last remaining stack >128KB check happens here
  
  10912	| frame#45: sp=0x000000016f625e60 fp=0x000000016f6288f0 pc=0x0000000100b3486c datafusion-cli`datafusion_sql::set_expr::_$LT$impl$u20$datafusion_sql..planner..SqlToRel$LT$S$GT$$GT$::set_expr_to_plan::h807844852e1e805e(self=0x000000016fd8ef58, set_expr=SetExpr @ 0x000000016f62ac68, planner_context=0x000000016fd5ec70) + 212 at set_expr.rs:25
  15808	| frame#46: sp=0x000000016f628900 fp=0x000000016f62c6b0 pc=0x0000000100b3866c datafusion-cli`datafusion_sql::set_expr::_$LT$impl$u20$datafusion_sql..planner..SqlToRel$LT$S$GT$$GT$::set_expr_to_plan::_$u7b$$u7b$closure$u7d$$u7d$::h7c1764edb0b43dd1 + 1336 at set_expr.rs:41
      0	| frame#47: sp=0x000000016f62c6c0 fp=0x000000016f62f150 pc=0x0000000100b34954 datafusion-cli`datafusion_sql::set_expr::_$LT$impl$u20$datafusion_sql..planner..SqlToRel$LT$S$GT$$GT$::set_expr_to_plan::h807844852e1e805e + 232 at lib.rs:55
  10912	| frame#48: sp=0x000000016f62c6c0 fp=0x000000016f62f150 pc=0x0000000100b3486c datafusion-cli`datafusion_sql::set_expr::_$LT$impl$u20$datafusion_sql..planner..SqlToRel$LT$S$GT$$GT$::set_expr_to_plan::h807844852e1e805e(self=0x000000016fd8ef58, set_expr=SetExpr @ 0x000000016f62ffd8, planner_context=0x000000016fd5ec70) + 212 at set_expr.rs:25
  15808	| frame#49: sp=0x000000016f62f160 fp=0x000000016f632f10 pc=0x0000000100b38534 datafusion-cli`datafusion_sql::set_expr::_$LT$impl$u20$datafusion_sql..planner..SqlToRel$LT$S$GT$$GT$::set_expr_to_plan::_$u7b$$u7b$closure$u7d$$u7d$::h7c1764edb0b43dd1 + 1024 at set_expr.rs:40
      0	| frame#50: sp=0x000000016f632f20 fp=0x000000016f6359b0 pc=0x0000000100b34954 datafusion-cli`datafusion_sql::set_expr::_$LT$impl$u20$datafusion_sql..planner..SqlToRel$LT$S$GT$$GT$::set_expr_to_plan::h807844852e1e805e + 232 at lib.rs:55
  10912	| frame#51: sp=0x000000016f632f20 fp=0x000000016f6359b0 pc=0x0000000100b3486c datafusion-cli`datafusion_sql::set_expr::_$LT$impl$u20$datafusion_sql..planner..SqlToRel$LT$S$GT$$GT$::set_expr_to_plan::h807844852e1e805e(self=0x000000016fd8ef58, set_expr=SetExpr @ 0x000000016f636838, planner_context=0x000000016fd5ec70) + 212 at set_expr.rs:25
  15808	| frame#52: sp=0x000000016f6359c0 fp=0x000000016f639770 pc=0x0000000100b38534 datafusion-cli`datafusion_sql::set_expr::_$LT$impl$u20$datafusion_sql..planner..SqlToRel$LT$S$GT$$GT$::set_expr_to_plan::_$u7b$$u7b$closure$u7d$$u7d$::h7c1764edb0b43dd1 + 1024 at set_expr.rs:40

... many repeating frames ...

  15808	| frame#886: sp=0x000000016fd4b200 fp=0x000000016fd4efb0 pc=0x0000000100b38534 datafusion-cli`datafusion_sql::set_expr::_$LT$impl$u20$datafusion_sql..planner..SqlToRel$LT$S$GT$$GT$::set_expr_to_plan::_$u7b$$u7b$closure$u7d$$u7d$::h7c1764edb0b43dd1 + 1024 at set_expr.rs:40
      0	| frame#887: sp=0x000000016fd4efc0 fp=0x000000016fd51a50 pc=0x0000000100b34954 datafusion-cli`datafusion_sql::set_expr::_$LT$impl$u20$datafusion_sql..planner..SqlToRel$LT$S$GT$$GT$::set_expr_to_plan::h807844852e1e805e + 232 at lib.rs:55
  10912	| frame#888: sp=0x000000016fd4efc0 fp=0x000000016fd51a50 pc=0x0000000100b3486c datafusion-cli`datafusion_sql::set_expr::_$LT$impl$u20$datafusion_sql..planner..SqlToRel$LT$S$GT$$GT$::set_expr_to_plan::h807844852e1e805e(self=0x000000016fd8ef58, set_expr=SetExpr @ 0x000000016fd528d8, planner_context=0x000000016fd5ec70) + 212 at set_expr.rs:25
  15808	| frame#889: sp=0x000000016fd51a60 fp=0x000000016fd55810 pc=0x0000000100b38534 datafusion-cli`datafusion_sql::set_expr::_$LT$impl$u20$datafusion_sql..planner..SqlToRel$LT$S$GT$$GT$::set_expr_to_plan::_$u7b$$u7b$closure$u7d$$u7d$::h7c1764edb0b43dd1 + 1024 at set_expr.rs:40
      0	| frame#890: sp=0x000000016fd55820 fp=0x000000016fd582b0 pc=0x0000000100b34954 datafusion-cli`datafusion_sql::set_expr::_$LT$impl$u20$datafusion_sql..planner..SqlToRel$LT$S$GT$$GT$::set_expr_to_plan::h807844852e1e805e + 232 at lib.rs:55
  10912	| frame#891: sp=0x000000016fd55820 fp=0x000000016fd582b0 pc=0x0000000100b3486c datafusion-cli`datafusion_sql::set_expr::_$LT$impl$u20$datafusion_sql..planner..SqlToRel$LT$S$GT$$GT$::set_expr_to_plan::h807844852e1e805e(self=0x000000016fd8ef58, set_expr=SetExpr @ 0x000000016fd59138, planner_context=0x000000016fd5ec70) + 212 at set_expr.rs:25
  15808	| frame#892: sp=0x000000016fd582c0 fp=0x000000016fd5c070 pc=0x0000000100b38534 datafusion-cli`datafusion_sql::set_expr::_$LT$impl$u20$datafusion_sql..planner..SqlToRel$LT$S$GT$$GT$::set_expr_to_plan::_$u7b$$u7b$closure$u7d$$u7d$::h7c1764edb0b43dd1 + 1024 at set_expr.rs:40
      0	| frame#893: sp=0x000000016fd5c080 fp=0x000000016fd5eb10 pc=0x0000000100b34954 datafusion-cli`datafusion_sql::set_expr::_$LT$impl$u20$datafusion_sql..planner..SqlToRel$LT$S$GT$$GT$::set_expr_to_plan::h807844852e1e805e + 232 at lib.rs:55
  10912	| frame#894: sp=0x000000016fd5c080 fp=0x000000016fd5eb10 pc=0x0000000100b3486c datafusion-cli`datafusion_sql::set_expr::_$LT$impl$u20$datafusion_sql..planner..SqlToRel$LT$S$GT$$GT$::set_expr_to_plan::h807844852e1e805e(self=0x000000016fd8ef58, set_expr=SetExpr @ 0x000000016fd619a0, planner_context=0x000000016fd5ec70) + 212 at set_expr.rs:25
  21392	| frame#895: sp=0x000000016fd5eb20 fp=0x000000016fd63ea0 pc=0x0000000100b1f514 datafusion-cli`datafusion_sql::query::_$LT$impl$u20$datafusion_sql..planner..SqlToRel$LT$S$GT$$GT$::query_to_plan::h7c93dc84a74431bf(self=0x000000016fd8ef58, query=Query @ 0x000000016fd658f8, outer_planner_context=0x000000016fd8d128) + 684 at query.rs:63
 165024	| frame#896: sp=0x000000016fd63eb0 fp=0x000000016fd8c340 pc=0x0000000100037e18 datafusion-cli`datafusion_sql::statement::_$LT$impl$u20$datafusion_sql..planner..SqlToRel$LT$S$GT$$GT$::sql_statement_to_plan_with_context_impl::hf1ca76c0cac43a21(self=0x000000016fd8ef58, statement=Statement @ 0x000000016fd8c360, planner_context=0x000000016fd8d128) + 6436 at statement.rs:216
   3680	| frame#897: sp=0x000000016fd8c350 fp=0x000000016fd8d1a0 pc=0x00000001000345b8 datafusion-cli`datafusion_sql::statement::_$LT$impl$u20$datafusion_sql..planner..SqlToRel$LT$S$GT$$GT$::sql_statement_to_plan::hfc0ac332eae64477(self=0x000000016fd8ef58, statement=<unavailable>) + 108 at statement.rs:179
   6416	| frame#898: sp=0x000000016fd8d1b0 fp=0x000000016fd8eab0 pc=0x0000000100031270 datafusion-cli`datafusion_sql::statement::_$LT$impl$u20$datafusion_sql..planner..SqlToRel$LT$S$GT$$GT$::statement_to_plan::h6c23c83262fd9dd5(self=0x000000016fd8ef58, statement=Statement @ 0x000000016fd8ef70) + 356 at statement.rs:167
   3024	| frame#899: sp=0x000000016fd8eac0 fp=0x000000016fd8f680 pc=0x00000001000978fc datafusion-cli`datafusion::execution::session_state::SessionState::statement_to_plan::_$u7b$$u7b$closure$u7d$$u7d$::h52a740fadd29b345((null)=0x000000016fdec970) + 1988 at session_state.rs:554
  57280	| frame#900: sp=0x000000016fd8f690 fp=0x000000016fd9d640 pc=0x000000010001ffd0 datafusion-cli`datafusion_cli::exec::create_plan::_$u7b$$u7b$closure$u7d$$u7d$::h7478b4c5b2d38449((null)=0x000000016fdec970) + 668 at exec.rs:309
  38912	| frame#901: sp=0x000000016fd9d650 fp=0x000000016fda6e40 pc=0x0000000100021270 datafusion-cli`datafusion_cli::exec::exec_and_print::_$u7b$$u7b$closure$u7d$$u7d$::h3435e8173d4260ec((null)=0x000000016fdec970) + 3040 at exec.rs:231
  66256	| frame#902: sp=0x000000016fda6e50 fp=0x000000016fdb7110 pc=0x00000001000248a4 datafusion-cli`datafusion_cli::exec::exec_from_lines::_$u7b$$u7b$closure$u7d$$u7d$::h9fab5c9bc60b58c4((null)=0x000000016fdec970) + 660 at exec.rs:83
  33568	| frame#903: sp=0x000000016fdb7120 fp=0x000000016fdbf430 pc=0x0000000100024324 datafusion-cli`datafusion_cli::exec::exec_from_files::_$u7b$$u7b$closure$u7d$$u7d$::h85cc58b94f18332d((null)=0x000000016fdec970) + 532 at exec.rs:119
 148080	| frame#904: sp=0x000000016fdbf440 fp=0x000000016fde36a0 pc=0x000000010009d684 datafusion-cli`datafusion_cli::main_inner::_$u7b$$u7b$closure$u7d$$u7d$::hd6280e78fe966330((null)=0x000000016fdec970) + 5780 at main.rs:224
  37280	| frame#905: sp=0x000000016fde36b0 fp=0x000000016fdec840 pc=0x000000010009e648 datafusion-cli`datafusion_cli::main::_$u7b$$u7b$closure$u7d$$u7d$::h50dcf540aa144e00((null)=0x000000016fdec970) + 388 at main.rs:131
     64	| frame#906: sp=0x000000016fdec850 fp=0x000000016fdec880 pc=0x000000010007a358 datafusion-cli`_$LT$core..pin..Pin$LT$P$GT$$u20$as$u20$core..future..future..Future$GT$::poll::h85c9e154d035acda(self=Pin<&mut core::pin::Pin<alloc::boxed::Box<datafusion_cli::main::{async_block_env#0}, alloc::alloc::Global>>> @ 0x000000016fdec860, cx=0x000000016fdec970) + 56 at future.rs:123
     48	| frame#907: sp=0x000000016fdec890 fp=0x000000016fdec8b0 pc=0x0000000100052c1c datafusion-cli`tokio::runtime::park::CachedParkThread::block_on::_$u7b$$u7b$closure$u7d$$u7d$::h6ca7e69d611b9986 + 48 at park.rs:281
      0	| frame#908: sp=0x000000016fdec8c0 fp=0x000000016fdeca40 pc=0x0000000100051f44 datafusion-cli`tokio::runtime::park::CachedParkThread::block_on::h3ccb9d248567b894 + 88 at coop.rs:107
      0	| frame#909: sp=0x000000016fdec8c0 fp=0x000000016fdeca40 pc=0x0000000100051eec datafusion-cli`tokio::runtime::park::CachedParkThread::block_on::h3ccb9d248567b894 [inlined] tokio::runtime::coop::budget::h540ddd3774674391(f={closure_env#0}<core::pin::Pin<alloc::boxed::Box<datafusion_cli::main::{async_block_env#0}, alloc::alloc::Global>>> @ 0x000000016fdec9e8) + 128 at coop.rs:73
    400	| frame#910: sp=0x000000016fdec8c0 fp=0x000000016fdeca40 pc=0x0000000100051e6c datafusion-cli`tokio::runtime::park::CachedParkThread::block_on::h3ccb9d248567b894(self=0x000000016fdeca76, f=(__pointer = 0x000003a1881d1000)) + 364 at park.rs:281
     80	| frame#911: sp=0x000000016fdeca50 fp=0x000000016fdeca90 pc=0x000000010006be98 datafusion-cli`tokio::runtime::context::blocking::BlockingRegionGuard::block_on::h62487a9f975fcbb3(self=0x000000016fdecb40, f=(__pointer = 0x000003a1881d1000)) + 84 at blocking.rs:66
     48	| frame#912: sp=0x000000016fdecaa0 fp=0x000000016fdecac0 pc=0x000000010009a700 datafusion-cli`tokio::runtime::scheduler::multi_thread::MultiThread::block_on::_$u7b$$u7b$closure$u7d$$u7d$::hd7af37b640c5af6c(blocking=0x000000016fdecb40) + 44 at mod.rs:87
    208	| frame#913: sp=0x000000016fdecad0 fp=0x000000016fdecb90 pc=0x00000001000516c0 datafusion-cli`tokio::runtime::context::runtime::enter_runtime::hdcf7df0044a46b81(handle=0x000000016fdfa458, allow_block_in_place=true, f={closure_env#0}<core::pin::Pin<alloc::boxed::Box<datafusion_cli::main::{async_block_env#0}, alloc::alloc::Global>>> @ 0x000000016fdecaf8) + 200 at runtime.rs:65
     48	| frame#914: sp=0x000000016fdecba0 fp=0x000000016fdecbc0 pc=0x000000010009a5a0 datafusion-cli`tokio::runtime::scheduler::multi_thread::MultiThread::block_on::h020b6ec45fbbc9b7(self=0x000000016fdfa430, handle=0x000000016fdfa458, future=(__pointer = 0x000003a1881d1000)) + 60 at mod.rs:86
    144	| frame#915: sp=0x000000016fdecbd0 fp=0x000000016fdecc50 pc=0x000000010000de78 datafusion-cli`tokio::runtime::runtime::Runtime::block_on_inner::hd4eb19af861902e6(self=0x000000016fdfa428, future=(__pointer = 0x000003a1881d1000), (null)=(_pd = core::marker::PhantomData<void *> @ 0x000000016fdecc2f)) + 176 at runtime.rs:370
  36864	| frame#916: sp=0x000000016fdecc60 fp=0x000000016fdf5c50 pc=0x000000010000e2d8 datafusion-cli`tokio::runtime::runtime::Runtime::block_on::hccce907ae0baf82f(self=0x000000016fdfa428, future={async_block_env#0} @ 0x000000016fdfa5b0) + 588 at runtime.rs:340
  37168	| frame#917: sp=0x000000016fdf5c60 fp=0x000000016fdfed80 pc=0x00000001000a1c6c datafusion-cli`datafusion_cli::main::h4aa3a390bc9647ff + 312 at main.rs:131
     32	| frame#918: sp=0x000000016fdfed90 fp=0x000000016fdfeda0 pc=0x0000000100088960 datafusion-cli`core::ops::function::FnOnce::call_once::h3a62445df178aa2f((null)=(datafusion-cli`datafusion_cli::main::h4aa3a390bc9647ff at main.rs:130), (null)=<unavailable>) + 20 at function.rs:250
     48	| frame#919: sp=0x000000016fdfedb0 fp=0x000000016fdfedd0 pc=0x000000010009886c datafusion-cli`std::sys::backtrace::__rust_begin_short_backtrace::h3260c992e158a8cf(f=(datafusion-cli`datafusion_cli::main::h4aa3a390bc9647ff at main.rs:130)) + 24 at backtrace.rs:154
     48	| frame#920: sp=0x000000016fdfede0 fp=0x000000016fdfee00 pc=0x000000010005885c datafusion-cli`std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::h39f28f7bd1d722f5 + 28 at rt.rs:164
      0	| frame#921: sp=0x000000016fdfee10 fp=0x000000016fdfeeb0 pc=0x0000000104c7fac4 datafusion-cli`std::rt::lang_start_internal::h9e88109c8deb8787 [inlined] core::ops::function::impls::_$LT$impl$u20$core..ops..function..FnOnce$LT$A$GT$$u20$for$u20$$RF$F$GT$::call_once::hf77a1752ba39c45f + 808 at function.rs:284 [opt]
      0	| frame#922: sp=0x000000016fdfee10 fp=0x000000016fdfeeb0 pc=0x0000000104c7fabc datafusion-cli`std::rt::lang_start_internal::h9e88109c8deb8787 [inlined] std::panicking::try::do_call::hf02556a6b145ecfc + 4 at panicking.rs:554 [opt]
      0	| frame#923: sp=0x000000016fdfee10 fp=0x000000016fdfeeb0 pc=0x0000000104c7fab8 datafusion-cli`std::rt::lang_start_internal::h9e88109c8deb8787 [inlined] std::panicking::try::h2bb23dba91be7e3b at panicking.rs:518 [opt]
      0	| frame#924: sp=0x000000016fdfee10 fp=0x000000016fdfeeb0 pc=0x0000000104c7fab8 datafusion-cli`std::rt::lang_start_internal::h9e88109c8deb8787 [inlined] std::panic::catch_unwind::h1844bc6507215052 at panic.rs:345 [opt]
      0	| frame#925: sp=0x000000016fdfee10 fp=0x000000016fdfeeb0 pc=0x0000000104c7fab8 datafusion-cli`std::rt::lang_start_internal::h9e88109c8deb8787 [inlined] std::rt::lang_start_internal::_$u7b$$u7b$closure$u7d$$u7d$::ha90e2c319598814e at rt.rs:143 [opt]
      0	| frame#926: sp=0x000000016fdfee10 fp=0x000000016fdfeeb0 pc=0x0000000104c7fab8 datafusion-cli`std::rt::lang_start_internal::h9e88109c8deb8787 [inlined] std::panicking::try::do_call::h7de69f625a47132a at panicking.rs:554 [opt]
      0	| frame#927: sp=0x000000016fdfee10 fp=0x000000016fdfeeb0 pc=0x0000000104c7fab8 datafusion-cli`std::rt::lang_start_internal::h9e88109c8deb8787 [inlined] std::panicking::try::h2198f44c68c232f7 at panicking.rs:518 [opt]
      0	| frame#928: sp=0x000000016fdfee10 fp=0x000000016fdfeeb0 pc=0x0000000104c7fab8 datafusion-cli`std::rt::lang_start_internal::h9e88109c8deb8787 [inlined] std::panic::catch_unwind::h40a34eeb64f44ac6 at panic.rs:345 [opt]
    176	| frame#929: sp=0x000000016fdfee10 fp=0x000000016fdfeeb0 pc=0x0000000104c7fab8 datafusion-cli`std::rt::lang_start_internal::h9e88109c8deb8787 + 796 at rt.rs:143 [opt]
     96	| frame#930: sp=0x000000016fdfeec0 fp=0x000000016fdfef10 pc=0x0000000100058828 datafusion-cli`std::rt::lang_start::h7d6b441f4e85b3ed(main=(datafusion-cli`datafusion_cli::main::h4aa3a390bc9647ff at main.rs:130), argc=3, argv=0x000000016fdff170, sigpipe='\0') + 84 at rt.rs:163
     16	| frame#931: sp=0x000000016fdfef20 fp=0x000000016fdfef20 pc=0x00000001000a1d18 datafusion-cli`main + 36

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the comment in 17319c2

Copy link
Contributor

@comphead comphead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @peter-toth it sounds interesting. Wondering if the compiler or runtime can optimize the code now knowing the method is recursive.

LIke https://www.scala-lang.org/api/3.1.1/scala/annotation/tailrec.html

@peter-toth
Copy link
Contributor Author

peter-toth commented Nov 8, 2024

Wondering if the compiler or runtime can optimize the code now knowing the method is recursive.

LIke https://www.scala-lang.org/api/3.1.1/scala/annotation/tailrec.html

Yeah, Scala provides tailrec optimization, but as far as I know it is missing from Rust.

The new crates and the "recursive" annotation just provide dynamic stack growth to avoid stack overflow. It doesn't mean that the functions will be tail call eliminated.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion we should do this (use stacker to avoid stack overflows).

We have spent quite a while in the past chasing down / fixing stack overflows and we still don't have it fixed in all cases. This solution feels like we can finally fix it for good.

Many of my concerns about using stacker are related to using what appears to be dark magic to avoid stack overflows, @peter-toth pointed out that it was used by the rust compiler which I also verified: #13177 (comment)

It would be nice to avoid the recursive (as much as I love @orlp 👋 ) in my opinion. Would it be possible to "vendor" the recursive macro (aka copy it into the datafusion repo) as a datafusion-proc-macross crate?

cc @jayzhan211 @Dandandan

@@ -168,6 +169,7 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {

/// Internal implementation. Use
/// [`Self::sql_expr_to_logical_expr`] to plan exprs.
#[recursive]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should be able to remove the note in sql_array_literal about stack overflow

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, removed in 5659520.

// TODO: check why set_expr_to_plan or the functions it calls need bigger
// minimum stack
// The functions called from `set_expr_to_plan()` need more than 128KB
// stack in debug builds as investigated in:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Contributor

@comphead comphead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm thanks @peter-toth

@comphead comphead merged commit bab02b6 into apache:main Nov 11, 2024
27 checks passed
jayzhan211 pushed a commit to jayzhan211/datafusion that referenced this pull request Nov 12, 2024
* Add stacker and recursive

* add test

* fine tune set_expr_to_plan min stack size

* format tomls

* add recursive annotation to more custom recursive rules

* fix comments

* fix comment about min stack requirement

---------

Co-authored-by: blaginin <dmitrii@blaginin.me>
@peter-toth
Copy link
Contributor Author

Thanks for the review @blaginin, @comphead, @alamb!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
common Related to common crate logical-expr Logical plan and expressions optimizer Optimizer rules sql SQL Planner
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Deeply recursive UNION queries cause stack overflow
4 participants