Join on multiple keys with Lazy DataFrames panics #17004

Open
2 tasks done
sdrap opened this issue Jun 17, 2024 · 2 comments
Labels
A-panic Area: code that results in panic exceptions bug Something isn't working needs triage Awaiting prioritization by a maintainer rust Related to Rust Polars

@sdrap
sdrap commented Jun 17, 2024

A puzzling panic when joining lazy DataFrames on multiple columns.

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

Joining two DataFrames eagerly on two keys works:

use polars::prelude::*; // brings df!, col, cols, JoinArgs and JoinType into scope

let df1 = df!("key1" => &["A", "B", "C"],
              "key2" => &["X", "Y", "Z"],
              "val1" => &[1, 2, 3])?;
let df2 = df!("key1" => &["A", "B", "D"],
              "key2" => &["X", "Y", "W"],
              "val2" => &[4, 5, 6])?;

// join on "key1" and "key2"
let joined_df = df1.join(
    &df2,
    &["key1", "key2"],
    &["key1", "key2"],
    JoinArgs::new(JoinType::Inner),
)?;
println!("{:?}", joined_df);

Lazy joining on one key works:

// join on "key1" only
let lazy1_joined_df = df1.clone().lazy().join(
    df2.clone().lazy(),
    [col("key1")],
    [col("key1")],
    JoinArgs::new(JoinType::Inner),
).collect()?;
println!("{:?}", lazy1_joined_df);

Lazy joining on two keys panics:

// join on "key1" and "key2"
let lazy2_joined_df = df1.clone().lazy().join(
    df2.clone().lazy(),
    [cols(["key1", "key2"])],
    [cols(["key1", "key2"])],
    JoinArgs::new(JoinType::Inner),
).collect()?;
println!("{:?}", lazy2_joined_df);

// The same panic occurs when passing a list of `col` expressions
let lazy_joined_df = df1
    .clone()
    .lazy()
    .join(
        df2.clone().lazy(),
        [col("key1"), col("key2")],
        [col("key1"), col("key2")],
        JoinArgs::new(JoinType::Inner),
    )
    .collect()?;

Log output

No response

Issue description

Joining two DataFrames on multiple keys works in eager (non-lazy) mode, but the same join on LazyFrames panics, whether the keys are given with `cols` or as a list of `col` expressions.

Expected behavior

I expect the same output from the eager and lazy joins, whether joining on one key or on several.
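
To make the expected rows concrete, here is a minimal std-only sketch of an inner join on the composite key (`key1`, `key2`), applied to the data from the example above. It is not how Polars implements joins; the function name `inner_join_two_keys` and the tuple layout are assumptions made purely for illustration.

```rust
use std::collections::HashMap;

/// Inner join on the composite key (key1, key2), using only the standard library.
/// Hypothetical helper for illustration; not part of Polars.
fn inner_join_two_keys<'a>(
    left: &[(&'a str, &'a str, i32)],
    right: &[(&'a str, &'a str, i32)],
) -> Vec<(&'a str, &'a str, i32, i32)> {
    // Index the right side by its composite key.
    let mut index: HashMap<(&str, &str), Vec<i32>> = HashMap::new();
    for &(k1, k2, v) in right {
        index.entry((k1, k2)).or_default().push(v);
    }

    // Probe the index with each left row; keep only rows whose key matches.
    let mut out = Vec::new();
    for &(k1, k2, v1) in left {
        if let Some(vals) = index.get(&(k1, k2)) {
            for &v2 in vals {
                out.push((k1, k2, v1, v2));
            }
        }
    }
    out
}

fn main() {
    // Same data as df1 and df2 in the reproducible example.
    let left = [("A", "X", 1), ("B", "Y", 2), ("C", "Z", 3)];
    let right = [("A", "X", 4), ("B", "Y", 5), ("D", "W", 6)];

    let joined = inner_join_two_keys(&left, &right);
    println!("{:?}", joined); // prints [("A", "X", 1, 4), ("B", "Y", 2, 5)]
}
```

Only the rows keyed (A, X) and (B, Y) appear on both sides, so those two rows are what both the eager and the lazy join should return.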

Installed versions

  • OS: Linux
  • rust: 1.79
  • polars: 0.40 with features ["lazy", "temporal", "describe", "json", "parquet", "dtype-datetime"]
@sdrap sdrap added bug Something isn't working needs triage Awaiting prioritization by a maintainer rust Related to Rust Polars labels Jun 17, 2024
@ritchie46
Member

What kind of panic do you get? Can you show the stack trace?

@sdrap
Author

sdrap commented Jun 17, 2024

With `RUST_BACKTRACE=1` I get this (for the first counterexample, the one using `cols`):

thread 'main' panicked at /home/sdrapeau/.cargo/registry/src/index.crates.io-6f17d22bba15001f/polars-plan-0.40.0/src/logical_plan/conversion/expr_to_ir.rs:367:33:
no `columns` expected at this point
stack backtrace:
   0: rust_begin_unwind
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/panicking.rs:652:5
   1: core::panicking::panic_fmt
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/panicking.rs:72:14
   2: polars_plan::logical_plan::conversion::expr_to_ir::to_aexpr_impl::{{closure}}
             at /home/sdrapeau/.cargo/registry/src/index.crates.io-6f17d22bba15001f/polars-plan-0.40.0/src/logical_plan/conversion/expr_to_ir.rs:367:33
   3: stacker::maybe_grow
             at /home/sdrapeau/.cargo/registry/src/index.crates.io-6f17d22bba15001f/stacker-0.1.15/src/lib.rs:55:9
   4: polars_plan::logical_plan::conversion::expr_to_ir::to_aexpr_impl
             at /home/sdrapeau/.cargo/registry/src/index.crates.io-6f17d22bba15001f/polars-plan-0.40.0/src/logical_plan/conversion/expr_to_ir.rs:108:1
   5: polars_plan::logical_plan::conversion::expr_to_ir::to_aexpr_impl_materialized_lit
             at /home/sdrapeau/.cargo/registry/src/index.crates.io-6f17d22bba15001f/polars-plan-0.40.0/src/logical_plan/conversion/expr_to_ir.rs:104:5
   6: polars_plan::logical_plan::conversion::expr_to_ir::to_aexpr
             at /home/sdrapeau/.cargo/registry/src/index.crates.io-6f17d22bba15001f/polars-plan-0.40.0/src/logical_plan/conversion/expr_to_ir.rs:26:5
   7: polars_plan::dsl::expr::Expr::to_field_amortized
             at /home/sdrapeau/.cargo/registry/src/index.crates.io-6f17d22bba15001f/polars-plan-0.40.0/src/dsl/expr.rs:313:20
   8: polars_plan::logical_plan::schema::det_join_schema
             at /home/sdrapeau/.cargo/registry/src/index.crates.io-6f17d22bba15001f/polars-plan-0.40.0/src/logical_plan/schema.rs:317:29
   9: polars_plan::logical_plan::conversion::dsl_to_ir::to_alp_impl::{{closure}}
             at /home/sdrapeau/.cargo/registry/src/index.crates.io-6f17d22bba15001f/polars-plan-0.40.0/src/logical_plan/conversion/dsl_to_ir.rs:354:17
  10: stacker::maybe_grow
             at /home/sdrapeau/.cargo/registry/src/index.crates.io-6f17d22bba15001f/stacker-0.1.15/src/lib.rs:55:9
  11: polars_plan::logical_plan::conversion::dsl_to_ir::to_alp_impl
             at /home/sdrapeau/.cargo/registry/src/index.crates.io-6f17d22bba15001f/polars-plan-0.40.0/src/logical_plan/conversion/dsl_to_ir.rs:59:1
  12: polars_plan::logical_plan::conversion::dsl_to_ir::to_alp
             at /home/sdrapeau/.cargo/registry/src/index.crates.io-6f17d22bba15001f/polars-plan-0.40.0/src/logical_plan/conversion/dsl_to_ir.rs:53:5
  13: polars_plan::logical_plan::optimizer::optimize
             at /home/sdrapeau/.cargo/registry/src/index.crates.io-6f17d22bba15001f/polars-plan-0.40.0/src/logical_plan/optimizer/mod.rs:94:22
  14: polars_lazy::frame::LazyFrame::optimize_with_scratch
             at /home/sdrapeau/.cargo/registry/src/index.crates.io-6f17d22bba15001f/polars-lazy-0.40.0/src/frame/mod.rs:542:22
  15: polars_lazy::frame::LazyFrame::prepare_collect_post_opt
             at /home/sdrapeau/.cargo/registry/src/index.crates.io-6f17d22bba15001f/polars-lazy-0.40.0/src/frame/mod.rs:595:13
  16: polars_lazy::frame::LazyFrame::_collect_post_opt
             at /home/sdrapeau/.cargo/registry/src/index.crates.io-6f17d22bba15001f/polars-lazy-0.40.0/src/frame/mod.rs:616:49
  17: polars_lazy::frame::LazyFrame::collect
             at /home/sdrapeau/.cargo/registry/src/index.crates.io-6f17d22bba15001f/polars-lazy-0.40.0/src/frame/mod.rs:646:9
  18: playground::main
             at ./src/main.rs:27:27
  19: core::ops::function::FnOnce::call_once
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

@coastalwhite coastalwhite added the A-panic Area: code that results in panic exceptions label Jun 17, 2024