SortMergeJoin: The query stuck when join filter is set and more matched rows than batch size #10491

Closed
@comphead

Description

@comphead
I'm not able to run such a test because of another SMJ issue that is unrelated to this PR:

If the join filter is set and, for the same streaming index, the number of matching rows is greater than or equal to the batch size, the query just hangs. The problem is likely in the polling state, and it can be easily reproduced on the main branch.

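    use datafusion::error::Result;
    use datafusion::prelude::SessionContext;

    #[tokio::test]
    async fn test_11() -> Result<()> {
        let ctx: SessionContext = SessionContext::new();

        // Disable the hash join preference so the planner picks a sort-merge join.
        let sql = "set datafusion.optimizer.prefer_hash_join = false;";
        let _ = ctx.sql(sql).await?.collect().await?;

        // With a batch size of 1, a single matching row already reaches the batch size.
        let sql = "set datafusion.execution.batch_size = 1";
        let _ = ctx.sql(sql).await?.collect().await?;

        // Equi-join on t1.a = t2.b with the additional predicate t1.a > t2.b.
        let sql = "
        select * from (
        with
        t1 as (
            select 12 a, 12 b
            ),
        t2 as (
            select 12 a, 12 b
            )
            select t1.* from t1 join t2 on t1.a = t2.b where t1.a > t2.b
        ) order by 1, 2;
        ";

        // The query hangs here and never returns.
        let _actual = ctx.sql(sql).await?.collect().await?;

        Ok(())
    }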
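
Because the test above hangs instead of failing, it may help to put a time limit around the stuck call so that a test run reports an error rather than blocking forever. The following is only a minimal sketch using the standard library: `run_with_timeout` is a hypothetical helper, not part of DataFusion or tokio, and it does not cancel the stuck work, it only stops waiting for it. In an async test, wrapping the query future in `tokio::time::timeout` would probably be the more natural choice.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical helper: runs `work` on a separate thread and waits at most
/// `limit` for its result.
///
/// Returns `Some(result)` if the work finished in time, and `None` if it did
/// not (for example because it hung) or if the worker thread panicked.
/// A worker that timed out is not cancelled; it is left running detached.
fn run_with_timeout<T, F>(limit: Duration, work: F) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The send fails only if the receiver was dropped after a timeout,
        // in which case nobody is interested in the result any more.
        let _ = tx.send(work());
    });
    // `recv_timeout` returns an error on timeout or if the sender was
    // dropped without sending (worker panic); both map to `None`.
    rx.recv_timeout(limit).ok()
}
```

A test could then assert that `run_with_timeout(Duration::from_secs(10), ...)` returns `Some(_)`, so a hang shows up as a failed assertion. The 10-second limit is an arbitrary illustrative value.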

I'll file a separate issue for this one, but perhaps we can go ahead with this PR, because the potential problem you are talking about can never be hit due to the issue above.

Originally posted by @comphead in #10304 (comment)
