Diff on parquet filter agg #11257

Open
zml1206 opened this issue Oct 15, 2024 · 3 comments
Labels: bug, triage

Comments

zml1206 (Contributor) commented Oct 15, 2024

Bug description

Gluten must be disabled while writing the Parquet file.

spark.sql("set spark.gluten.enabled=false")
spark.range(1000).selectExpr("id%2 as c1", "id%5 as c2", "id as c3").write.mode("overwrite").parquet("tmp/t1")
spark.sql("set spark.gluten.enabled=true")
spark.read.parquet("tmp/t1").createOrReplaceTempView("t1")
spark.sql("select c2, sum(c3) from t1 where c1 = 1 group by c2").show

result

+---+---------------+
| c2|        sum(c3)|
+---+---------------+
|  0|559882429285360|
|  1|559885503421750|
|  3|839826576815406|
|  2|839827141809990|
|  4|559885785918562|
+---+---------------+
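
For reference, the sums a correct run should return can be worked out without Spark or Gluten. The following is a minimal sketch in plain Scala that assumes tmp/t1 holds exactly the 1000 rows written by the snippet above; the name `expected` is only for illustration and does not come from the issue.

```scala
// Plain Scala collections only; no Spark or Gluten involved.
// Mirrors: select c2, sum(c3) from t1 where c1 = 1 group by c2
// Assumption: the table contains exactly the ids 0..999 from spark.range(1000).
val expected: Map[Long, Long] =
  (0L until 1000L)                            // same ids as spark.range(1000)
    .filter(_ % 2 == 1)                       // where c1 = 1 (c1 is id % 2)
    .groupBy(_ % 5)                           // group by c2 (c2 is id % 5)
    .map { case (c2, ids) => c2 -> ids.sum }  // sum(c3) (c3 is id)

// Print the groups in a stable order for comparison with the table above.
expected.toSeq.sortBy(_._1).foreach { case (c2, sum) => println(s"$c2 -> $sum") }
```

Under that assumption the five groups come out as 0 -> 50000, 1 -> 49600, 2 -> 50200, 3 -> 49800 and 4 -> 50400, which add up to 250000 (the sum of all odd ids below 1000). The values in the table above are many orders of magnitude larger.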

Testing shows that #11010 introduced the problem; the query works again after reverting it.

System information

Velox System Info v0.0.2
Commit: 2883361
CMake Version: 3.28.3
System: Linux-5.15.0-113-generic
Arch: x86_64
C++ Compiler: /usr/bin/c++
C++ Compiler Version: 11.4.0
C Compiler: /usr/bin/cc
C Compiler Version: 11.4.0
CMake Prefix Path: /usr/local;/usr;/;/usr/local/lib/python3.10/dist-packages/cmake/data;/usr/local;/usr/X11R6;/usr/pkg;/opt

Relevant logs

No response

zml1206 added the bug and triage labels Oct 15, 2024
zml1206 (Contributor, Author) commented Oct 15, 2024

cc @Yuhta

Yuhta (Contributor) commented Oct 16, 2024

Can you upload the tmp/t1 here?

zml1206 (Contributor, Author) commented Oct 17, 2024

> Can you upload the tmp/t1 here?

t1.tar.gz
Also, using spark.range(1000) makes it easier to reproduce. @Yuhta
