datafusion-cli: results of group by aggregation query not showing #8702

Closed
Jefffrey opened this issue Jan 1, 2024 · 7 comments · Fixed by #8895
Labels
bug Something isn't working regression Something that used to work no longer does

Comments


Jefffrey commented Jan 1, 2024

Describe the bug

datafusion-cli does not print results for queries that select a count together with a GROUP BY clause. It reports the correct row count but shows no table.

To Reproduce

Given input file:

col1,col2
A,
B,b
C,c

Querying this file in datafusion-cli:

DataFusion CLI v34.0.0
❯ CREATE EXTERNAL TABLE kumachan STORED AS CSV WITH HEADER ROW LOCATION '/home/jeffrey/Downloads/kumachan.csv';
0 rows in set. Query took 0.029 seconds.

❯ select * from kumachan;
+------+------+
| col1 | col2 |
+------+------+
| A    |      |
| B    | b    |
| C    | c    |
+------+------+
3 rows in set. Query took 0.005 seconds.

❯ select count(*) from kumachan group by col1;
3 rows in set. Query took 0.010 seconds.

❯

Expected behavior

Should get results like:

+----------+
| COUNT(*) |
+----------+
| 1        |
| 1        |
| 1        |
+----------+

Additional context

No response

@Jefffrey Jefffrey added the bug Something isn't working label Jan 1, 2024

alamb commented Jan 1, 2024

Maybe some other fallout from #8651 -- I had some work in progress to try and consolidate / clean up the output code more in https://github.com/alamb/arrow-datafusion/tree/alamb/cli-cleanup but I have not finished it

@alamb alamb added the regression Something that used to work no longer does label Jan 16, 2024

alamb commented Jan 16, 2024

I think this is a regression and we should try and fix it before datafusion 35 release #8863 -- cc @andygrove

@alamb alamb mentioned this issue Jan 16, 2024

alamb commented Jan 16, 2024

I can try to help debug this tomorrow if no one else beats me to it

Jefffrey commented

Problem is here:

https://github.com/apache/arrow-datafusion/blob/ffaa67904ed0ca454267ccc5832582bcb669a5c0/datafusion-cli/src/print_format.rs#L164

Specifically, the batches[0].num_rows() == 0 condition.

It seems my query produces some empty RecordBatches at the start. Because the check only looks at the first batch, it returns early and skips printing the remaining RecordBatches, which may contain data.
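
To illustrate the failure mode described above, here is a minimal sketch that does not depend on Arrow or DataFusion. `Batch` is a hypothetical stand-in for `RecordBatch`, and both function names are made up for this example. The second function shows one possible way to avoid the problem (dropping empty batches before deciding whether to print); it is an assumption, not necessarily what the eventual patch does.

```rust
/// Hypothetical stand-in for Arrow's `RecordBatch`; only the row count matters here.
#[derive(Clone, Debug, PartialEq)]
struct Batch {
    rows: Vec<String>,
}

impl Batch {
    fn num_rows(&self) -> usize {
        self.rows.len()
    }
}

/// Early-exit check as described in this issue: it inspects only the first batch.
/// Returns the number of rows that would be printed.
fn rows_printed_first_batch_check(batches: &[Batch]) -> usize {
    if batches.is_empty() || batches[0].num_rows() == 0 {
        // Nothing is printed, even if later batches hold rows.
        return 0;
    }
    batches.iter().map(|b| b.num_rows()).sum()
}

/// One possible alternative (an assumption): skip empty batches first,
/// then return early only if no batch has any rows.
fn rows_printed_skip_empty(batches: &[Batch]) -> usize {
    let non_empty: Vec<&Batch> = batches.iter().filter(|b| b.num_rows() > 0).collect();
    if non_empty.is_empty() {
        return 0;
    }
    non_empty.iter().map(|b| b.num_rows()).sum()
}
```

With an empty batch followed by a three-row batch, the first function reports zero printed rows while the second reports three, which matches the "3 rows in set" line with no table shown in the reproduction.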


alamb commented Jan 17, 2024

Thanks @Jefffrey -- I will make a patch


alamb commented Jan 17, 2024

This appears to have been introduced in #8651


alamb commented Jan 17, 2024

PR ready for review: #8895
