Skip to content

feat: TPC-DS support#212

Merged
clflushopt merged 3 commits intomainfrom
cl/feat/tpcds
Jan 10, 2026
Merged

feat: TPC-DS support#212
clflushopt merged 3 commits intomainfrom
cl/feat/tpcds

Conversation

@clflushopt
Copy link
Owner

This PR adds TPC-DS support which was built initially out of tree in tpcdsgen

@alamb
Copy link
Collaborator

alamb commented Jan 7, 2026

This is on my list to review -- hopefully I'll have time to do it later today

Copy link
Collaborator

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👏 pretty epic @clflushopt --

I took a careful look at the github stuff and gave a cursory review of the other 220 files (!!) and I think we should proceed with merging it and file some follow on tasks

The code I read was well structured, well commented, and I think will be straightforward to maintain / update as needed. Well done.

Some ideas for follow on tasks

  1. Work on conformance testing with C (e.g. #51 (comment))
  2. Improve CI checks (e.g. figure out how to avoid having to maintain all the java setup, etc)
  3. start integrating into the cli

@alamb
Copy link
Collaborator

alamb commented Jan 7, 2026

I noticed several comments about "bugs in the C code" which I assume come from the Java port. It wasn't clear to me of the java people "fixed" the bugs in the C code -- if they did this could be the source of the differences between the C implementation and the rust one

I would personally prefer to follow the C semantics (bugs and all) as that is the 'source of truth' but that can be a discussion we have after his is merged

Thank you again -- this is pretty epic

@clflushopt
Copy link
Owner Author

I noticed several comments about "bugs in the C code" which I assume come from the Java port. It wasn't clear to me of the java people "fixed" the bugs in the C code -- if they did this could be the source of the differences between the C implementation and the rust one

I would personally prefer to follow the C semantics (bugs and all) as that is the 'source of truth' but that can be a discussion we have after his is merged

Thank you again -- this is pretty epic

Actually the documented bugs are not fixed in the Java version and are not fixed in the Rust implementation as well they are just documented.
My hunch tells me that there are more bugs in the way the RNG streams synchronization logic works in the C implementation that were fixed in the Java version "accidently" maybe because of language/runtime differences (probably an overflow/underflow or some signedness non-sense). In any case I will get to the root cause and fix it.

Meanwhile thanks for the review, I will address the items for CI in the same PR (it's easier this way).

@clflushopt
Copy link
Owner Author

clflushopt commented Jan 10, 2026

@alamb I addressed all comments. Thanks again for the review !

Edit: I'll resolve all the comments to merge cleanly

@clflushopt clflushopt merged commit 9c6f9b6 into main Jan 10, 2026
24 checks passed
@comphead
Copy link

One thing to test folks: once the data is generated it would be good to see how many TPCDS queries return non empty results. Currently it is ~15% with blank results, nice if this generator could improve the ratio

@clflushopt clflushopt deleted the cl/feat/tpcds branch January 14, 2026 05:24
@Matt711 Matt711 mentioned this pull request Jan 14, 2026
kevinjqliu added a commit that referenced this pull request Jan 17, 2026
One more file type exclusion after #212 

- xref #214
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants