Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(storage): support dict encoding #668

Merged
merged 5 commits into from
Jul 26, 2022
Merged

Conversation

Kikkon
Copy link
Contributor

@Kikkon Kikkon commented Jul 16, 2022

Signed-off-by: kikkon nian920@outlook.com

Close #639

Signed-off-by: kikkon <nian920@outlook.com>
Signed-off-by: kikkon <nian920@outlook.com>
@Kikkon
Copy link
Contributor Author

Kikkon commented Jul 16, 2022

I think I got into trouble in this pr. I used a HashMap to implement the dict encoding, but in primitive_column_builder, I can't implement eq trait for f64.
Maybe I can refer to this post using ordered-float: https://internals.rust-lang.org/t/f32-f64-should-implement-hash/5436/4
any other better suggestions?

@skyzh skyzh self-requested a review July 16, 2022 16:50
@skyzh
Copy link
Member

skyzh commented Jul 20, 2022

I think I got into trouble in this pr. I used a HashMap to implement the dict encoding, but in primitive_column_builder, I can't implement eq trait for f64.

Yes, I'd be totally agree on that! We'd better change all f64 occurrences in RisingLight to use ordered_float. I guess it would be a large refactor, so we can simply ignore float for now, and support it later.

Copy link
Member

@skyzh skyzh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rest LGTM, good work!

proto/src/proto/rowset.proto Outdated Show resolved Hide resolved
src/storage/secondary/block/dict_block_builder.rs Outdated Show resolved Hide resolved
Signed-off-by: kikkon <nian920@outlook.com>
@Kikkon Kikkon changed the title [WIP]feat(storage): support dict encoding feat(storage): support dict encoding Jul 21, 2022
@Kikkon Kikkon requested review from skyzh and xxchan July 22, 2022 01:50
Copy link
Member

@skyzh skyzh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rest LGTM! For the next step, we can see if dict encoding can help us save space for TPC-H workloads.

src/storage/secondary/block/dict_block_builder.rs Outdated Show resolved Hide resolved
@skyzh
Copy link
Member

skyzh commented Jul 26, 2022

Also we can consider using dict encoding when distinct_value statistics is small for a given column. This would be very helpful in the future.

Signed-off-by: kikkon <nian920@outlook.com>
@skyzh skyzh enabled auto-merge (squash) July 26, 2022 15:38
@skyzh skyzh merged commit ddf9971 into risinglightdb:main Jul 26, 2022
@Kikkon Kikkon deleted the kikkon_dev branch August 21, 2022 03:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

storage: support dict encoding
3 participants