@natashasehgal
Contributor

Differential Revision: D87953011

Heidi Han added 4 commits November 25, 2025 14:42
Summary:
Pull Request resolved: facebookincubator#15573

Move HllAccumulator out of HyperloglogAggregates and into HllUtils so that KHyperLogLog can reuse it. HllAccumulator switches between the sparse and dense HLL representations and provides merge, insertHash, and cardinality, each of which handles both representations. The KHLL implementation needs the same functionality.
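
For readers unfamiliar with the sparse/dense split, the following is a minimal toy sketch of the idea, not the Velox class. The name `ToyHllAccumulator`, the sparse representation (a set of raw hashes), the `sparseLimit` threshold, and the default of 11 index bits are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Toy model only: names, the sparse layout, and the switch threshold are
// assumptions for illustration, not Velox's HllAccumulator.
class ToyHllAccumulator {
 public:
  explicit ToyHllAccumulator(int indexBits = 11, size_t sparseLimit = 64)
      : indexBits_(indexBits), sparseLimit_(sparseLimit) {
    assert(indexBits >= 7 && indexBits <= 16);
  }

  bool isSparse() const { return registers_.empty(); }

  void insertHash(uint64_t hash) {
    if (!isSparse()) {
      updateRegister(hash);
      return;
    }
    sparse_.insert(hash);
    if (sparse_.size() > sparseLimit_) {
      toDense();  // Too many entries: switch representation.
    }
  }

  // Merges another accumulator, covering every sparse/dense combination.
  void mergeWith(const ToyHllAccumulator& other) {
    assert(indexBits_ == other.indexBits_);
    if (other.isSparse()) {
      for (uint64_t hash : other.sparse_) {
        insertHash(hash);
      }
      return;
    }
    if (isSparse()) {
      toDense();
    }
    for (size_t i = 0; i < registers_.size(); ++i) {
      registers_[i] = std::max(registers_[i], other.registers_[i]);
    }
  }

  int64_t cardinality() const {
    if (isSparse()) {
      return static_cast<int64_t>(sparse_.size());  // Exact while sparse.
    }
    const double m = static_cast<double>(registers_.size());
    double sum = 0;
    int zeros = 0;
    for (uint8_t r : registers_) {
      sum += std::ldexp(1.0, -r);
      zeros += (r == 0);
    }
    double estimate = 0.7213 / (1.0 + 1.079 / m) * m * m / sum;
    if (estimate <= 2.5 * m && zeros > 0) {
      estimate = m * std::log(m / zeros);  // Small-range (linear counting).
    }
    return std::llround(estimate);
  }

 private:
  void toDense() {
    registers_.assign(size_t{1} << indexBits_, 0);
    for (uint64_t hash : sparse_) {
      updateRegister(hash);
    }
    sparse_.clear();
  }

  // Top indexBits_ bits pick the register; the rank is the 1-based position
  // of the first set bit in the remaining bits.
  void updateRegister(uint64_t hash) {
    const size_t index = hash >> (64 - indexBits_);
    uint64_t rest = hash << indexBits_;
    uint8_t rank = 1;
    while (rank <= 64 - indexBits_ && (rest & (uint64_t{1} << 63)) == 0) {
      ++rank;
      rest <<= 1;
    }
    registers_[index] = std::max(registers_[index], rank);
  }

  int indexBits_;
  size_t sparseLimit_;
  std::set<uint64_t> sparse_;
  std::vector<uint8_t> registers_;
};
```

Callers only see insertHash, mergeWith, and cardinality; the representation switch stays internal, which is what makes such an accumulator reusable from another sketch type.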

Differential Revision: D87486444

Summary:
Pull Request resolved: facebookincubator#15594

Add functionality to HllAccumulator to accommodate KHLL:
- mergeWith, which takes a deserialized HllAccumulator
- a template parameter, typename TAllocator, so that both HashStringAllocator and MemoryPool can be used; the default is HashStringAllocator (which the sparse and dense HLLs use)

Differential Revision: D87591323
@netlify

netlify bot commented Nov 27, 2025

Deploy Preview for meta-velox canceled.

🔨 Latest commit: e82ebe3
🔍 Latest deploy log: https://app.netlify.com/projects/meta-velox/deploys/69280763a6e8150008590ec3

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Nov 27, 2025
Natasha Sehgal and others added 2 commits November 26, 2025 23:29
Summary:
This adds a KHyperLogLogInputGenerator for fuzzing KHyperLogLog functions. The generator creates valid serialized KHyperLogLog structures by:

1. Randomly selecting a base type (BIGINT, VARCHAR, DOUBLE, or UNKNOWN)
2. Generating random join keys and UII pairs based on the base type
3. Adding them to a KHyperLogLog instance with random maxSize and hllBuckets
4. Serializing the result as VARBINARY

The generator supports configurable null ratio and minimum number of values, making it suitable for comprehensive fuzzing of KHyperLogLog-related functions. This enables better test coverage for the KHyperLogLog scalar UDFs added in the stack.
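
The four steps above can be sketched roughly as follows. This is a standalone toy, not the KHyperLogLogInputGenerator itself: `ToyBaseType`, `ToyOptions`, `generateToySketch`, the value ranges, and the byte layout are all hypothetical, and the real generator builds and serializes an actual KHyperLogLog instance rather than writing raw pairs.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <optional>
#include <random>
#include <string>

// Hypothetical stand-ins for illustration only; none of these are Velox APIs.
enum class ToyBaseType : uint8_t { kBigint, kVarchar, kDouble, kUnknown };

struct ToyOptions {
  double nullRatio = 0.1;    // Probability of producing a null row.
  int32_t minNumValues = 1;  // Lower bound on generated (join key, UII) pairs.
};

// Appends the raw bytes of a trivially copyable value to `out`.
template <typename T>
void appendBytes(std::string& out, const T& value) {
  char buffer[sizeof(T)];
  std::memcpy(buffer, &value, sizeof(T));
  out.append(buffer, sizeof(T));
}

// Returns std::nullopt for a null row, otherwise bytes in a made-up layout:
// [baseType:1][maxSize:4][hllBuckets:4][numPairs:4][(joinKey:8, uii:8)...].
// Every key is modeled as a 64-bit value regardless of base type.
std::optional<std::string> generateToySketch(
    std::mt19937_64& rng,
    const ToyOptions& options) {
  assert(options.minNumValues >= 0);
  if (std::bernoulli_distribution(options.nullRatio)(rng)) {
    return std::nullopt;
  }
  // Step 1: pick a base type at random.
  const auto baseType = static_cast<ToyBaseType>(
      std::uniform_int_distribution<int>(0, 3)(rng));
  // Step 3 parameters: random maxSize and a power-of-two bucket count
  // (both ranges are assumptions).
  const int32_t maxSize = std::uniform_int_distribution<int32_t>(1, 4096)(rng);
  const int32_t hllBuckets =
      int32_t{1} << std::uniform_int_distribution<int>(4, 12)(rng);
  const int32_t numPairs = options.minNumValues +
      std::uniform_int_distribution<int32_t>(0, 100)(rng);

  // Step 4: serialize the header, then the pairs.
  std::string out;
  appendBytes(out, static_cast<uint8_t>(baseType));
  appendBytes(out, maxSize);
  appendBytes(out, hllBuckets);
  appendBytes(out, numPairs);
  // Step 2: random (join key, UII) pairs.
  std::uniform_int_distribution<int64_t> valueDist;
  for (int32_t i = 0; i < numPairs; ++i) {
    appendBytes(out, valueDist(rng));  // join key
    appendBytes(out, valueDist(rng));  // UII
  }
  return out;
}
```

Seeding the random engine outside the function keeps a failing fuzzer input reproducible from the seed alone.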

Differential Revision: D87950763
Differential Revision: D87100364
@meta-codesync

meta-codesync bot commented Nov 27, 2025

@natashasehgal has exported this pull request. If you are a Meta employee, you can view the originating Diff in D87953011.

natashasehgal added a commit to natashasehgal/velox that referenced this pull request Nov 27, 2025
Summary:
Pull Request resolved: facebookincubator#15651

- Add a KHLL result verifier for HLL aggregate functions
- Deserialize KHLL sketches and compare their cardinality estimates
- Allow a 5% relative error tolerance
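
A minimal sketch of what a 5% relative error check could look like is shown here. The helper name `withinRelativeError` and its handling of a zero expected value are assumptions; the verifier in this PR may compare estimates differently.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper, not the verifier's actual code: returns true when
// `actual` differs from `expected` by at most `tolerance` relative error.
bool withinRelativeError(
    int64_t expected,
    int64_t actual,
    double tolerance = 0.05) {
  assert(tolerance >= 0.0);
  if (expected == 0) {
    // Relative error is undefined at zero, so require an exact match.
    return actual == 0;
  }
  const double diff =
      std::abs(static_cast<double>(actual) - static_cast<double>(expected));
  return diff <= tolerance * std::abs(static_cast<double>(expected));
}
```

With the default tolerance, estimates of 1000 and 1040 compare as equal, while 1000 and 1060 do not.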

Differential Revision: D87953011
Labels

CLA Signed, fb-exported, meta-exported
