@natashasehgal
Contributor

@netlify

netlify bot commented Nov 25, 2025

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit d2249db
🔍 Latest deploy log https://app.netlify.com/projects/meta-velox/deploys/692604e625c473000782bde5

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Nov 25, 2025
@meta-codesync

meta-codesync bot commented Nov 25, 2025

@natashasehgal has exported this pull request. If you are a Meta employee, you can view the originating Diff in D87832343.

Heidi Han and others added 5 commits November 25, 2025 09:52
Summary:
Pull Request resolved: facebookincubator#15573

Move HllAccumulator from HyperloglogAggregates to HllUtils so that it can be reused in Khyperloglog. HllAccumulator switches between the sparse and dense HLL representations and provides functions such as merge, insertHash, and cardinality that handle both representations. The KHLL implementation needs the same functionality.
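
To illustrate the sparse-to-dense switch described above, here is a minimal, self-contained sketch. It is not the Velox HllAccumulator: the class name `ToyHllAccumulator`, the `sparseLimit` threshold, and the `registerValue` accessor are hypothetical, and cardinality estimation is left out to keep the example short.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical toy accumulator. Names and the switch threshold are
// illustrative assumptions, not taken from Velox.
class ToyHllAccumulator {
 public:
  ToyHllAccumulator(int indexBits, std::size_t sparseLimit)
      : indexBits_(indexBits), sparseLimit_(sparseLimit) {
    assert(indexBits >= 1 && indexBits <= 16);
  }

  // The top indexBits_ bits pick the register; the rest determine the rank.
  void insertHash(uint64_t hash) {
    const uint32_t index = static_cast<uint32_t>(hash >> (64 - indexBits_));
    update(index, rankOf(hash));
  }

  // Works for any combination of sparse and dense inputs.
  void mergeWith(const ToyHllAccumulator& other) {
    assert(indexBits_ == other.indexBits_);
    if (other.isSparse_) {
      for (const auto& entry : other.sparse_) {
        update(entry.first, entry.second);
      }
    } else {
      for (uint32_t i = 0; i < other.dense_.size(); ++i) {
        if (other.dense_[i] > 0) {
          update(i, other.dense_[i]);
        }
      }
    }
  }

  bool isSparse() const { return isSparse_; }

  uint8_t registerValue(uint32_t index) const {
    if (!isSparse_) {
      return dense_[index];
    }
    const auto it = sparse_.find(index);
    return it == sparse_.end() ? 0 : it->second;
  }

 private:
  // Rank = 1 + number of leading zeros in the bits after the index.
  uint8_t rankOf(uint64_t hash) const {
    const int restBits = 64 - indexBits_;
    uint64_t rest = hash << indexBits_;
    int rank = 1;
    while (rank <= restBits && (rest >> 63) == 0) {
      rest <<= 1;
      ++rank;
    }
    return static_cast<uint8_t>(rank);
  }

  void update(uint32_t index, uint8_t rank) {
    if (!isSparse_) {
      dense_[index] = std::max(dense_[index], rank);
      return;
    }
    uint8_t& slot = sparse_[index];
    slot = std::max(slot, rank);
    if (sparse_.size() > sparseLimit_) {
      convertToDense();
    }
  }

  // Copies the sparse entries into a full register array.
  void convertToDense() {
    dense_.assign(std::size_t{1} << indexBits_, 0);
    for (const auto& entry : sparse_) {
      dense_[entry.first] = entry.second;
    }
    sparse_.clear();
    isSparse_ = false;
  }

  int indexBits_;
  std::size_t sparseLimit_;
  bool isSparse_ = true;
  std::map<uint32_t, uint8_t> sparse_;
  std::vector<uint8_t> dense_;
};
```

Callers only see `insertHash` and `mergeWith`; the representation change happens inside `update`, which is the property that makes such an accumulator reusable from another sketch type.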

Differential Revision: D87486444

Summary:
Pull Request resolved: facebookincubator#15594

Add functionality to HllAccumulator to accommodate KHLL:
- mergeWith, which takes a deserialized HllAccumulator
- a template parameter, typename TAllocator, to allow use of either HashStringAllocator or MemoryPool, defaulting to HashStringAllocator (as the sparse and dense HLLs do)
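
The allocator template parameter with a default can be sketched as follows. This is an illustration of the general pattern only: `ToyArenaAllocator`, `ToyPoolAllocator`, and `ToyRegisters` are hypothetical stand-ins, and their `allocate`/`free` signatures are assumptions rather than the HashStringAllocator or MemoryPool interfaces.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical allocator that only counts total bytes handed out.
struct ToyArenaAllocator {
  std::size_t totalBytes = 0;
  void* allocate(std::size_t bytes) {
    totalBytes += bytes;
    return std::malloc(bytes);
  }
  void free(void* p, std::size_t /*bytes*/) { std::free(p); }
};

// Hypothetical allocator that tracks bytes currently in use.
struct ToyPoolAllocator {
  std::size_t liveBytes = 0;
  void* allocate(std::size_t bytes) {
    liveBytes += bytes;
    return std::malloc(bytes);
  }
  void free(void* p, std::size_t bytes) {
    liveBytes -= bytes;
    std::free(p);
  }
};

// The default template argument keeps existing call sites unchanged,
// while new callers can opt into a different allocator type.
template <typename TAllocator = ToyArenaAllocator>
class ToyRegisters {
 public:
  ToyRegisters(TAllocator* allocator, std::size_t count)
      : allocator_(allocator),
        count_(count),
        data_(static_cast<int8_t*>(allocator->allocate(count))) {
    std::memset(data_, 0, count_);
  }
  ~ToyRegisters() { allocator_->free(data_, count_); }

  ToyRegisters(const ToyRegisters&) = delete;
  ToyRegisters& operator=(const ToyRegisters&) = delete;

  std::size_t size() const { return count_; }
  int8_t get(std::size_t i) const { return data_[i]; }
  void set(std::size_t i, int8_t value) { data_[i] = value; }

 private:
  TAllocator* allocator_;
  std::size_t count_;
  int8_t* data_;
};
```

With this shape, `ToyRegisters<>` uses the default allocator and `ToyRegisters<ToyPoolAllocator>` selects the alternative; both allocator types only need to offer the same two member functions.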

Differential Revision: D87591323
Summary:
Pull Request resolved: facebookincubator#15590

## Main Methods

| **Java Method**                                   | **C++ Method**                                   | **Description**                                                        |
|---------------------------------------------------|--------------------------------------------------|-----------------------------------------------------------------------|
| `SetDigest()`                                     | `SetDigest(HashStringAllocator*)`                | Constructor - Java uses default params, C++ requires allocator        |
| `void add(long)`                                  | `void add(int64_t)`                              | Add integer value to digest                                           |
| `void add(Slice)`                                 | `void add(StringView)`                           | Add string/binary value to digest                                     |
| `Slice serialize()`                               | `void serialize(char*)`                          | Serialize to bytes (Java returns Slice, C++ writes to buffer)         |
| `static SetDigest newInstance(Slice)`             | `void deserialize(const char*, int32_t)`         | Deserialize from bytes (Java static factory, C++ instance method)     |
| `void mergeWith(SetDigest)`                       | `void mergeWith(const SetDigest&)`               | Merge two digests                                                     |
| `boolean isExact()`                              | `bool isExact() const`                           | Check if digest is exact or approximate                               |
| `long cardinality()`                              | `int64_t cardinality() const`                    | Get distinct element count                                            |
| `static long exactIntersectionCardinality(...)`    | `static int64_t exactIntersectionCardinality(...)`| Calculate exact intersection size                                     |
| `static double jaccardIndex(...)`                 | `static double jaccardIndex(...)`                | Calculate Jaccard similarity coefficient                              |
| `Map<Long, Short> getHashCounts()`                | `std::map<int64_t, int16_t> getHashCounts() const`| Get MinHash map                                                      |
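
For readers unfamiliar with the two set-comparison functions in the table, the sketch below shows the textbook definitions on plain `std::set<int64_t>` inputs. It is not the SetDigest implementation (the MinHash map in the table suggests the real digest can work from hash counts instead of raw values), and the function names and the empty-set convention are assumptions made for illustration.

```cpp
#include <cstdint>
#include <set>

// Counts elements present in both sets by probing the larger set
// with each element of the smaller one.
int64_t toyExactIntersectionCardinality(
    const std::set<int64_t>& a,
    const std::set<int64_t>& b) {
  const std::set<int64_t>& smaller = a.size() <= b.size() ? a : b;
  const std::set<int64_t>& larger = a.size() <= b.size() ? b : a;
  int64_t count = 0;
  for (int64_t value : smaller) {
    if (larger.count(value) > 0) {
      ++count;
    }
  }
  return count;
}

// Jaccard index: |A intersect B| / |A union B|.
// Assumption: two empty sets are treated as identical (index 1.0).
double toyJaccardIndex(
    const std::set<int64_t>& a,
    const std::set<int64_t>& b) {
  const int64_t intersection = toyExactIntersectionCardinality(a, b);
  const int64_t unionSize =
      static_cast<int64_t>(a.size() + b.size()) - intersection;
  if (unionSize == 0) {
    return 1.0;
  }
  return static_cast<double>(intersection) / static_cast<double>(unionSize);
}
```

For example, the sets {1, 2, 3} and {2, 3, 4} share two elements out of four distinct values, giving a Jaccard index of 0.5.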

 ---

## Size Estimation Functions

| **Java Method**                  | **C++ Method**                        |
|----------------------------------|---------------------------------------|
| `int estimatedSerializedSize()`  | n/a  |
| `int estimatedInMemorySize()`    | `int32_t estimatedInMemorySize() const`|

 ---

## Additional Methods

| **Java Only**                                 | **C++ Only**                        |
|-----------------------------------------------|-------------------------------------|
| `HyperLogLog getHll()`                        | `void setIndexBitLength(int8_t)`    |
| `SetDigest(int, int)` constructor             |                                     |
| `void convertToDense()` (private)             |                                     |
| `SetDigest(int, HLL, Map)` constructor        |                                     |

Differential Revision: D87376975
…15619)

Summary:
Pull Request resolved: facebookincubator#15619

feat: Add SetDigest result verifier to be used in functions with SetDigest output

Differential Revision: D87835419

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. fb-exported meta-exported
