Closed
Description
I'd like to contribute an implementation of ReservoirLongsSketch to the Go library. This would address the ❌ status for "ReservoirLongsSketch" in the README.
Proposed Implementation
Based on the Java reference implementation, I've created:
| File | Description |
|---|---|
| sampling/reservoir_longs_sketch.go | Core reservoir sampling for int64 values |
| sampling/reservoir_longs_union.go | Union for merging multiple sketches |
| sampling/reservoir_longs_sketch_test.go | Unit tests (11 tests) |
| examples/reservoir_example_test.go | Usage examples |
Algorithm
The classic Reservoir Sampling algorithm (Vitter's Algorithm R):
- Initial Phase (fewer than k items seen): Store every item
- Steady State (reservoir full): The n-th item seen replaces a uniformly chosen stored item with probability k/n; otherwise it is discarded
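To make the two phases concrete, here is a minimal, self-contained sketch of Algorithm R in Go. It is not the code from the proposed PR: the `Reservoir` type, its fields, and the `NewReservoir`, `Update`, and `Samples` names are hypothetical stand-ins chosen for illustration, and the sketch assumes k ≥ 0 with no argument validation. It draws one uniform index in [0, n) per update; the index falls below k with probability k/n, and when it does it also serves as the uniformly chosen slot to overwrite.

```go
package main

import (
	"fmt"
	"math/rand"
)

// Reservoir is a hypothetical, simplified stand-in for the proposed sketch.
type Reservoir struct {
	k    int        // maximum number of samples kept
	n    int64      // total number of items seen so far
	data []int64    // current sample, len(data) <= k
	rng  *rand.Rand // seeded source so runs are reproducible
}

// NewReservoir creates an empty reservoir of capacity k (assumed k >= 0).
func NewReservoir(k int, seed int64) *Reservoir {
	return &Reservoir{
		k:    k,
		data: make([]int64, 0, k),
		rng:  rand.New(rand.NewSource(seed)),
	}
}

// Update offers one item to the reservoir.
func (r *Reservoir) Update(item int64) {
	r.n++
	// Initial phase: the reservoir is not full yet, so keep the item.
	if len(r.data) < r.k {
		r.data = append(r.data, item)
		return
	}
	// Steady state: j is uniform in [0, n), so j < k happens with
	// probability k/n, and in that case j is a uniform slot index.
	if j := r.rng.Int63n(r.n); j < int64(r.k) {
		r.data[j] = item
	}
}

// Samples returns a copy of the current sample.
func (r *Reservoir) Samples() []int64 {
	out := make([]int64, len(r.data))
	copy(out, r.data)
	return out
}

func main() {
	r := NewReservoir(10, 1)
	for i := int64(0); i < 1000; i++ {
		r.Update(i)
	}
	fmt.Println(len(r.Samples()), r.n) // prints "10 1000"
}
```

Using a single random draw per update is a design choice for brevity; drawing an acceptance value and a slot index separately is equally valid and yields the same distribution.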
API
```go
// Create sketch with capacity k
sketch, _ := sampling.NewReservoirLongsSketch(10)
// Add items
sketch.Update(42)
// Get uniform random sample
samples := sketch.GetSamples()
// Serialization
bytes, _ := sketch.ToByteArray()
restored, _ := sampling.NewReservoirLongsSketchFromSlice(bytes)
```
Feedback Requested
I have a working implementation ready. Before submitting a PR, I'd appreciate feedback on:
- Serialization Format: I followed the general pattern from PreambleUtil.java. Should I verify cross-language compatibility with specific test cases?
- Scope: Should I include ReservoirItemsSketch<T> (generic version) in the same PR, or keep it as a separate contribution?
- Any design concerns with the current approach?
Testing
All tests pass locally:
```shell
go test -v ./sampling/ ./examples/
# 13 tests pass
```
I'm happy to adjust the implementation based on your feedback!