Skip to content
This repository has been archived by the owner on Jul 30, 2024. It is now read-only.

BloomFilter (and ScalableBF) should support set operations (intersection, union ... ) #2

Open
stevvooe opened this issue Jan 12, 2010 · 2 comments

Comments

@stevvooe
Copy link

I am not particularly sure that the backend algorithm can reliably support this, but it would be nice if the BloomFilter implementation could support the operations of the set primitive. Specifically:

>>> a = BloomFilter(10)
>>> a.add(1)
>>> b = BloomFilter(10)
>>> b.add(2)
>>> c = a | b
>>> 1 in c
True
>>> 2 in c
True
>>> 3 in c
False

I find sets to be incredibly useful in general coding practice, as it allows one to work naturally on data. Having a proxy to a set of data, without having to load said data, would increase the size of problems that one could easily work on, especially when one is more interested in set membership than the actual members.

This is just a feature suggestion (not even a request really), so no hurry. If I have time, I might even write a patch. I am unfamiliar with the hash function generation and the nuances there, so thats why I haven't jumped right into it.

@jaybaird
Copy link
Owner

I just pushed a new branch (set-operations) that adds union and intersection to BloomFilter. Take a look and see what you think. I'm still looking into adding the support into SBF, but that's a bit more of a Harder Problem.

I'm gonna get this code reviewed and checked into master ASAP.

Jay

@jaybaird
Copy link
Owner

Ok, this was kind of a junk patch. It requires that your bloom filters be the same capacity and error rate.

I'm working around that now. Sorry about that.

etrepum pushed a commit to etrepum/python-bloomfilter that referenced this issue Dec 26, 2012
…lter, not yet available in SBF. Bumped version number to 1.1
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants