-
-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fulfill https://github.com/nim-lang/Nim/pull/14995#issuecomment-66491… #15104
Conversation
request. This can be conceived as an alternate, more capable resolution of nim-lang#12200 than nim-lang#12208 The code re-org idea here is to upgrade tablimpl.nim:`delImpl`/`delImplIdx` to abstract client code conventions for cell emptiness & cell hashing via three new template arguments - `makeEmpty`, `cellEmpty`, `cellHash` which all take a single integer argument and clear a cell, test if clear or produce the hash of the key stored at that index in `.data[]`. Then we update the 3 call sites (`Table`, `CountTable`, `SharedTable`) of `delImpl`/`delImplIdx` by defining define those arguments just before the first invocation as non-exported templates. Because `CountTable` does not save hash() outputs as `.hcode`, it needs a new tableimpl.nim:`delImplNoHCode` which simply in-lines the hash search when no `.hcode` field is available for "prefix compare" acceleration. It is conceivable this new template could be used by future variants, such as one optimized for integer keys where `hash()` and `==` are fast and `.hcode` is both wasted space & time (though a small change to interfaces there for a sentinel key meaning "empty" is needed for maximum efficiency). We also eliminate the old O(n) `proc remove(CountTable...)` in favor of simply invoking the new `delImpl*` templates and take care to correctly handle the case where `val` is either zero for non-existent keys in `inc` or evolves to zero over time in `[]=` or `inc`. The only user-visible changes from the +-42 delta here are speed, iteration order post deletes, and relaxing the `Positive` constraint on `val` in `proc inc` again, as indicated in the `changelog.md` entry.
While I think it's fairly self-evident from source code and algos, some like to see benchmarks. These are based on #14995 (comment) but reduced. import tables, times
template timeIt(label1, label2, value2, scale, body) =
var t0 = epochTime()
body
let dt = epochTime() - t0
echo label1, dt * 1e9 / scale, label2, value2
proc runtest(els: int) =
var ot = initCountTable[int](els)
timeIt("Create " & $(els div 1_000) & "Kitem table (ns/elt): ",
" len ", ot.len, float(els)):
for i in 0 ..< els: ot[i] = i
timeIt("Delete99% (ns/elt): ", " len ", ot.len, float(els)):
for i in 0 ..< els:
if i mod 100 != 0: ot.del(i)
for els in @[ 10_000, 50_000 ]: runtest(els) On an (Intel i7-6700k@4.7GHz, Linux, nim bbb250f, d:danger, etc.), I get these results.
After (manually rounded times to 0.1 ns):
So, we see deletes get 78735/28 =~ 2800x faster and 457257/14 =~ 33,000x faster. Obviously, since it's O(n) expected vs O(1) that could be made as big as you want. (It's 1.7x and 1.5x less than 5000x and 10,000x since branch prediction/pre-fetching/etc.) A careful eye might notice inserts getting faster across the PR, though no relevant code changed. I am not sure what that is, but suspect |
All tests pass. Also, if you would like we could put a 15-20 line explanatory note in front of |
More comments are always nice but can be done in a follow-up PR. I'm merging this already because it's so beautiful. :-) |
Ok. Thanks. :-) |
request. This can be conceived as an alternate, more capable resolution of #12200 than #12208 The code re-org idea here is to upgrade tablimpl.nim:`delImpl`/`delImplIdx` to abstract client code conventions for cell emptiness & cell hashing via three new template arguments - `makeEmpty`, `cellEmpty`, `cellHash` which all take a single integer argument and clear a cell, test if clear or produce the hash of the key stored at that index in `.data[]`. Then we update the 3 call sites (`Table`, `CountTable`, `SharedTable`) of `delImpl`/`delImplIdx` by defining define those arguments just before the first invocation as non-exported templates. Because `CountTable` does not save hash() outputs as `.hcode`, it needs a new tableimpl.nim:`delImplNoHCode` which simply in-lines the hash search when no `.hcode` field is available for "prefix compare" acceleration. It is conceivable this new template could be used by future variants, such as one optimized for integer keys where `hash()` and `==` are fast and `.hcode` is both wasted space & time (though a small change to interfaces there for a sentinel key meaning "empty" is needed for maximum efficiency). We also eliminate the old O(n) `proc remove(CountTable...)` in favor of simply invoking the new `delImpl*` templates and take care to correctly handle the case where `val` is either zero for non-existent keys in `inc` or evolves to zero over time in `[]=` or `inc`. The only user-visible changes from the +-42 delta here are speed, iteration order post deletes, and relaxing the `Positive` constraint on `val` in `proc inc` again, as indicated in the `changelog.md` entry. (cherry picked from commit b2a1944)
request. This can be conceived as an alternate, more capable resolution of nim-lang#12200 than nim-lang#12208 The code re-org idea here is to upgrade tablimpl.nim:`delImpl`/`delImplIdx` to abstract client code conventions for cell emptiness & cell hashing via three new template arguments - `makeEmpty`, `cellEmpty`, `cellHash` which all take a single integer argument and clear a cell, test if clear or produce the hash of the key stored at that index in `.data[]`. Then we update the 3 call sites (`Table`, `CountTable`, `SharedTable`) of `delImpl`/`delImplIdx` by defining define those arguments just before the first invocation as non-exported templates. Because `CountTable` does not save hash() outputs as `.hcode`, it needs a new tableimpl.nim:`delImplNoHCode` which simply in-lines the hash search when no `.hcode` field is available for "prefix compare" acceleration. It is conceivable this new template could be used by future variants, such as one optimized for integer keys where `hash()` and `==` are fast and `.hcode` is both wasted space & time (though a small change to interfaces there for a sentinel key meaning "empty" is needed for maximum efficiency). We also eliminate the old O(n) `proc remove(CountTable...)` in favor of simply invoking the new `delImpl*` templates and take care to correctly handle the case where `val` is either zero for non-existent keys in `inc` or evolves to zero over time in `[]=` or `inc`. The only user-visible changes from the +-42 delta here are speed, iteration order post deletes, and relaxing the `Positive` constraint on `val` in `proc inc` again, as indicated in the `changelog.md` entry.
(nim r tables; nim r sharedtables)
work. Seems ready for CI testing. (This replacesthe now defunct #15103)