Stress test adding data to the tree #16
That's right. I also had this idea last night. We should also add tests that exercise all features of the library.
As @arlolra pointed out by mail, this is also about profiling.
Here is a list of missing tests (per package):
- merkletree: …
- crypto: …
- utils: …
In general, only the success cases are well tested; the branches where we handle errors are not. Just wanted to leave this here for discussion.
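For concreteness, a failure-case test could follow Go's usual table-driven pattern. This is a minimal sketch; the constructor and the `Set` signature are assumptions for illustration, not necessarily the real coniks-go API:

```go
package merkletree

import "testing"

// A sketch of a table-driven test for failure branches. The tree
// constructor and Set signature here are placeholders.
func TestSetFailureCases(t *testing.T) {
	cases := []struct {
		name  string
		key   string
		value []byte
	}{
		{"empty key", "", []byte("value")},
		{"nil value", "alice", nil},
	}
	for _, tc := range cases {
		tree, err := NewMerkleTree()
		if err != nil {
			t.Fatal(err)
		}
		// Each case should take an error branch rather than succeed.
		if err := tree.Set(tc.key, tc.value); err == nil {
			t.Errorf("%s: expected an error, got nil", tc.name)
		}
	}
}
```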
This seems like an issue to me. I think we should definitely be covering failure cases as well. This will be especially important if/when we create a common test suite. Another thing I wanted to open up for discussion: should we be using a code coverage measurement tool like Coveralls or Codecov.io? These are the main ones I've seen; I'm sure there are more out there.
Couldn't hurt, and goveralls seems pretty straightforward.
Cool! I've authorized Coveralls to access the coniks-sys repos. I'll go set up goveralls.
Good idea. I also thought about that. After the changes, the code coverage is already quite high (90-100%). I'm not necessarily a fan of aiming for 100% code coverage, though. But goveralls seems like a good idea anyway!
So I've set up goveralls. I wasn't able to get it to work without testing each package and subpackage individually (because goveralls doesn't support multiple Go packages), so now we just have to remember to add each new package to the YAML file.
Besides some tests, I've added benchmarks in #43 as a first indicator of whether something is slow (and needs to be profiled). It's still pretty much WIP, but feel free to have an early look into it and/or test it on your machine(s). Maybe the first benchmark (…)
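For readers following along, a benchmark in that style might look roughly like the sketch below. The PAD API shown (`Update`) and the `createPAD` helper are assumptions based on this discussion; see #43 for the actual code:

```go
package merkletree

import "testing"

// A rough sketch of a PAD update benchmark, not the code from #43.
// createPAD is a hypothetical helper that builds a PAD with n entries.
func BenchmarkPADUpdate(b *testing.B) {
	pad := createPAD(b, 100000)
	b.ResetTimer() // exclude the expensive setup from the measurement
	for i := 0; i < b.N; i++ {
		pad.Update(nil) // recompute the tree for the next epoch
	}
}
```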
Nice work @masomel
I think we can replace this test with tree clone tests, since this task must be performed in each epoch. I imagine the scenario would be creating an empty tree, then inserting a lot of records (e.g., 1000 per epoch), then cloning the tree, and repeating this a large number of times (see the sketch after this comment).
Maybe I didn't get your idea completely, but I think we do need to perform the lookup from memory, not only to retrieve the user's data blob but also to check whether the given user is in the tree (i.e., registered with the keyserver).
Great work!
I will give it a try on my machine.
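A sketch of the clone-per-epoch scenario described above; all names (`NewMerkleTree`, `Set`, `Clone`, `randomKeyValue`) are placeholders, not necessarily the real API:

```go
package merkletree

import "testing"

// Stress test sketch: insert a batch of records per epoch, then clone
// the tree, as a key server would at each epoch boundary.
func TestStressInsertAndClone(t *testing.T) {
	tree, err := NewMerkleTree()
	if err != nil {
		t.Fatal(err)
	}
	for epoch := 0; epoch < 100; epoch++ {
		for i := 0; i < 1000; i++ { // e.g., 1000 records per epoch
			key, value := randomKeyValue() // hypothetical helper
			if err := tree.Set(key, value); err != nil {
				t.Fatal(err)
			}
		}
		tree = tree.Clone() // subsequent epochs operate on the clone
	}
}
```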
Should we perform the evaluation as described in section 5 of the paper?
Sounds good.
Makes sense to me. Parts of the evaluation deal with network communication, though. In order for this to be realistic, we could also consider other means (DeterLab or similar testbeds).
I thought there must be a way to do a single lookup without holding the whole tree in memory (but I didn't think it through). Currently it seems like creating/querying/updating a large tree takes very long and consumes a lot of memory (even with quite modest tree sizes of only 100,000 entries; also, I don't have 64 GB of RAM as per section 5 ;-)). I'll investigate this further in the coming week. Maybe the benchmarks I'm doing aren't the best/right approach either.
I did some profiling (when benchmarking pad.Update in a tree with 100,000 entries). Update: I have to add that this makes sense, as most of the time is spent creating the 100K directory (CPU- and memory-wise), where the VRF operations and Digest are called often.
Update: CPU usage (update op. only): https://imagebin.ca/v/2q5DJGSE651d (all files can be downloaded as .svg files and viewed in a browser)
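One way to keep the expensive directory setup out of the profile is to wrap only the operation under investigation with Go's `runtime/pprof`. A minimal sketch, assuming the `pad.Update(nil)` call from the discussion above:

```go
package main

import (
	"os"
	"runtime/pprof"
)

// profileOp runs fn under the CPU profiler and writes the profile to
// path, so setup costs (e.g., building the 100K-entry directory) are
// excluded from the measurement.
func profileOp(path string, fn func()) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()
	if err := pprof.StartCPUProfile(f); err != nil {
		return err
	}
	defer pprof.StopCPUProfile()
	fn()
	return nil
}

// Usage (hypothetical): profileOp("update_cpu.prof", func() { pad.Update(nil) })
```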
I'll run some benchmarks on my lab machine, hopefully early this week and can report the results back to you, too. Really nice work on the benchmarks so far, @liamsi! Would be cool to post some results in a wiki at some point.
If you'd like, I could try to run some benchmarks on a (similar) setup to the one in the paper. We ran them on one of the servers in my group's cluster, and I could probably get that set up again. |
I think (if there aren't further suggestions about what should be benchmarked/profiled) this is ready to be reviewed in PR #43. My conclusions regarding the profiling: we should refactor the memory-intensive hashing (currently done by …)
@masomel I think it makes a lot of sense to compare both implementations, but this isn't very urgent, I guess.
Can you make it clearer? I think there is not much memory allocation here, except when we call …. P.S. This PR is great/cool work! Thanks @liamsi
I thought all data is collected and then hashed (…)
Using … We should also keep in mind that we want devs to be able to specify their own hashing scheme (see #50).
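To make the allocation concern concrete, here is a sketch (not the code from the hash_experiment branch) contrasting collecting all data into one buffer before hashing with streaming writes into the hash state. It uses `x/crypto/sha3` purely as an example; per #50 the actual scheme should remain pluggable:

```go
package main

import (
	"golang.org/x/crypto/sha3"
)

// hashCollected concatenates all chunks first, allocating one large
// intermediate buffer before hashing it in one call.
func hashCollected(chunks [][]byte) [32]byte {
	var buf []byte
	for _, c := range chunks {
		buf = append(buf, c...)
	}
	return sha3.Sum256(buf)
}

// hashStreaming writes each chunk directly into the hash state,
// avoiding the intermediate allocation.
func hashStreaming(chunks [][]byte) []byte {
	h := sha3.New256()
	for _, c := range chunks {
		h.Write(c)
	}
	return h.Sum(nil)
}
```

Comparing both variants under `go test -bench` should show whether the intermediate buffer is actually where the allocations come from.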
Pushed the changes from the above comment to branch hash_experiment in d128e43 (in case you want to reproduce the benchmark/comparison).
No, it's not particularly urgent, especially since the Java implementation we used for our benchmarks in the paper is now out of date. |
Closed by #54.
5ff85ad is an indication that we should have a test that generates a large number of key/value pairs to set and retrieve from the tree.
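A sketch of such a test; the constructor and method names (`NewMerkleTree`, `Set`, `Get`) are placeholders for the real API:

```go
package merkletree

import (
	"fmt"
	"testing"
)

// Generate many key/value pairs, insert them all, and verify each one
// can be retrieved again afterwards.
func TestStressSetAndGet(t *testing.T) {
	tree, err := NewMerkleTree()
	if err != nil {
		t.Fatal(err)
	}
	const n = 100000
	for i := 0; i < n; i++ {
		key := fmt.Sprintf("user%d", i)
		if err := tree.Set(key, []byte("blob")); err != nil {
			t.Fatal(err)
		}
	}
	for i := 0; i < n; i++ {
		key := fmt.Sprintf("user%d", i)
		if v := tree.Get(key); v == nil {
			t.Errorf("key %q not found after insertion", key)
		}
	}
}
```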