-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
More performance improvements with nssdb #137
Conversation
Ok I nailed the second biggest performance issue. |
Eh, sounds like I broke something :-D |
Ok found a copy/paste bug in the reinit() function that caused the failures. |
This will allow to measure performance with some reliability. Signed-off-by: Simo Sorce <simo@redhat.com>
The Zeroize crate performance is crazy slow. The search for a performance issue with the nssdb storage via valgrind's callgrind revelead that 50% of the time in the HMAC code was spent on zeroization because of the way the Zeroize crate is built. Replace it everywhere with a simpler zeromem function that underneath simply calls OPENSSL_cleanse() on the mutable slice passed in. With this change alone the new test that performs 100 pbkdf2 calls went from ~9s to ~4.5s on my laptop. Signed-off-by: Simo Sorce <simo@redhat.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just nits.
The allocations and zip/iterations were causing another huge performance hit (at least in debug mode). Switch HMAC code to use slices instead, and recode the iterations in the init code to be simpler. Signed-off-by: Simo Sorce <simo@redhat.com>
Nits fixed, and tests pass. Merging. |
Using valgrind's callgrind tool I am analyzing the behavior of with the nssdb storage backend which shows particularly bad performance in pkcs11-provider's CI.
The first easy pick was to remove the Zeroize crate, which single-handedly accounted for a 50% of the time spent in pbkdf2 operations.
The second easy pick was to rewrite the HAM initialization to use fixed slices instead of vectors for the internal state.
The nssdb is still much slower than the others, due to higher use of the pbkdf2 function, yet nssdb uses the same and it is much faster.
Some playing around shows the "native" version is about 20/30 % slower than the OpenSSL one we use in fips mode ... but there is definitely more work to do to improve performance here.