-
Notifications
You must be signed in to change notification settings - Fork 323
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Downgrade hashing to SHA-1 and other optimizations #5791
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe there should be a single "digest" per each library written to a file in library root. When it exists, all the caches associated with the library shall include that "digest" or otherwise be considered invalid. Everyone who modifies the library shall either remove or recompute the "digest" in the library file.
@@ -2,7 +2,7 @@ | |||
|
|||
import com.oracle.truffle.api.TruffleFile; | |||
import com.oracle.truffle.api.TruffleLogger; | |||
import org.bouncycastle.jcajce.provider.digest.SHA3; | |||
import org.bouncycastle.jcajce.provider.digest.SHA1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we use bouncycastle and not the standard JVM implementation of the digest?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like the only other place where we use SHA3 is for Buffers. Those could also be downgraded, I believe.
This change downgrades hashing algorithm used in caching IR and library bindings to SHA-1. It is sufficient and significantly faster for the purpose of simple checksum we use it for. Additionally, don't calculate the digest for serialized bytes - if we get the expected object type then we are confident about the integrity. Don't initiallize Jackson's ObjectMapper for every metadata serialization/de-serialization. Initialization is very costly. Avoid unnecessary conversions between Scala and Java. Those back-and-forth `asScala` and `asJava` are pretty expensive. Finally fix a warning when generating library cache.
492bdf0
to
435b11c
Compare
Pull Request Description
This change downgrades hashing algorithm used in caching IR and library bindings to SHA-1. It is sufficient and significantly faster for the purpose of simple checksum we use it for.
Additionally, don't calculate the digest for serialized bytes - if we get the expected object type then we are confident about the integrity.
Don't initialize Jackson's ObjectMapper for every metadata serialization/de-serialization. Initialization is very costly.
Avoid unnecessary conversions between Scala and Java. Those back-and-forth
asScala
andasJava
are pretty expensive.Finally fix an SBT warning when generating library cache.
Closes #5763
Important Notes
The change cuts roughly 0.8-1s from the overall startup.
This change will certainly lead to invalidation of existing caches. It is advised to simply start with a clean slate.
Checklist
Please include the following checklist in your PR:
Scala,
Java,
and
Rust
style guides.
./run ide build
and./run ide watch
.