-
Notifications
You must be signed in to change notification settings - Fork 851
Origin session cache mem leak fix #7883
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
maskit
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes around i2d_SSL_SESSION and exdata overwhelmed me. Can you tell me why we need to simplify those parts on this PR that is to fix memory leak?
ssl_session_cache_exdata is used for client sessions as well. Removing it from the both client session and origin session on a PR would make more sense than removing it only from origin session with a fix for memory leak.
I guess I could approve this if I was in a good mood but there was enough number of PRs for Origin Session Cache to make me picky.
| { | ||
| } | ||
|
|
||
| ~SSLOriginSession() { SSL_SESSION_free(session); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calling SSL_SESSION_free here is the fix, and other changes are not, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really, this is freeing the session when it's being replaced or removed from the cache, as we're not doing the i2d_SSL_SESSION conversion and returning the correct return value through the new session callback.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason for the leak is because when we do the i2d_SSL_SESSION for the server sessions, we always return 0 so the reference count for the actual OpenSSL's session object is decreased after returning from the callback, and we do some manual cleanups in our NetVC if we convert the session object back. But we seem to not do that for the origin sessions, and just rely on OpenSSL's own ref counter to clean up things.
|
@duke8253 is going to take a look at the comment above. |
|
As mentioned above, we didn't follow OpenSSL's documentation on how we handle the new session callback. I tried making the callback for origin sessions do the same thing as server sessions, but looks like we deal with origin session differently too so the memory leak still happens even if I the call back returns 0 all the time, seems like we rely on openssl's own freeing mechanism on the origin side but not the server side. That's why I removed the whole |
|
My understanding of Removing the serialization part is fine if it's unnecessary, but is it only possible for origin sessions?
Is this only true for origin sessions and we still need to pass it as a pointer to a structure for server sessions? |
Correct, but when I tried to do the same for server sessions, weird things starts to happen: the sessions can only be reused once, and if I remember correctly new leaks started to happen after doing that, I'll have to check my notes on that. Which is why I'm saying we are doing special treatments in NetVC I'm not aware of. So for now removing the serialization can only be applied to origin sessions.
This is true for both origin sessions and server sessions, I didn't touch it in the server sessions as I'm not sure what is causing the problem I mentioned above. I'll make a PR once I figure out what's causing the problem. |
maskit
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to see a bug fix and cleanup separately, but it's probably not worth leaving memory leak for long time.
After running #7808 in our production for a while, we noticed there are memory leakage. Turns out ATS isn't following OpenSSL's documentation on how we store sessions externally, namely our
new_session_cb(https://www.openssl.org/docs/manmaster/man3/SSL_CTX_sess_set_get_cb.html) always returns 0. So using the same method with the origin session cache caused the leak.Also, simplified the way we store ex_data, which is just a
curve_id, not sure why we created structssl_session_cache_exdatato just store this integer value.Mentioning #7479 to link all related PRs.