Skip to content

Conversation

@duke8253
Copy link
Contributor

@duke8253 duke8253 commented Nov 5, 2021

We've been seeing some high memory utilization after turning on origin session cache in production. After some investigation, I find that the cause is due connections (anywhere from 20~50) could be using the same session at the same time. And since the sessions are stored in a way such that when a new connections requests for a session, it allocates a new one for that connection (to prevent double freeing when connection closes, and avoid using OpenSSL's internal ref-counter to have less ambiguity), it uses a lot more memory than what I expected.

Some calculations here:
Session objects for origin sessions are limited to 4KB each, default cache pool size is 10240, the 50x multiplication factor increase the size from 40MB to 2GB. Not too big of a problem on hosts with lots of memory, but for some smaller ones it is bad news.

The changes in this PR uses shared pointer to share sessions among connections, this will help with high memory utilization. It also eliminates d2i_SSL_SESSION/i2d_SSL_SESSION conversions all together, so there's a slight performance improvement as well.

@duke8253 duke8253 added this to the 10-Dev milestone Nov 5, 2021
@duke8253 duke8253 self-assigned this Nov 5, 2021
@duke8253 duke8253 requested a review from bryancall as a code owner November 5, 2021 20:27

// Free the shared pointer we're holding right now to make sure the only
// reference is inside the "SSLOriginSession" object we just created
shared_sess.reset();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this. Isn't is a local variable that will go out of scope at the end of the method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left over code from some testing I was doing, just making sure everything is released where it's supposed so I can make sure the ref-count is correct.

@duke8253 duke8253 force-pushed the master-orig_sess_share branch 2 times, most recently from 23b53d5 to f8720f6 Compare November 8, 2021 16:50
@duke8253 duke8253 force-pushed the master-orig_sess_share branch from baa3cd1 to 42ebd81 Compare November 8, 2021 17:09
@bryancall bryancall added the TLS label Nov 9, 2021
@duke8253 duke8253 merged commit 664e80e into apache:master Nov 9, 2021
@maskit
Copy link
Member

maskit commented Nov 17, 2021

This doesn't compile with BoringSSL 😢 SSL_SESSION_dup is unavailable. We need either an equivalent implementation or the old way. SSL_SESSION_dup was added in OpenSSL 1.1.1. We may need it for OpenSSL 1.0.2 as well.

Also, the milestone is set to 10-Dev but it is merged to master. I think it should have been merged to 10-Dev branch if I understand the rule correctly.

@duke8253 duke8253 modified the milestones: 10-Dev, 10.0.0 Jan 25, 2022
zwoop pushed a commit that referenced this pull request Jan 25, 2022
@zwoop zwoop modified the milestones: 10.0.0, 9.2.0 Jan 25, 2022
@zwoop
Copy link
Contributor

zwoop commented Jan 25, 2022

Cherry-picked to v9.2.x

moonchen pushed a commit to moonchen/trafficserver that referenced this pull request Mar 17, 2022
* asf/9.2.x:
  Updated ChangeLog
  Add SSLSessionDup for older OpenSSL and BoringSSL (apache#8578)
  use shared pointer to help with high memory utilization (apache#8498)
  Commenting TSHttpTxnCacheLookupStatusGet need_to_revalidate (apache#8621)
  check size of session, and free sessions the ATS way (apache#8330)
  free sessions when timeout (apache#8356)
  Fix 32bit build failure on Odroid Xu-4 (apache#8626)
  TSHttpTxnCacheLookupStatusGet: call need_to_revalidate (apache#8617)
  SNIConfig (tunnel_route): Change the way we extract matched subgroups from the server name. (apache#8589)
  fix for collapsed forwarding ink_abort for CacheHitFresh fail (apache#8613)
  Do not turn off cache for internal requests (apache#8266)
  Rate Limit Plugin: Re-enable VConnection when SNI is empty (apache#8625)
  Removes hard dependency on having perl installed (apache#8611)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants