-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DocDB] Multiple heap profile calls can race when setting sampling frequency #19841
Closed
1 task done
Labels
Comments
SrivastavaAnubhav
added
area/docdb
YugabyteDB core features
priority/low
Low priority
labels
Nov 3, 2023
davixiao
pushed a commit
to davixiao/yugabyte-db
that referenced
this issue
Nov 27, 2023
Summary: Since commit 9554847, with Google TCMalloc enabled, the /pprof/heap UI endpoint records an allocation profile for the specified number of seconds at a specified sample frequency. The sample frequency is reset to the value at the start of the profile once the profile is complete. This can be problematic if two profiles are run concurrently. For example, 1. Sampling frequency starts at 1 MB 2. Go to `pprof/heap?seconds=10&sample_frequency_bytes=1000` and start profile 1 (sets sampling freq to 1000). 3. Go to `pprof/heap?seconds=1&sample_frequency_bytes=1000` which starts profile 2 (NB: some URL parameter must be different, otherwise it looks like we have some caching that serializes the calls). 4. Call 1 finishes and resets sampling frequency to 1 MB. 5. Call 2 finishes and resets sampling frequency to 1000. This diff prevents this by only allowing one heap profile to run at a time. This diff also removes existing usage of the phrase "allocation profile" in favor of "heap profile", which is used everywhere else in our code / documentation. Jira: DB-8778 Test Plan: `ybd --cxx-test server_pprof-path-handler_util-test --gtest_filter SamplingProfilerTest.OnlyOneHeapProfile`. Also tested manually by testing the steps listed above and verifying that we get the error message. Reviewers: mlillibridge Reviewed By: mlillibridge Subscribers: ybase, esheng, bogdan Tags: #jenkins-ready Differential Revision: https://phorge.dev.yugabyte.com/D29963
SrivastavaAnubhav
added a commit
that referenced
this issue
Nov 28, 2023
Summary: Original commit: 03e91df / D29963 Since commit 9554847, with Google TCMalloc enabled, the /pprof/heap UI endpoint records an allocation profile for the specified number of seconds at a specified sample frequency. The sample frequency is reset to the value at the start of the profile once the profile is complete. This can be problematic if two profiles are run concurrently. For example, 1. Sampling frequency starts at 1 MB 2. Go to `pprof/heap?seconds=10&sample_frequency_bytes=1000` and start profile 1 (sets sampling freq to 1000). 3. Go to `pprof/heap?seconds=1&sample_frequency_bytes=1000` which starts profile 2 (NB: some URL parameter must be different, otherwise it looks like we have some caching that serializes the calls). 4. Call 1 finishes and resets sampling frequency to 1 MB. 5. Call 2 finishes and resets sampling frequency to 1000. This diff prevents this by only allowing one heap profile to run at a time. This diff also removes existing usage of the phrase "allocation profile" in favor of "heap profile", which is used everywhere else in our code / documentation. Jira: DB-8778 Test Plan: `ybd --cxx-test server_pprof-path-handler_util-test --gtest_filter SamplingProfilerTest.OnlyOneHeapProfile`. Also tested manually by testing the steps listed above and verifying that we get the error message. Reviewers: mlillibridge Reviewed By: mlillibridge Subscribers: ybase, esheng, bogdan Tags: #jenkins-ready Differential Revision: https://phorge.dev.yugabyte.com/D30502
SrivastavaAnubhav
added a commit
that referenced
this issue
Nov 28, 2023
Summary: Original commit: 03e91df / D29963 Since commit 9554847, with Google TCMalloc enabled, the /pprof/heap UI endpoint records an allocation profile for the specified number of seconds at a specified sample frequency. The sample frequency is reset to the value at the start of the profile once the profile is complete. This can be problematic if two profiles are run concurrently. For example, 1. Sampling frequency starts at 1 MB 2. Go to `pprof/heap?seconds=10&sample_frequency_bytes=1000` and start profile 1 (sets sampling freq to 1000). 3. Go to `pprof/heap?seconds=1&sample_frequency_bytes=1000` which starts profile 2 (NB: some URL parameter must be different, otherwise it looks like we have some caching that serializes the calls). 4. Call 1 finishes and resets sampling frequency to 1 MB. 5. Call 2 finishes and resets sampling frequency to 1000. This diff prevents this by only allowing one heap profile to run at a time. This diff also removes existing usage of the phrase "allocation profile" in favor of "heap profile", which is used everywhere else in our code / documentation. Jira: DB-8778 Test Plan: `ybd --cxx-test server_pprof-path-handler_util-test --gtest_filter SamplingProfilerTest.OnlyOneHeapProfile`. Also tested manually by testing the steps listed above and verifying that we get the error message. Reviewers: mlillibridge Reviewed By: mlillibridge Subscribers: bogdan, esheng, ybase Tags: #jenkins-ready Differential Revision: https://phorge.dev.yugabyte.com/D30501
jharveysmith
pushed a commit
that referenced
this issue
May 24, 2024
Summary: Original commit: ee4624bb9964dc4202df40e3780e7439a73f6424 / D29963 Since commit 53117be, with Google TCMalloc enabled, the /pprof/heap UI endpoint records an allocation profile for the specified number of seconds at a specified sample frequency. The sample frequency is reset to the value at the start of the profile once the profile is complete. This can be problematic if two profiles are run concurrently. For example, 1. Sampling frequency starts at 1 MB 2. Go to `pprof/heap?seconds=10&sample_frequency_bytes=1000` and start profile 1 (sets sampling freq to 1000). 3. Go to `pprof/heap?seconds=1&sample_frequency_bytes=1000` which starts profile 2 (NB: some URL parameter must be different, otherwise it looks like we have some caching that serializes the calls). 4. Call 1 finishes and resets sampling frequency to 1 MB. 5. Call 2 finishes and resets sampling frequency to 1000. This diff prevents this by only allowing one heap profile to run at a time. This diff also removes existing usage of the phrase "allocation profile" in favor of "heap profile", which is used everywhere else in our code / documentation. Jira: DB-8778 Test Plan: `ybd --cxx-test server_pprof-path-handler_util-test --gtest_filter SamplingProfilerTest.OnlyOneHeapProfile`. Also tested manually by testing the steps listed above and verifying that we get the error message. Reviewers: mlillibridge Reviewed By: mlillibridge Subscribers: bogdan, esheng, ybase Tags: #jenkins-ready Differential Revision: https://phorge.dev.yugabyte.com/D30501
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
Jira Link: DB-8778
Description
Since commit 9554847, with Google TCMalloc enabled, the /pprof/heap UI endpoint records an allocation profile for the specified number of seconds at a specified sample frequency.
The sample frequency is reset to the value at the start of the profile once the profile is complete. This can be problematic if two profiles are run concurrently. For example,
pprof/heap?seconds=10&sample_frequency_bytes=1000
and start profile 1 (sets sampling freq to 1000).pprof/heap?seconds=1&sample_frequency_bytes=1000
which starts profile 2 (NB: some URL parameter must be different, otherwise it looks like we have some caching that serializes the calls).Issue Type
kind/bug
Warning: Please confirm that this issue does not contain any sensitive information
The text was updated successfully, but these errors were encountered: