[BUG] Memory Leak on Secondary Core Power Cycle #9005
Labels
* bug: Something isn't working as expected
* LNL: Applies to Lunar Lake platform
* MTL: Applies to Meteor Lake platform
* regression identified: Identified the commit or PR that introduced a regression
* Zephyr: Issues only observed with Zephyr integrated
Comments
tmleman added the bug, Zephyr, MTL, LNL, and regression identified labels on Apr 5, 2024
tmleman added a commit to tmleman/sof that referenced this issue on Apr 5, 2024:

This patch refines the initialization process for secondary cores in a multicore environment when using Zephyr as the RTOS. The patch introduces a `check_restore` function specifically for Zephyr, which checks if basic core structures (IDC, notifier, schedulers) have been previously allocated and are still present in memory, indicating that the system is not undergoing a cold boot.

By adding this check, the system avoids unnecessary re-allocation of these structures during the power-up sequence of secondary cores, effectively preventing the memory leak observed during repeated power cycle tests.

fix thesofproject#9005

Signed-off-by: Tomasz Leman <tomasz.m.leman@intel.com>
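The guard described in the commit message can be sketched roughly as follows. This is a minimal, self-contained model and not the actual SOF code: the `core_ctx` structure, the `alloc_count` counter, the placeholder struct bodies, and `secondary_core_init` are hypothetical names introduced for illustration. Only the name `check_restore` and the three kinds of structure (IDC, notifier, schedulers) come from the commit message. The sketch also assumes the context is zero-initialized on a cold boot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the per-core structures named in the commit
 * message; the real SOF types are not reproduced here. */
struct idc { int placeholder; };
struct notify { int placeholder; };
struct schedulers { int placeholder; };

/* Hypothetical per-core context that survives a core power-down.
 * Assumed to be zero-initialized on a cold boot. */
struct core_ctx {
	struct idc *idc;
	struct notify *notifier;
	struct schedulers *schedulers;
	unsigned int alloc_count; /* successful allocations so far */
};

/* True when all basic structures are still present, i.e. not a cold boot. */
static bool check_restore(const struct core_ctx *ctx)
{
	return ctx->idc && ctx->notifier && ctx->schedulers;
}

/* Power-up path: allocate on cold boot only, otherwise reuse.
 * Returns 0 on success, -1 if an allocation failed. */
static int secondary_core_init(struct core_ctx *ctx)
{
	assert(ctx != NULL);

	if (check_restore(ctx))
		return 0; /* warm power-up: reuse the existing structures */

	/* Cold boot: allocate everything from scratch. */
	ctx->idc = calloc(1, sizeof(*ctx->idc));
	ctx->notifier = calloc(1, sizeof(*ctx->notifier));
	ctx->schedulers = calloc(1, sizeof(*ctx->schedulers));

	if (!check_restore(ctx)) {
		/* Roll back a partial allocation so a retry starts clean. */
		free(ctx->idc);
		free(ctx->notifier);
		free(ctx->schedulers);
		ctx->idc = NULL;
		ctx->notifier = NULL;
		ctx->schedulers = NULL;
		return -1;
	}

	ctx->alloc_count += 3;
	return 0;
}
```

In this model, calling `secondary_core_init` once per simulated power-up leaves `alloc_count` at 3 no matter how many cycles run. Without the `check_restore` early return, the counter would grow by 3 on every cycle, which is the leak pattern the issue reports.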
Later events that referenced this issue, each carrying the commit message quoted above:
* tmleman added a commit to tmleman/sof that referenced this issue on Apr 8, 2024
* tmleman added a commit to tmleman/sof that referenced this issue on Apr 8, 2024
* tmleman added a commit to tmleman/sof that referenced this issue on Apr 9, 2024
* lgirdwood pushed a commit that referenced this issue on Apr 9, 2024 (the trailer reads "fix #9005")
* eddy1021 pushed a commit to eddy1021/sof that referenced this issue on Jul 15, 2024
Describe the bug
A regression has been detected in the power flow code, introduced by commit 5f1e690, causing a memory leak on multicore platforms using Zephyr as the RTOS. Each time a secondary core is powered up, memory is allocated for it again, and these allocations are not freed when the core is powered down. Available memory is gradually depleted until the DSP eventually raises a firmware exception notification. The problem was first identified on the Lunar Lake platform and is likely to affect other multicore platforms because they share the same power flow code.
To Reproduce
Steps to reproduce the behavior:
Reproduction Rate
The issue is reproducible 10/10 times when following the above manual sequence on the LunarLake platform. The reproduction rate is expected to be consistent across other multicore platforms using Zephyr as the RTOS.
Expected behavior
Upon powering down the secondary cores, the system should either release the resources allocated during the power-up phase or ensure that they are reused during the next power-up.
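Both acceptable behaviors can be illustrated with a small model. This is a sketch under stated assumptions, not SOF or Zephyr code: `core_state`, `core_power_up`, `core_power_down`, the `live` counter, and the 64-byte allocation size are all hypothetical. Power-down releases whatever was allocated, and power-up allocates only the slots that are missing, so the number of outstanding allocations stays bounded across cycles.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NUM_STRUCTS 3 /* models the IDC, notifier, and schedulers */

/* Hypothetical per-core bookkeeping, zero-initialized on cold boot. */
struct core_state {
	void *structs[NUM_STRUCTS]; /* per-core allocations */
	int live;                   /* allocations currently outstanding */
};

/* Power-up: allocate only the slots that are empty, reuse the rest.
 * Returns 0 on success, -1 if an allocation failed. */
static int core_power_up(struct core_state *s)
{
	assert(s != NULL);

	for (size_t i = 0; i < NUM_STRUCTS; i++) {
		if (s->structs[i])
			continue; /* still present from an earlier power-up */
		s->structs[i] = calloc(1, 64); /* 64 bytes is an arbitrary size */
		if (!s->structs[i])
			return -1;
		s->live++;
	}
	return 0;
}

/* Power-down: release everything the power-up path allocated. */
static void core_power_down(struct core_state *s)
{
	assert(s != NULL);

	for (size_t i = 0; i < NUM_STRUCTS; i++) {
		if (!s->structs[i])
			continue;
		free(s->structs[i]);
		s->structs[i] = NULL;
		s->live--;
	}
}
```

In this model `live` never exceeds `NUM_STRUCTS`, whether or not `core_power_down` runs between two power-ups. The leak described in the issue corresponds to a power-up path that allocates unconditionally combined with a power-down path that frees nothing.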
Impact
This memory leak is a critical issue that leads to resource exhaustion and potential DSP panic after an extended number of power cycles. It is a showstopper for the reliability and stability of the DSP on all affected multicore platforms.
Environment
* SOF: main
* Platform: LunarLake (and potentially all multicore platforms using Zephyr RTOS)