Skip to content

Fix the ICSR mask value used to enter WFE #8224

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 0 commits into from
Closed

Fix the ICSR mask value used to enter WFE #8224

wants to merge 0 commits into from

Conversation

TacoGrandeTX
Copy link
Contributor

Description

The current value of SCB_ICSR_RESERVED_BITS_MASK is not restrictive enough and will prevent the core from entering WFE if any interrupt is pending, even if it is not enabled. This can happen rather easily as one might lazily disable a low priority interrupt and not notice that the pending bit was set. This will forever prevent the core from sleeping. In fact this goes even deeper as architecturally an interrupt can go pending even if it is disabled in the NVIC.

We should just be checking against the VECTPENDING and VECTACTIVE bit fields in the Interrupt Control and State Register (ICSR). When VECTPENDING is non-zero it indicates which interrupt is pending and enabled. We don't want to enter WFE in an exception handler, so we also include VECTACTIVE in the bit mask.

This stems from #5647.

Pull request type

[x] Fix
[ ] Refactor
[ ] Target update
[ ] Functionality change
[ ] Breaking change

@TacoGrandeTX
Copy link
Contributor Author

TacoGrandeTX commented Sep 23, 2018

@RobMeades @NeilMacMullen This has the potential to reduce power consumption when the SoftDevice is disabled

@0xc0170 0xc0170 requested review from a team September 24, 2018 06:33
@mprse
Copy link
Contributor

mprse commented Sep 24, 2018

I have problem with tests-mbed_hal-sleep test on NRF52_DK which might be related to this issue/fix.

The test regularly pass CI, but fails if you run it locally mbed test -t GCC_ARM -m NRF52_DK -n tests-mbed_hal-sleep -v (83fc2e2). The difference between CI and local run is that CPU statistics are enabled on CI and if you run locally with enabled stats the test will pass. Probably this is because additional time which CPU needs to handle stats.

The sleep test case perform sleep for 100 us, 200 us, ... ,1000 us in loop (us ticker wakes the boards) and verifies if sleep time matches the assumption.
I got the following results:

sleep                 wake-up after
100 us                   ~100 us    ok
200 us                   ~200 us    ok
300 us                   ~300 us    ok
400 us                   ~400 us    ok
500 us                   ~14 us     (??)

In the failure case (500 us) us ticker interrupt has not been triggered yet and it looks like the board did not even enter sleep mode. This is because in the Interrupt Control and State Register ISRPENDING bit is set (0x00400000). I have no idea what could cause that there is an interrupt pending. The only reason I can think of is systick, but I have disabled OS in the test (osKernelSuspend();). @TacoGrandeTX Do you maybe have some hints?

I tried your fix and the the results are the same... In case of failure wake-up time is also ~14 us.

Patch can be found here: PR #8257

@TacoGrandeTX
Copy link
Contributor Author

@mprse - I will try to look at this today! Unfortunately I am out-of-office the rest of the week, so apologies if I have a delayed response as I have other items that need closure.

if (SCB->ICSR & (SCB_ICSR_RESERVED_BITS_MASK)) {
// Ok, there is an interrut pending, no need to go to sleep
// Test if we should enter WFE
if (SCB->ICSR & (SCB_ICSR_WFE_MASK)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the ICSR check needed? If an interrupt is pending shouldn't __WFE() just clear the event register and return immediately?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not confident enough to say that we don't need the ICSR check but I think it's prudent. At the minimum it prevents entry to sleep while in an exception handler (either accidentally or intentionally) by including VECTACTIVE in the mask.

@kjbracey
Copy link
Contributor

kjbracey commented Oct 4, 2018

Possibly dumb question (not really familiar with the nRF devices) - why aren't you just using WFI here?

@kjbracey
Copy link
Contributor

kjbracey commented Oct 4, 2018

As far as I can see, this is being complicated by the fact you're fighting WFE to use it for something it's not designed for. Roughly speaking WFE is for user-code spinlocks when interrupts are enabled, and WFI is for inside-the-OS power saving code when interrupts are disabled. I would expect a simple single "WFI" to work fine here, with no messing with SEV or the ICSR. Also wouldn't need to set SCR.SEVONPEND any more.

See FAQ http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka15473.html and my previous waffle at https://os.mbed.com/forum/mbed/topic/29547/?page=1#comment-55419

Also, from http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dai0321a/BIHICBGB.html and other sources, there should be a DSB before WFI (or WFE).

I see most targets are using WFI, but I do see a few using "SEV; WFE; WFE;" like this. Atmel_SAM_R4 apparently uses WFI for sleep and WFE for deep sleep!

@0xc0170 0xc0170 requested a review from a team October 4, 2018 15:59
@kjbracey
Copy link
Contributor

kjbracey commented Oct 5, 2018

Okay, I get it - this is a special case because core_util_enter_critical has been hijacked to just mask "application" interrupts, and PRIMASK is actually 0 on entry. WFI is not useable if PRIMASK is 0.

In that case, I think the sequence DSB; SEV; WFE; WFE should do fine to flush out a pending event, and wait for an actual interrupt. Testing the ICSR manually is not likely to be any faster than just executing the second WFE and letting it complete immediately.

Seems safe as long as you know you're single-CPU, so won't be waking any other cores with the SEV and you know no other cores are trying to wake you, and as long as the SoftDevice code only contacts you via the masked IRQs that you're catching via the SEVONPEND.

@TacoGrandeTX
Copy link
Contributor Author

TacoGrandeTX commented Oct 5, 2018

@kjbracey-arm - Correct - one of the subtle differences between WFE and WFI is the ability to wake up the core via SEVONPEND. It's possible that the core will wake up from WFE and not process the pended interrupt if it's not of high enough priority. That is regardless of the state of PRIMASK.

I agree that we should add the DSB and will do so. It's not usually required for Cortex-M4 cores but Nordic may have added an undocumented system level write buffer (see Section 4.14 here http://infocenter.arm.com/help/topic/com.arm.doc.dai0321a/DAI0321A_programming_guide_memory_barriers_for_m_profile.pdf), so I will add it for architectural correctness.

@kjbracey
Copy link
Contributor

kjbracey commented Oct 8, 2018

Getting off on a tangent, but another possible approach would be:

 void hal_sleep() {
 // enter with application IRQs disabled
 while (no application IRQs pending in ICSR)  {
      __disable_irq(); // for real, not mbed_enter_critical
      mbed_exit_critical(); // unmask the application IRQs
      DSB;
      WFI;
      mbed_enter_critical(); // mask the application IRQs
      __enable_irq();
      // might need some NOPpy stuff here to give softdevice IRQs time to run between back-to-back enable/disable.
      // Not sure what architecture says for minimum.
 } 
 }

That would avoid "waking" the application side and its RTOS every time there's a softdevice IRQ.

That's assuming a lot of stuff I don't know about your system, eg that softdevice does all its work from IRQs, and that it's worthwhile not waking the application side for softdevice.

@cmonr
Copy link
Contributor

cmonr commented Oct 9, 2018

@jrobeson Do you think you could comment on this proposed fix?

Copy link
Contributor

@cmonr cmonr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change looks good to me.
Love the deep technical back and forth.

@cmonr cmonr requested a review from a team October 9, 2018 19:14
@cmonr
Copy link
Contributor

cmonr commented Oct 9, 2018

Would like one more OK from @ARMmbed/mbed-os-core since this change is so deep.

@cmonr
Copy link
Contributor

cmonr commented Oct 9, 2018

@TacoGrandeTX Let us know if there's any pressure to get this in sooner than the next feature release.
Since it's a fix, it should be OK to bring in sooner, but I'd rather have this evaluated as a part of OOB coverage.

@ghost
Copy link

ghost commented Oct 9, 2018

@cmonr I'm the last person you want to get an OK from for this sort of thing. The incorrect ICSR value got uncovered while looking into a bug I filed, and then the bug was fixed, but the ICSR value issue was not addressed. After that, I simply filed a bug to make sure it would be.

@cmonr
Copy link
Contributor

cmonr commented Oct 9, 2018

@jrobeson Thanks for the honesty.

Then I'll just wait for someone else from @mbed-os-core before moving this PR along.

@deepikabhavnani
Copy link

Then I'll just wait for someone else from @mbed-os-core before moving this PR along.

@SenRamakri - Please have a look

@TacoGrandeTX
Copy link
Contributor Author

@cmonr There is no pressing need to get this released. I just completed local tests of adding the DSB instruction as suggested by @kjbracey-arm and have pushed that commit.

@kjbracey
Copy link
Contributor

Aside from my wittering about other larger approaches, I do agree with @c1728p9 that in this version the "if ICSR" check just shouldn't be there.

The "pending" check isn't needed - it's inherently racy, and the WFE does it for you.

The "active" check I don't really understand - why do you feel you need to be defensive against people calling hal_sleep inside an exception handler?

It's a weird thing to do, but not especially dangerous for this particular platform, is it? It's a valid request - "please block until a higher-priority interrupt would wake me".

Other platforms just do "SEV; WFE; WFE;", and I don't see why you shouldn't. (And only one DSB is needed, not one per WFE).

@TacoGrandeTX
Copy link
Contributor Author

@kjbracey-arm That's a good point so there is no point in being defensive here. I made the most conservative change that fixed the problem but didn't think we could exclude the check. As you and @c1728p9 think the ICSR check is not needed, I'll remove it and the second DSB.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants