@fschrempf
Contributor

@fschrempf fschrempf commented Dec 17, 2025

In order to reduce the power consumption of NRF52 repeaters, this implements CPU idling by suspending the main task that runs the Arduino loop().

During the idle interval the CLI and the UI (if available) will be unresponsive. The RF module will be kept in RX mode and upon receiving a packet, the interrupt will cause the processing loop to continue immediately before going back to idle after all outgoing packets have been transferred.

On a RAK4631 repeater this can reduce the power consumption during RX mode from around 12 mA to around 7.5 mA.
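
To put the quoted figures in perspective, here is a small back-of-the-envelope sketch. The 3000 mAh battery capacity is purely an assumption for illustration, and the estimate ignores TX bursts, converter losses, and self-discharge. With those caveats, going from 12 mA to 7.5 mA is a 37.5% reduction and would stretch roughly 250 hours of RX-only runtime to roughly 400 hours.

```cpp
#include <cassert>
#include <cmath>

// Relative reduction in average current, as a fraction between 0.0 and 1.0.
constexpr double currentReduction(double before_mA, double after_mA) {
  return (before_mA - after_mA) / before_mA;
}

// Idealized runtime in hours: capacity divided by average current.
// Ignores TX bursts, regulator losses, and battery self-discharge.
constexpr double runtimeHours(double capacity_mAh, double average_mA) {
  return capacity_mAh / average_mA;
}

// ASSUMPTION: a 3000 mAh cell, chosen only to make the numbers concrete.
constexpr double kAssumedCapacity_mAh = 3000.0;
```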

Latest build artifacts from CI (updated 01/13/26): https://github.com/fschrempf/MeshCore/actions/runs/20971258478/artifacts/5118537314

Feedback, tests, reviews and questions welcome!

@IoTThinks

I intended to do the same before.
But it is quite hard to power down NRF52 and be woken up by RX events. The power consumption will be around 6mA.
NRF52 does not handle Rising High well.
Your PR looks complex. Maybe due to this issue.

So I ended up just doing a power-saving trick to keep the power down to 8.5mA with only a few code changes.
The MCU is still running.

Hope we can push the power down more for NRF52.

@fschrempf
Contributor Author

fschrempf commented Dec 17, 2025

> But it is quite hard to power down NRF52 and be woken up by RX events.

Yes, there is no real sleep mode for NRF52, so you either put the CPU in idle to save power or shut it down completely. The latter requires a full reinit at wakeup.

> The power consumption will be around 6mA.

In which case? With the MCU shut down and only the RF module running in RX mode? That would be in that ballpark, yes.

> NRF52 does not handle Rising High well.

Sorry, I don't get what you mean here.

> Your PR looks complex. Maybe due to this issue.

Actually it doesn't look complex to me at all. It's pretty straightforward. Instead of letting the CPU run continuously, it stops it and resumes it after the idle interval is over or a packet is received. What exactly looks complex to you?

> So I ended up just doing a power-saving trick to keep the power down to 8.5mA with only a few code changes.

What "trick" would that be? Do you mean waitForEvent() here? It doesn't work for me. I'm still seeing around 14 mA @ 3.3V with your PowerSaving07 branch.

And if it worked, it would do the same thing: put the CPU in idle, right? My approach does it in a platform-agnostic way, which is better IMHO.

@ngavars
Contributor

ngavars commented Dec 17, 2025

This feature has been long overdue. I tested it a bit with a Heltec T114 (no screen) repeater. With stock FW I am seeing around 12 mA idle current on my cheap USB power meter. That is actually quite good: it used to be around 17 mA with stock FW not so long ago.

Then I flashed the FW from this branch and immediately went for "set idle.interval 10000". Now my cheap USB meter shows the current at 0 mA, occasionally jumping briefly to 10 mA while idling. If I send a message from my companion, the repeater wakes, repeats, and then goes back to idling. The 0 mA reading is probably just an artifact of my USB meter. I will try some additional tests tomorrow, but from what I see so far, the power consumption at idle is noticeably lower and the repeater still seems to be repeating messages, responding to admin commands, etc.

What is the intended reasonable idle interval? Something like 1 to 3 seconds?

@IoTThinks

Maybe I will add the CLI for ESP32-based boards.

Should we share the same CLI command to enable/disable power saving for both ESP32 and NRF52?

Like set powersave 1?

In your PR, you set the idle period? How about the wake-up period?

@IoTThinks

@ngavars You have to measure the current at the battery cable with a power meter.

Over USB-C, the board has to power unnecessary components such as the UART chip and LEDs.

@fschrempf
Contributor Author

> Maybe I will add the CLI for ESP32-based boards.
> Should we share the same CLI command to enable/disable power saving for both ESP32 and NRF52?

In my opinion we should use the same approach for ESP32 and NRF52 altogether. You should be able to take my generic implementation and simply add the board.sleep() implementation for ESP32 to be called before the loop is halted.

> Like set powersave 1?

What would that be good for? A single parameter that sets the length of the sleep/idle interval is enough. If set to zero (default) there is no change compared to the current implementation in the main branch. Repeater admins can then decide if they want to save power by increasing the interval.

> In your PR, you set the idle period? How about the wake-up period?

The idle interval is the time the CPU is idling/sleeping. The wake interval is hardcoded: five seconds normally, or three minutes after a reboot or after CLI activity. I don't think this needs to be a parameter for the user to change.
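
A minimal sketch of how these timing rules could fit together, assuming the reading that the awake window is five seconds normally and three minutes after a reboot or CLI activity. `IdleScheduler`, its members, and the constants are hypothetical names for illustration and are not taken from the PR.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constants mirroring the values mentioned in the comment above.
constexpr uint32_t kWakeIntervalMs     = 5UL * 1000UL;        // normal awake window
constexpr uint32_t kWakeIntervalLongMs = 3UL * 60UL * 1000UL; // after reboot or CLI activity

struct IdleScheduler {
  uint32_t idle_interval_s = 0;                  // 0 = idling disabled (the default)
  uint32_t awake_until_ms = kWakeIntervalLongMs; // boot is assumed to happen at t = 0

  // Only ever pushes the deadline further out (wrap-safe comparison).
  void extendAwake(uint32_t deadline_ms) {
    if (static_cast<int32_t>(deadline_ms - awake_until_ms) > 0) awake_until_ms = deadline_ms;
  }
  void onWake(uint32_t now_ms)        { extendAwake(now_ms + kWakeIntervalMs); }
  void onCliActivity(uint32_t now_ms) { extendAwake(now_ms + kWakeIntervalLongMs); }

  // Returns how long to idle in milliseconds, or 0 to keep the loop running.
  uint64_t idleForMs(uint32_t now_ms, bool outbound_pending) const {
    if (idle_interval_s == 0 || outbound_pending) return 0;
    if (static_cast<int32_t>(now_ms - awake_until_ms) < 0) return 0; // still in awake window
    return static_cast<uint64_t>(idle_interval_s) * 1000ULL;
  }
};
```

With `idle_interval_s` left at zero, `idleForMs()` always returns 0, which matches the statement that the default changes nothing compared to the main branch.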

@fschrempf
Contributor Author

> What is the intended reasonable idle interval? Something like 1 to 3 seconds?

The idle.interval setting is in seconds, so 10000 will give you around 2.8 hours. I'm currently running my test repeater with 1800 (30 minutes), but I'm not yet sure what would be a good value. If you have the value set very high, the automatic adverts of the repeater might be delayed accordingly (until the next wakeup happens). Apart from that I currently don't see any other downsides.
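
Since the unit is easy to get wrong, here is a tiny helper that spells out a value in seconds as hours, minutes, and seconds. `formatInterval` is a hypothetical name and not part of the firmware CLI.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Formats an interval given in seconds as "<h>h <m>m <s>s".
std::string formatInterval(uint32_t seconds) {
  const uint32_t h = seconds / 3600;
  const uint32_t m = (seconds % 3600) / 60;
  const uint32_t s = seconds % 60;
  return std::to_string(h) + "h " + std::to_string(m) + "m " + std::to_string(s) + "s";
}
```

For example, 10000 seconds comes out as 2h 46m 40s (about 2.8 hours), and 1800 seconds as 0h 30m 0s.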

@fschrempf
Contributor Author

@ngavars Thanks for testing by the way!

@mtlynch mtlynch left a comment

I'm just a user sharing my thoughts and have no authority in this project.

@fschrempf
Contributor Author

fschrempf commented Dec 19, 2025

> I'm just a user sharing my thoughts and have no authority in this project.

Thanks for your review anyway! Very much appreciated!

@fschrempf fschrempf marked this pull request as ready for review December 19, 2025 09:31

@fschrempf
Contributor Author

I've had this running on my test repeater (RAK4630) for 3 days now and I can't see any negative or unexpected results. I currently do not have any ESP32 hardware to test, so if someone would be willing to help out that would be very much appreciated! 😃

@SaschaKt

> I've had this running on my test repeater (RAK4630) for 3 days now and I can't see any negative or unexpected results. I currently do not have any ESP32 hardware to test, so if someone would be willing to help out that would be very much appreciated! 😃

Can you attach or upload a bin file for the v3 somewhere for testing?

@fschrempf
Contributor Author

fschrempf commented Dec 22, 2025

@SaschaKt You can find an archive with all repeater firmwares built by the GitHub CI here: https://github.com/fschrempf/MeshCore/actions/runs/20347701758/artifacts/4916433935

@SaschaKt

SaschaKt commented Dec 22, 2025

> @SaschaKt You can find an archive with all repeater firmwares built by the GitHub CI here: https://github.com/fschrempf/MeshCore/actions/runs/20347701758/artifacts/4916433935

I flashed it. Sorry for giving feedback after only a few seconds. For the v3, the power saving is not as effective as IoTThinks' solution. I'm measuring over USB-C, and the consumption goes down from 48mA to 39mA with your solution; with IoTThinks' solution it goes down to 13mA, also measured over USB-C. Measured at the battery input pins it would be a few mA less. I don't think combining both solutions will save more power, because I think light sleep on ESPs, as used in IoTThinks' solution, also reduces CPU activity.

@fschrempf
Contributor Author

@SaschaKt Thanks for testing! This is the expected behavior. This PR does not (yet) include the code for ESP32 sleep. And yes, the combination of both solutions won't give more power savings for ESP32 than #1107.

The reason for my PR is that it provides a generic approach that is also applicable for other platforms than ESP32 while still providing the possibility to be extended by platform-specific sleep.

If your test shows a slightly reduced power consumption and the repeater still works fine that's a success.

@SaschaKt

> @SaschaKt Thanks for testing! This is the expected behavior. This PR does not (yet) include the code for ESP32 sleep. And yes, the combination of both solutions won't give more power savings for ESP32 than #1107.
>
> The reason for my PR is that it provides a generic approach that is also applicable for other platforms than ESP32 while still providing the possibility to be extended by platform-specific sleep.
>
> If your test shows a slightly reduced power consumption and the repeater still works fine that's a success.

It works. I could get into the CLI as normal, start OTA, and flash the IoTThinks variant again. Now it's at 13mA again. 39mA is still too much. nRF chips have power saving enabled internally by default and already sit at a low consumption level; tweaking may save a few more mA. But ESPs need light sleep enabled to be useful as a repeater. The advantage of an nRF is that even with BT enabled, the power consumption is not noticeably higher than without. Where is the problem with having two power routines, one for ESPs and one for nRFs? IoTThinks also made power savings for nRF, which reduce consumption from 12.5 to 8.5mA; that's about 32%.

@fschrempf
Contributor Author

> Where is the problem with having two power routines, one for ESPs and one for nRFs?

@SaschaKt I think we are talking past each other. There is no problem here. What I want to achieve is a two step solution:

  1. A platform-independent solution (with some power savings) with idling the CPU (without any sleep) that works in all cases and also for any future boards and platforms.
  2. Additionally platform-specific sleep functions (with more power savings) if available.

Apart from the differences mentioned in #1107 (review) this is purely a strategic difference from what #1107 does.
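
A rough sketch of what this two-step layering could look like. The class layout, `idle()`, `haltLoop()`, and the `sleep()` signature are hypothetical; only the idea of a `board.sleep()` hook called before the loop is halted comes from this thread. On real hardware the platform hook may itself block, which this host-side model does not capture.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical class layout for illustration; not the actual MeshCore code.
class Board {
public:
  virtual ~Board() = default;

  // Step 2 hook: platform-specific low-power entry, called before the loop
  // is halted. The default does nothing, so new platforms work unchanged.
  virtual void sleep(uint32_t /*seconds*/) {}

  // Step 1: generic path shared by all platforms.
  void idle(uint32_t seconds) {
    if (seconds == 0) return;  // idle interval 0 = feature disabled
    sleep(seconds);
    haltLoop(seconds);
  }

  std::vector<std::string> trace;  // records the calls made, for the example

private:
  // Stand-in for blocking the loop task until an RX interrupt or a timeout.
  void haltLoop(uint32_t seconds) {
    trace.push_back("halt loop " + std::to_string(seconds));
  }
};

// A platform that adds its own sleep step on top of the generic idling.
class Esp32LikeBoard : public Board {
public:
  void sleep(uint32_t seconds) override {
    trace.push_back("platform sleep " + std::to_string(seconds));
  }
};
```

A board without an override only gets the generic step, while a board with an override gets both, in that order.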

@SaschaKt

> > Where is the problem with having two power routines, one for ESPs and one for nRFs?
>
> @SaschaKt I think we are talking past each other. There is no problem here. What I want to achieve is a two step solution:
>
>   1. A platform-independent solution (with some power savings) with idling the CPU (without any sleep) that works in all cases and also for any future boards and platforms.
>   2. Additionally platform-specific sleep functions (with more power savings) if available.
>
> Apart from the differences mentioned in #1107 (review) this is purely a strategic difference from what #1107 does.

You're right. I can confirm that with the v3 the power consumption goes from 48mA to 39mA. That's an improvement. I hope it will not interfere with hardware-specific power saving on top, like the one IoTThinks has made. This has to be tested.

@ngavars
Contributor

ngavars commented Dec 23, 2025

I did some additional testing with power profiler and Heltec V3 board on stock firmware and on your firmware. Both measurements were captured after the initial boot/advert sequence, when the board starts idling. Heltec was powered through its battery connector with voltage set at 3307 mV.

With stock firmware the avg current is 45 mA.
With power option firmware the avg current is also around 45 mA until approx. 3 minute mark, when it drops down to around 35 mA.

With stock firmware (latest from web flasher):
[image]

With power option firmware, idle.interval=1800:
[image]

Hope this helps. I can also test it with Promicro board if you want.

@4np 4np left a comment

Thanks for working on this @fschrempf 👍 It would be good to see some more power saving for my Solar RAK19003 node :)

Having said that, I wonder if it would be better if the idle interval were computed automatically rather than using a fixed duration. For example, in the Netherlands the mesh has grown exponentially. Where 2 months ago the mesh saw some activity, now I see hundreds of daily messages (let alone adverts) and (some parts of) Belgium and Germany have been connected. I wouldn't be surprised to see more of Germany and France starting to appear. An idle interval that worked a couple of weeks ago (maybe even days ago) may not work as well today.

Additionally, if CPU idling is triggered, how would one be able to enter remote management mode? IMHO when trying to use remote management or fetching telemetry, idling should be cancelled and the idle timer should be reset.

_prefs.advert_loc_policy = ADVERT_LOC_PREFS;

_prefs.adc_multiplier = 0.0f; // 0.0f means use default board multiplier
_prefs.idle_interval = 0;

@4np 4np Dec 30, 2025

Perhaps add a comment to better explain what this means:

Suggested change
- _prefs.idle_interval = 0;
+ // CPU Idle Interval (in seconds).
+ //
+ // If enabled (> 0), the CPU will start idling if there has not been any radio activity for
+ // the specified number of seconds.
+ //
+ // Note: When the CPU is idling, remote management and / or UI will cease to function!
+ _prefs.idle_interval = 0;

Or shorter:

Suggested change
- _prefs.idle_interval = 0;
+ _prefs.idle_interval = 0; // CPU will idle when no RF activity during this interval (in sec, 0 = disabled)

@ngavars
Contributor

ngavars commented Dec 30, 2025

> Additionally, if CPU idling is triggered, how would one be able to enter remote management mode? IMHO when trying to use remote management or fetching telemetry, idling should be cancelled and the idle timer should be reset.

It will wake up on any RX event. Remote management works nicely, at least in all my tests so far.

@ngavars
Contributor

ngavars commented Dec 30, 2025

I did some more testing with Xiao NRF52840. Here's a comparison of stock firmware (dev branch) and power option firmware (idle.interval=1800). In both cases the board is idling for 1 minute and there are no radio events.

Stock FW:
[image]

idle.interval=1800:
[image]

@fschrempf
Contributor Author

> An idle interval that worked a couple of weeks ago (maybe even days ago) may not work as well today.

@4np How is the idle interval related to the mesh activity? During idle the repeater will still respond to incoming packets. Only outgoing packets will be left waiting until the timeout expires.

@fschrempf
Contributor Author

> I did some more testing with Xiao NRF52840.

Thanks for the additional tests. This corresponds pretty nicely to the results of my own tests on RAK4631.

@fschrempf
Contributor Author

fschrempf commented Jan 9, 2026

@4np One more thing: Do you use any AI tools for your reviews and code snippets? If yes, please make sure to mention this in the future. I want to know when I'm dealing with feedback generated by an LLM. If this is not the case, just ignore my comment. Thanks!

@fschrempf fschrempf marked this pull request as ready for review January 9, 2026 20:44

@4np

4np commented Jan 10, 2026

@fschrempf, actually, I wrote these. The one with the bullet points I wrote, but I did ask ChatGPT to rewrite what I wrote and summarize it more consistently.

With these fixes, I don't think it is necessary to explicitly check the DIO1 pin or introduce any other workarounds.

It could indeed very well be that the many timers were the cause of the unresponsiveness. Good to hear you have good results :)

I'll give the new firmware build a go.

@fschrempf
Contributor Author

@4np Thanks for clarifying!

@4np

4np commented Jan 11, 2026

@fschrempf I am sorry to report that my RAK19007 / RAK4631 is again unresponsive so it looks like your latest changes did not solve the issue.

Battery was at 99% when I flashed the firmware yesterday evening, this morning it's again unresponsive.

@fschrempf
Contributor Author

@4np Thanks for testing! That's strange. On my end with the old version the problem occurred after a few hours or a day at the latest. With the new version the repeater has worked fine for four days in a row now.

I will try to do some further investigations.

@4np

4np commented Jan 11, 2026

@fschrempf I applied a different implementation checking for DIO1 == LOW, and for now I only implemented it for the RAK4631 variant. I am testing it at the moment on my repeater to see if it fixes the unresponsiveness for me. This is the patch with the changes I made.

Note: this is the RAK4631 repeater build: RAK_4631_repeater-adhoc-202601111311-firmware.zip

@4np

4np commented Jan 12, 2026

@fschrempf: So far so good, 19h uptime without stalls... It looks like sleeping when DIO1 is HIGH is the cause of the unresponsiveness.

As this is related to SX126x, I wonder if the same deadlock may happen in the ESP32 implementation or if it's unique to NRF52?
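
As a hedged illustration, the guard conditions mentioned in this thread can be collected into one predicate: the DIO1 check, the radio state machine being in receive mode, and a sleep inhibit for tasks such as OTA. One plausible reason for the DIO1 check is that an edge-triggered interrupt line that is already high will not produce a new rising edge to wake the loop. All names below are hypothetical and this is not the code from the patch.

```cpp
#include <cassert>

// Hypothetical snapshot of the inputs relevant to the "may we sleep?" decision.
struct SleepInputs {
  bool radio_in_rx;      // radio state machine is in receive mode (no packet in flight)
  bool dio1_high;        // SX126x DIO1 line is currently asserted
  bool sleep_inhibited;  // e.g. an OTA update is in progress
};

// Sleeping is only considered safe when nothing is pending:
// - the radio is back in RX mode,
// - DIO1 is LOW, so the next packet still produces a fresh edge,
// - no critical task has asked to stay awake.
constexpr bool maySleep(const SleepInputs& in) {
  return in.radio_in_rx && !in.dio1_high && !in.sleep_inhibited;
}
```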

> With the new version the repeater has worked fine for four days in a row now.

The mesh here is pretty crowded and busy; maybe it's quieter on your end?

There is no reason to not use the reset pin as the RAK4630/31 module
has it connected internally.

Signed-off-by: Frieder Schrempf <frieder@fris.de>

This makes the code easier to read and allows for easier changing of
the hardcoded values.

Signed-off-by: Frieder Schrempf <frieder@fris.de>

When a CLI command is issued through the serial interface, extend the
timeout for going to sleep to give the user more time for issuing more
commands.

Signed-off-by: Frieder Schrempf <frieder@fris.de>

This can be used to prevent sleeping during critical tasks like OTA
update.

Signed-off-by: Frieder Schrempf <frieder@fris.de>

In order to not interrupt the OTA update, prevent any sleep requests
after the "start ota" command has been activated.

Signed-off-by: Frieder Schrempf <frieder@fris.de>

If the radio driver state machine is not in receive mode, this means
that processing of a packet is still in progress and we are not in an
idle state.

Signed-off-by: Frieder Schrempf <frieder@fris.de>

@fschrempf
Contributor Author

fschrempf commented Jan 12, 2026

So I've been digging into the deadlock issue a bit more and got a pretty clear picture of what is happening and how it can be prevented.

1. suspendLoop() and resumeLoop() in NRF52 Arduino Core

The code uses the FreeRTOS task API to suspend and resume the task that runs the main loop.

```cpp
void suspendLoop(void)
{
  vTaskSuspend(_loopHandle);
}

void resumeLoop(void)
{
  if ( isInISR() )
  {
    xTaskResumeFromISR(_loopHandle);
  }
  else
  {
    vTaskResume(_loopHandle);
  }
}
```

This works fine in general, but brings in some issues, especially the one mentioned in the docs:

> xTaskResumeFromISR() is generally considered a dangerous function because its actions are not latched. For this reason it should definitely not be used to synchronise a task with an interrupt if there is a chance that the interrupt could arrive prior to the task being suspended, and therefore the interrupt being lost.

2. Triggering the deadlock

When going to sleep, we call suspendLoop() from within the loop task that gets suspended. resumeLoop(), on the other hand, is called fully asynchronously by the ISRs. Calling resumeLoop() when the task is not yet suspended, or is currently in the process of being suspended, apparently causes the lockup. Unfortunately, we can't make it 100% safe unless we disable the interrupts for a certain amount of time, which risks missing incoming packets. Trying to sync the suspend in the loop and the resume in the ISR always bears the risk of leaving a small time window in which the triggering of the ISR would cause a lockup.
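
To make the "not latched" point concrete, here is a deliberately simplified, single-threaded model of the ordering problem. It only shows the lost-wakeup effect described in the FreeRTOS docs quote, not necessarily the exact failure seen on hardware. Both structs are hypothetical stand-ins and do not call any FreeRTOS API.

```cpp
#include <cassert>

// Models suspend/resume: a resume is only meaningful if the task is already
// suspended. A resume that arrives early leaves no trace.
struct SuspendResumeModel {
  bool suspended = false;
  void resumeFromIsr() { suspended = false; }  // nothing to remember if not suspended
  void suspend()       { suspended = true; }
};

// Models a binary semaphore: a give is remembered until someone takes it.
struct BinarySemaphoreModel {
  bool available = false;
  void giveFromIsr() { available = true; }
  bool tryTake() {
    if (!available) return false;  // would block here
    available = false;
    return true;
  }
};

// The interrupt fires just BEFORE the task goes to sleep.
bool staysAsleepWithSuspendResume() {
  SuspendResumeModel task;
  task.resumeFromIsr();   // early interrupt: has no effect
  task.suspend();         // task now sleeps although an event is pending
  return task.suspended;
}

bool staysAsleepWithSemaphore() {
  BinarySemaphoreModel sem;
  sem.giveFromIsr();      // early interrupt: latched in the semaphore
  return !sem.tryTake();  // take succeeds at once, so the task does not block
}
```

In the suspend/resume model the early interrupt is lost and the task stays asleep; in the semaphore model the pending event is consumed and the task carries on.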

3. Solution using FreeRTOS API

We could maybe guard resumeLoop() with a check that the loop task is actually suspended. As that would require using the FreeRTOS API anyway, it seems much better to go for a more elegant solution that is also recommended by the FreeRTOS docs. Using a direct task notification is efficient and fast according to the docs.

Edit: Using a direct task notification is not possible as we need the task handle of the loop task which is declared static in the core and the API function (xTaskGetHandle()) is disabled in FreeRTOSConfig.h. A good alternative is a binary semaphore.

It works like this:

  1. Initializing a binary semaphore and "taking" it once in the setup code
  2. Blocking the loop task by taking the semaphore a second time in sleep()
  3. Resuming the loop by "giving" the semaphore from the radio ISR
  4. We don't need a wakeup timer as we can use the timeout parameter in xSemaphoreTake()

Yes, we use the FreeRTOS API directly instead of the higher level Arduino core functions. But I think that's acceptable as it should provide a reliable and working solution.
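
The four steps can be modelled on a host machine with a small stand-in for a binary semaphore. This is a sketch under assumptions, not the PR's implementation: the `BinarySemaphore` class and `sleepUntilIrqOrTimeout()` are hypothetical, built on the C++ standard library. On the target, the corresponding FreeRTOS calls would be xSemaphoreTake() in the loop task and xSemaphoreGiveFromISR() in the radio ISR.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Host-side stand-in for a FreeRTOS binary semaphore (hypothetical class).
class BinarySemaphore {
public:
  // Starts empty, which corresponds to the state after step 1.
  BinarySemaphore() = default;

  // Step 3: called from the "ISR" to wake the loop.
  void give() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      available_ = true;
    }
    cv_.notify_one();
  }

  // Steps 2 and 4: block until given or until the timeout expires.
  // Returns true if the semaphore was taken, false on timeout.
  bool take(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lock(mutex_);
    if (!cv_.wait_for(lock, timeout, [this] { return available_; })) return false;
    available_ = false;
    return true;
  }

private:
  std::mutex mutex_;
  std::condition_variable cv_;
  bool available_ = false;
};

enum class WakeReason { RadioIrq, Timeout };

// Blocks the caller for at most `idle`; no separate wakeup timer is needed.
WakeReason sleepUntilIrqOrTimeout(BinarySemaphore& sem, std::chrono::milliseconds idle) {
  return sem.take(idle) ? WakeReason::RadioIrq : WakeReason::Timeout;
}
```

Because a give is remembered, an interrupt that arrives just before the take makes the take return immediately instead of being lost.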

The ISR will be used for other purposes than just setting a flag,
rename it accordingly.

Signed-off-by: Frieder Schrempf <frieder@fris.de>

STATE_TX_WAIT is meant to be a single bit but is defined as two bits,
therefore overlapping with STATE_RX. This causes incorrect results
when the code uses the bits to handle the state machine. Fix this.

Signed-off-by: Frieder Schrempf <frieder@fris.de>

In some cases it is useful to let the board driver know when the radio
issues an RX interrupt. This can be used to wake up the board from a
low power state asynchronously for example.

Signed-off-by: Frieder Schrempf <frieder@fris.de>

@fschrempf
Contributor Author

fschrempf commented Jan 12, 2026

I rebased the PR and implemented the following changes compared to the previous version:

  1. Fix a bug in RadioLibWrappers.cpp where the bitmask of STATE_TX_WAIT was wrong which led to onRXInterrupt() being also triggered on TX ready interrupts (see 72a7335).
  2. Use direct task notifications for suspending and resuming the loop task as explained above (see 56213ca).
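
The bitmask bug from item 1 can be illustrated with a few lines. The numeric values below are assumptions chosen to show the overlap; the real definitions are in RadioLibWrappers.cpp and the fix is in commit 72a7335.

```cpp
#include <cassert>
#include <cstdint>

// ILLUSTRATIVE values only, not copied from the repository.
namespace buggy {
constexpr uint8_t STATE_RX      = 0x01;
constexpr uint8_t STATE_TX_WAIT = 0x03;  // two bits set, overlaps STATE_RX
}
namespace fixed {
constexpr uint8_t STATE_RX      = 0x01;
constexpr uint8_t STATE_TX_WAIT = 0x02;  // single, distinct bit
}

// A typical bit test: "is the RX bit set in the current state?"
constexpr bool hasRxBit(uint8_t state, uint8_t rx_mask) {
  return (state & rx_mask) != 0;
}

// With overlapping masks, a TX-wait state also passes the RX test, so an
// RX handler would run on TX-done interrupts as well.
static_assert(hasRxBit(buggy::STATE_TX_WAIT, buggy::STATE_RX), "overlap misreads TX as RX");
static_assert(!hasRxBit(fixed::STATE_TX_WAIT, fixed::STATE_RX), "distinct bits keep states apart");
```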

I noticed that for some reason the serial debug messages on the USB CDC interface of the RAK4631 (using the TinyUSB library) stop after the first suspend/resume cycle. I couldn't find the reason for this yet. If anyone has an idea or could check if they see the same problem, that would be appreciated.

Here is a link to the CI build: https://github.com/fschrempf/MeshCore/actions/runs/20933125598/artifacts/5103765633

@4np Sorry for proposing yet another solution. I would have been fine with adding the DIO1 == LOW check as a fix but after trying to understand the issue I think this still has the potential of locking up, although the probability is likely lower than without the check.

@4np

4np commented Jan 13, 2026

@fschrempf sure no problem, I think your investigation makes sense. I'll give the new build a go and see how it behaves on my end.

In any case, my repeater has been up for 1 day and 18 hours without locking up with the DIO1 changes.

@4np

4np commented Jan 13, 2026

@fschrempf your CI build does not contain the RAK4631 repeater, it failed to build (as did the other NRF52 variants):

```
.pio/build/RAK_4631_repeater/src/helpers/NRF52Board.cpp.o: In function `NRF52Board::begin()':
NRF52Board.cpp:(.text._ZN10NRF52Board5beginEv+0xa): undefined reference to `xTaskGetHandle'
.pio/build/RAK_4631_repeater/src/helpers/NRF52Board.cpp.o: In function `NRF52BoardDCDC::begin()':
NRF52Board.cpp:(.text._ZN14NRF52BoardDCDC5beginEv+0x14): undefined reference to `xTaskGetHandle'
.pio/build/RAK_4631_repeater/src/helpers/NRF52Board.cpp.o: In function `virtual thunk to NRF52BoardDCDC::begin()':
NRF52Board.cpp:(.text._ZTv0_n80_N14NRF52BoardDCDC5beginEv+0x1c): undefined reference to `xTaskGetHandle'
collect2: error: ld returned 1 exit status
*** [.pio/build/RAK_4631_repeater/firmware.elf] Error 1
========================= [FAILED] Took 54.80 seconds =========================

Environment        Status    Duration
-----------------  --------  ------------
RAK_4631_repeater  FAILED    00:00:54.799
==================== 1 failed, 0 succeeded in 00:00:54.799 ====================
```

This uses a FreeRTOS semaphore to block the main loop task and resume
it on RX radio IRQ or timeout.

Using the Arduino core functions suspendLoop() and resumeLoop() can
lead to deadlocks which can only be avoided by disabling IRQs and
possibly missing incoming packets.

Signed-off-by: Frieder Schrempf <frieder@fris.de>

@fschrempf
Contributor Author

fschrempf commented Jan 13, 2026

> @fschrempf your CI build does not contain the RAK4631 repeater, it failed to build (as did the other NRF52 variants):

Oh dear! I didn't notice because the job status is green. And locally my build worked because I had modified my Arduino core manually some weeks ago when I was doing some experiments. Therefore my local FreeRTOSConfig.h contained #define INCLUDE_xTaskGetHandle 1 whereas the official version doesn't have xTaskGetHandle and so this approach won't work.

Therefore I now reverted to the original solution (used in the very first version of this PR, sometimes the first ideas are the best ;)) using a FreeRTOS binary semaphore. This is equally good and simple and also a recommended solution in the FreeRTOS docs.

Sorry for the additional noise. I updated the explanatory post above accordingly.

Here are the latest binaries from the CI build: https://github.com/fschrempf/MeshCore/actions/runs/20971258478/artifacts/5118537314

@4np

4np commented Jan 14, 2026

I just flashed RAK_4631_repeater-opt-1ae1d7b, fingers crossed! 🤞🏻 ;)

@fschrempf
Contributor Author

@4np OK, but don't spend too much effort on testing. Currently it looks like we will go for @IoTThinks' alternative solution in #1353.

@4np

4np commented Jan 14, 2026

Maybe I should test #1353 instead then 😄

@SaschaKt

> Maybe I should test #1353 instead then 😄

From 8mA to 4mA, very good. I've been using it since the first version, primarily with v3/v4 (48mA down to 9-10mA), as a solar repeater.
