@mutatrum
Collaborator

@mutatrum mutatrum commented Sep 2, 2025

Fixes #1210

This changes quite a lot. Two tasks are extracted from power management: a fan controller task and an ASIC management task for frequency and voltage changes. As a result, frequency ramps, including the initial one at startup, no longer block the power management task and its overheat checks.

The startup sequence is materially changed by this PR. The initial frequency ramp is done after the asic has been completely initialised and the fan controller is already running. This means that the full fan blast at startup is almost completely eliminated.

This PR also increases the fan controller update rate to 10 times per second, tunes the PID to be a lot less aggressive, and removes the startup sequence. These changes might be better done as a separate PR, depending on testing, but on my device the controller is oscillating a lot less.
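
For illustration only, a minimal sketch of what one fan controller tick could look like at 10 updates per second. The struct, the function name `fan_pid_step`, and any gains passed in are hypothetical and are not the values used in this PR:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical PID state; gains and limits are illustrative only. */
typedef struct {
    float kp, ki, kd;        /* controller gains */
    float integral;          /* accumulated error * dt */
    float prev_error;        /* error seen on the previous tick */
    float out_min, out_max;  /* fan duty limits in percent */
} fan_pid_t;

/* One controller tick. dt_s is the tick period, e.g. 0.1 s for 10 updates/sec.
 * Returns the fan duty in percent, clamped to [out_min, out_max]. */
float fan_pid_step(fan_pid_t *pid, float target_c, float measured_c, float dt_s)
{
    assert(pid != NULL && dt_s > 0.0f);

    float error = measured_c - target_c;               /* positive when too hot */
    float integral = pid->integral + error * dt_s;
    float derivative = (error - pid->prev_error) / dt_s;
    float out = pid->kp * error + pid->ki * integral + pid->kd * derivative;

    pid->prev_error = error;

    /* Simple anti-windup: only keep the new integral when not saturated. */
    if (out > pid->out_max) return pid->out_max;
    if (out < pid->out_min) return pid->out_min;
    pid->integral = integral;
    return out;
}
```

Because the derivative and integral terms are scaled by `dt_s`, changing the tick rate does not by itself change the effective gains, which makes it easier to tune the rate and the gains separately.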

🚨 This works on my Gamma, but needs a lot more testing, especially from a cold start. Proceed at your own risk! 🚨

Future improvements for this PR are to simplify the ASIC management task by adjusting a single frequency step on each tick, instead of delegating it to do_frequency_ramp.
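
A rough sketch of the "single step per tick" idea, assuming frequencies are handled as floats in MHz. The function name and the step size used in the usage note are hypothetical:

```c
#include <assert.h>

/* Move current_mhz one step toward target_mhz without overshooting.
 * Intended to be called once per ASIC management tick. */
float next_frequency_step(float current_mhz, float target_mhz, float step_mhz)
{
    assert(step_mhz > 0.0f);

    float delta = target_mhz - current_mhz;
    if (delta > step_mhz) {
        return current_mhz + step_mhz;   /* still ramping up */
    }
    if (delta < -step_mhz) {
        return current_mhz - step_mhz;   /* still ramping down */
    }
    return target_mhz;                   /* within one step: land on the target */
}
```

With a step of 6.25 MHz, a call with `current_mhz = 400` and `target_mhz = 500` yields 406.25, and the task keeps calling it on each tick until the target is reached. No loop lives inside the task body, so other checks can run between steps.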

It also needs more code cleanup. I'm not satisfied with the header file structure, especially the PowerManagementModule struct and how it is included throughout the application via global_state.h.

One open problem currently (also without this PR) is that changing the frequency through the BAP port will block the BAP handler as it's doing a blocking call to do_frequency_transition. The BAP handler should just set a new desired frequency, and the asic management calls the set_frequency function.
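
One way the handoff could look, as a sketch only. All names here are hypothetical, and the frequency is stored as an integer in kHz so it fits a C11 atomic:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shared state: last requested frequency in kHz (0 = no request). */
static atomic_uint_least32_t desired_khz;

/* Called from the BAP handler: records the request and returns immediately. */
void asic_request_frequency_khz(uint32_t khz)
{
    atomic_store(&desired_khz, khz);
}

/* Called from the ASIC management task on each tick. Returns true and writes
 * the new target when the request differs from what is currently applied. */
bool asic_poll_frequency_request(uint32_t applied_khz, uint32_t *target_khz)
{
    assert(target_khz != NULL);

    uint32_t wanted = atomic_load(&desired_khz);
    if (wanted == 0 || wanted == applied_khz) {
        return false;   /* nothing to do */
    }
    *target_khz = wanted;
    return true;
}
```

The handler never touches the ASIC, so it cannot block on a ramp. If several requests arrive between ticks, only the latest one is acted on, which seems acceptable for a frequency setpoint.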

@github-actions

github-actions bot commented Sep 2, 2025

Test Results

20 tests  ±0   20 ✅ ±0   0s ⏱️ ±0s
 1 suites ±0    0 💤 ±0 
 1 files   ±0    0 ❌ ±0 

Results for commit dc0d798. ± Comparison against base commit fa0409e.

♻️ This comment has been updated with latest results.

@mutatrum mutatrum changed the title from "Initial proof of concept." to "Split power management task into multiple tasks" Sep 3, 2025
xTaskCreate(create_jobs_task, "stratum miner", 3072, (void *) &GLOBAL_STATE, 10, NULL);
xTaskCreate(ASIC_task, "asic", 2048, (void *) &GLOBAL_STATE, 10, NULL);
xTaskCreate(ASIC_result_task, "asic result", 4096, (void *) &GLOBAL_STATE, 15, NULL);
xTaskCreate(statistics_task, "statistics", 2048, (void *) &GLOBAL_STATE, 3, NULL);
Collaborator Author

The stack size of a lot of tasks had to be reduced because we're running out of heap with that many tasks. Not sure how to make the heap bigger. It also needs some general error handling, since xTaskCreate fails silently if we run into this again.

Contributor

That silent failure can be caught by passing a task handle for every task and then checking !handle after creating the task.

Collaborator Author

Good that you mentioned it, I wanted to add this. I think it's a good thing to add anyway, as the results are very puzzling, depending on which tasks were actually started and which weren't.
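
A sketch of such a check, assuming a hypothetical wrapper named `start_task_checked`. It tests both the return value of xTaskCreate and the handle. The `#else` branch contains host-only stand-ins so the snippet compiles off-target; they are not the real FreeRTOS definitions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#ifdef ESP_PLATFORM
#include "freertos/FreeRTOS.h"
#include "freertos/task.h"
#else
/* Host-only stand-ins, NOT the real FreeRTOS API.
 * The fake xTaskCreate fails for large stacks to simulate a full heap. */
typedef int BaseType_t;
typedef unsigned int UBaseType_t;
typedef void *TaskHandle_t;
typedef void (*TaskFunction_t)(void *);
#define pdPASS 1
static int fake_tcb;
static BaseType_t xTaskCreate(TaskFunction_t fn, const char *name, uint32_t stack_depth,
                              void *arg, UBaseType_t priority, TaskHandle_t *handle)
{
    (void)fn; (void)name; (void)arg; (void)priority;
    if (stack_depth > 8192) {
        return 0;               /* simulate running out of internal memory */
    }
    *handle = &fake_tcb;
    return pdPASS;
}
static void demo_task(void *arg) { (void)arg; }  /* never scheduled on the host */
#endif

/* Hypothetical helper: create a task and report failure instead of
 * carrying on with a partially started system. */
bool start_task_checked(TaskFunction_t fn, const char *name, uint32_t stack_depth,
                        void *arg, UBaseType_t priority)
{
    assert(fn != NULL && name != NULL);

    TaskHandle_t handle = NULL;
    BaseType_t rc = xTaskCreate(fn, name, stack_depth, arg, priority, &handle);
    if (rc != pdPASS || handle == NULL) {
        fprintf(stderr, "failed to create task '%s' (stack %u)\n",
                name, (unsigned)stack_depth);
        return false;
    }
    return true;
}
```

On the device the message would more likely go through the project's logging macros than `fprintf`, and the caller still has to decide what a failed start means, for example refusing to enable the ASIC if a safety-relevant task is missing.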

I'm also looking at increasing the FreeRTOS heap space, as this is quite constrained with the number of tasks we have.

Collaborator Author

This is interesting reading material: espressif/esp-idf#11216 (comment)

TLDR: xTaskCreate always uses internal memory. This is by design. When a task does an NVS or SPI flash write, it disables the cache, which makes SPI RAM inaccessible until the write completes; anything that touches a stack there in the meantime crashes and reboots the system. So, for those tasks, the stack can't be in SPI RAM. All tasks that write to NVS or flash need to be internal.

Contributor

Well, that NVS stuff gets used way too much. For loading settings it would be enough to read them once during setup in main.c and fill a struct with them. Saving happens only in http_server.c, except for the highest diff, but that could be delegated to a queue, with the saving done in its own single task.

@mutatrum mutatrum added the cleanup Code cleanup label Sep 3, 2025
@mutatrum mutatrum changed the title from "Split power management task into multiple tasks" to "Separate tasks for frequency ramps and fan controller" Sep 4, 2025
@WantClue WantClue self-assigned this Sep 13, 2025
@WantClue WantClue added this to the 2.11.0 milestone Sep 13, 2025
@WantClue
Collaborator

There are some conflicts now

@mutatrum
Collaborator Author

> There are some conflicts now

Conflicts resolved, but I'm still testing. Sometimes it goes straight to overheat mode when booting cold. Not sure what's happening there.

And I also want to simplify the frequency controller, by adjusting a single step on each iteration instead of doing a loop within an eternal task.

@skot
Collaborator

skot commented Sep 13, 2025

> Sometimes it goes straight to overheat mode when booting cold. Not sure what's happening there.

FWIW the ASIC thermal diode will not read correctly until ASIC core voltage is on.
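
A sketch of how an overheat check could account for this, assuming the power management code knows whether core voltage is enabled. The function name and parameters are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: only trust the ASIC thermal diode once core voltage
 * is on. Before that, the reading is ignored rather than treated as hot. */
bool overheat_detected(bool core_voltage_on, float chip_temp_c, float limit_c)
{
    assert(limit_c > 0.0f);

    if (!core_voltage_on) {
        return false;   /* diode reading is not valid yet */
    }
    return chip_temp_c > limit_c;
}
```

Whether ignoring the reading is safe enough is a design decision; other sensors, such as a voltage regulator temperature if one is available, could still be checked while core voltage is off.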

@mutatrum mutatrum marked this pull request as draft September 14, 2025 08:46
@mutatrum
Collaborator Author

> FWIW the ASIC thermal diode will not read correctly until ASIC core voltage is on.

Responsibilities are indeed not properly separated between the tasks, so there are some things happening out of order. I've set the PR to draft for now. Same with overheat mode, which is being overridden because of bad separation in the current code.

@mutatrum mutatrum marked this pull request as ready for review September 21, 2025 21:34
@KillerInk
Contributor

about heap

In general, external RAM will not be used as task stack memory. xTaskCreate() and similar functions will always allocate internal memory for stack and task TCBs.

The option CONFIG_FREERTOS_TASK_CREATE_ALLOW_EXT_MEM can be used to allow placing task stacks into external memory. In these cases xTaskCreateStatic() must be used to specify a task stack buffer allocated from external memory, otherwise task stacks will still be allocated from internal memory.

https://docs.espressif.com/projects/esp-idf/en/stable/esp32/api-guides/external-ram.html#external-ram-config-memory-map

@mutatrum
Collaborator Author

mutatrum commented Sep 22, 2025

There's also xTaskCreateWithCaps, but as stated above, you can't put the stack in PSRAM if that task writes to NVS, which, unfortunately, a lot of tasks do.

Currently, NVS is also used as a message bus between tasks. It might be better to use GlobalState for that, and have a single NVS writer task. That way all the other tasks can leverage PSRAM. The only other exception would be the OTA flash task, as that one also has to be in internal memory.
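
To make the single-writer idea concrete, here is a sketch of the producer and consumer sides. The ring buffer is a plain, single-threaded stand-in for a FreeRTOS queue, and the request type and function names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical save request: which setting to persist, and its value. */
typedef struct {
    const char *key;
    uint32_t value;
} nvs_save_request_t;

#define SAVE_QUEUE_LEN 8

/* Tiny ring buffer standing in for a FreeRTOS queue. It is NOT thread-safe;
 * on the device a real queue would provide the synchronisation. */
typedef struct {
    nvs_save_request_t items[SAVE_QUEUE_LEN];
    size_t head;
    size_t count;
} save_queue_t;

/* Producer side: any task enqueues a request instead of writing NVS itself. */
bool save_queue_push(save_queue_t *q, nvs_save_request_t req)
{
    assert(q != NULL);
    if (q->count == SAVE_QUEUE_LEN) {
        return false;   /* full: the caller decides whether to retry or drop */
    }
    q->items[(q->head + q->count) % SAVE_QUEUE_LEN] = req;
    q->count++;
    return true;
}

/* Consumer side: only the writer task pops requests and does the flash write. */
bool save_queue_pop(save_queue_t *q, nvs_save_request_t *out)
{
    assert(q != NULL && out != NULL);
    if (q->count == 0) {
        return false;
    }
    *out = q->items[q->head];
    q->head = (q->head + 1) % SAVE_QUEUE_LEN;
    q->count--;
    return true;
}
```

With this split, only the writer task (and the OTA task) would need an internal-memory stack, and the other tasks could read their settings from GlobalState.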

Collaborator

@WantClue WantClue left a comment

concept ack

@mutatrum mutatrum marked this pull request as draft October 2, 2025 22:03
@mutatrum
Collaborator Author

mutatrum commented Oct 2, 2025

As this adds more tasks and we're running out of internal memory, it's better to wait for #1245 and #1253. Furthermore, this needs another pass as the division between the tasks is not good yet.

@WantClue WantClue removed this from the 2.11.0 milestone Oct 6, 2025
@mutatrum mutatrum mentioned this pull request Nov 16, 2025
Development

Successfully merging this pull request may close these issues.

Frequency ramps block power_management_task

4 participants