-
Notifications
You must be signed in to change notification settings - Fork 9
Command Loss Time Does not propogate across boots #297
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Mikefly123
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these changes look good to me, but calling this new boolean first_boot is confusing. In my mind it implies that this is somehow a persistent variable for the first time the satellite ever boots (after being deployed), but really it just means the first time this component is called after a boot up.
Consider changing the variable name to something like in_startup or cold_start.
I would also like to note that although this behavior helps a lot with making it so we don't instantly trigger command loss after a being turned off for a few days, we still probably want to delay the initial setup of command loss file be a minute or two after startup to protect a little "handling time" when we are doing integration with the dispenser.
| // (e.g., to reset radio parameters and enforce any transmit delay policy) | ||
| this->log_WARNING_HI_UnintendedRebootDetected(); | ||
| this->enterSafeMode(Components::SafeModeReason::SYSTEM_FAULT); | ||
| this->runSafeModeSequence(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I this we have a potential issue here, where calling the SafeModeSequence on an unintended reboot will cause the safe mode sequencer to conflict with the primary command sequencer (which may be running the startup sequence with almost exactly the same timing as the radio_enter_safe.seq
Mikefly123
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Good to see the simplification of the logic here
Command Loss Write only on first boot
Command Loss Timer Simplification and Safe Mode Boot Improvements
Description
This PR simplifies the command loss timer implementation and ensures that boot loops and fatal errors reliably boot into safe mode with proper radio configuration and load switch management.
Key Changes
Removed file persistence for command loss timer: The command loss timer no longer saves the last command time to a file. The timer is now in-memory only, resetting on each boot. This simplifies the code and avoids filesystem issues during boot loops.
Safe mode sequence runs on boot: When the system boots into safe mode (either from persistent state or unintended reboot), the safe mode sequence (
enter_safe.seq) now runs automatically to:Watchdog stop behavior split:
FatalHandlerorAuthenticationRoutercommand loss): Do NOT callprepareForReboot, ensuring the next boot is classified as unintended and boots into safe modeSTOP_WATCHDOG: Still callsprepareForReboot, marking the reboot as intentionalUnintended reboot path enhancement: When an unintended reboot is detected (boot loop, fatal error, watchdog timeout), the system now:
SYSTEM_FAULTreasonRationale
Previously, if the system was stuck in an assert/boot loop, the command loss timer file persistence could cause issues. The new approach relies on:
This ensures that boot loops will stop the watchdog, reboot, and automatically enter safe mode with known-good radio parameters and load switches off, preventing transmission before the 45-minute delay period.
Related Issues/Tickets
Helps with #304 and #287
How Has This Been Tested?
Screenshots / Recordings (if applicable)
Helps with #304
Further Notes / Considerations
Load switch redundancy: ModeManager still calls
turnOffNonCriticalComponents()when entering safe mode (via ports), and the sequence also sendsTURN_OFFcommands. Both mechanisms run, ensuring redundancy but potentially causing duplicate commands. Consider removing port-based turn-off if sequence-based approach is preferred.Command loss timer reset: With file persistence removed, the command loss timer resets on every boot. This means "command loss with boots in between" scenarios will restart the timer from zero on each boot, rather than accumulating across boots. This is intentional simplification.
45-minute delay: The safe mode sequence enforces a 45-minute delay before enabling LoRa transmit. This ensures that if the system boots into safe mode at startup, it will not transmit before the required quiescence period.
Checklist
Further Notes / Considerations