Skip to content

RBMC: Only allow failovers after full sync #99

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jun 24, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -24,8 +24,8 @@ state to/by the end user.

## State Tracking and Control

phosphor-state-manager makes extensive use of systemd. There is a writeup
[here][1] with an overview of systemd and its use by OpenBMC.
phosphor-state-manager makes extensive use of systemd. There is a [writeup][1]
with an overview of systemd and its use by OpenBMC.

phosphor-state-manager monitors for systemd targets to complete as a trigger to
updating the its corresponding D-Bus property. When using PSM, a user must
Expand Down
4 changes: 3 additions & 1 deletion redundant-bmc/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -174,7 +174,9 @@ Failovers aren't allowed when:

1. Redundancy is disabled.
2. The system is at some state other than off or runtime.
3. More coming.
3. `RedundancyEnabled` has changed to true but a full sync hasn't been
completed.
4. More coming.

When failovers aren't allowed, rbmctool can be used to display the reasons why.

Expand Down
8 changes: 8 additions & 0 deletions redundant-bmc/src/redundancy.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -141,6 +141,11 @@ FailoversNotAllowedReasons getFailoversNotAllowedReasons(const Input& input)
return reasons;
}

if (!input.fullSyncComplete)
{
reasons.insert(fullSyncNotComplete);
}

if ((input.systemState != SystemState::off) &&
(input.systemState != SystemState::runtime))
{
Expand All @@ -161,6 +166,9 @@ std::string getFailoversNotAllowedDescription(FailoversNotAllowedReason reason)
case systemState:
desc = "System state is not off or runtime"s;
break;
case fullSyncNotComplete:
desc = "A full sync hasn't been completed"s;
break;
case redundancyDisabled:
desc = "Redundancy is disabled"s;
break;
Expand Down
2 changes: 2 additions & 0 deletions redundant-bmc/src/redundancy.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -84,6 +84,7 @@ struct Input
{
// TODO: Add more
bool redundancyEnabled;
bool fullSyncComplete;
SystemState systemState;
};

Expand All @@ -93,6 +94,7 @@ struct Input
enum class FailoversNotAllowedReason
{
redundancyDisabled,
fullSyncNotComplete,
systemState
};

Expand Down
8 changes: 8 additions & 0 deletions redundant-bmc/src/redundancy_mgr.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -51,6 +51,8 @@ void RedundancyMgr::determineAndSetRedundancy()
{
// Make sure syncs are disabled if redundancy is disabled.
ctx.spawn(providers.getSyncInterface().disableBackgroundSync());

providers.getSyncInterface().clearFullSyncComplete();
}
}

Expand Down Expand Up @@ -82,6 +84,11 @@ sdbusplus::async::task<> RedundancyMgr::determineRedundancyAndSync()
determineAndSetRedundancy();
syncFailed = false;
}
else
{
// Full sync is done so recalculate FailoversAllowed
determineAndSetFailoversAllowed();
}
}
}

Expand Down Expand Up @@ -283,6 +290,7 @@ void RedundancyMgr::determineAndSetFailoversAllowed()
{
fona::Input input{
.redundancyEnabled = redundancyInterface.redundancy_enabled(),
.fullSyncComplete = providers.getSyncInterface().isFullSyncComplete(),
.systemState = systemState.value_or(SystemState::other)};

auto notAllowedReasons = fona::getFailoversNotAllowedReasons(input);
Expand Down
23 changes: 23 additions & 0 deletions redundant-bmc/src/sync_interface.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -45,6 +45,23 @@ class SyncInterface
return fullSyncInProgress;
}

/**
* @brief Says if a full sync has been completed since
* redundancy was most recently enabled.
*/
inline bool isFullSyncComplete() const
{
return fullSyncComplete;
}

/**
* @brief Clears the full sync complete indication
*/
inline void clearFullSyncComplete()
{
fullSyncComplete = false;
}

using SyncHealthCallback =
std::function<void(SyncBMCData::SyncEventsHealth)>;

Expand Down Expand Up @@ -79,6 +96,12 @@ class SyncInterface
*/
bool fullSyncInProgress = false;

/**
* @brief If a full sync has successfully been completed
* this go around of redundancy being enabled.
*/
bool fullSyncComplete = false;

/**
* @brief The health status callback functions
*/
Expand Down
5 changes: 4 additions & 1 deletion redundant-bmc/src/sync_interface_impl.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,8 @@ sdbusplus::async::task<bool> SyncInterfaceImpl::doFullSync()

try
{
fullSyncComplete = false;

co_await lookupService();

auto sync = SyncBMCData(ctx)
Expand Down Expand Up @@ -74,7 +76,8 @@ sdbusplus::async::task<bool> SyncInterfaceImpl::doFullSync()
lg2::info("Full sync completed with status {STATUS}", "STATUS", status);

fullSyncInProgress = false;
co_return status == SyncStatus::FullSyncCompleted;
fullSyncComplete = status == SyncStatus::FullSyncCompleted;
co_return fullSyncComplete;
}

// NOLINTNEXTLINE
Expand Down
16 changes: 15 additions & 1 deletion redundant-bmc/test/redundancy_test.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -163,7 +163,9 @@ TEST(RedundancyTest, FailoversNotAllowedTest)

for (const auto& [state, expectedReasons] : testStates)
{
fona::Input input{.redundancyEnabled = true, .systemState = state};
fona::Input input{.redundancyEnabled = true,
.fullSyncComplete = true,
.systemState = state};

auto reasons = fona::getFailoversNotAllowedReasons(input);
EXPECT_EQ(reasons, expectedReasons);
Expand All @@ -172,12 +174,24 @@ TEST(RedundancyTest, FailoversNotAllowedTest)
// Redundancy disabled
{
fona::Input input{.redundancyEnabled = false,
.fullSyncComplete = true,
.systemState = rbmc::SystemState::off};
auto reasons = fona::getFailoversNotAllowedReasons(input);
ASSERT_EQ(reasons.size(), 1);
EXPECT_EQ(*reasons.begin(),
fona::FailoversNotAllowedReason::redundancyDisabled);
}

// Full sync not complete
{
fona::Input input{.redundancyEnabled = true,
.fullSyncComplete = false,
.systemState = rbmc::SystemState::off};
auto reasons = fona::getFailoversNotAllowedReasons(input);
ASSERT_EQ(reasons.size(), 1);
EXPECT_EQ(*reasons.begin(),
fona::FailoversNotAllowedReason::fullSyncNotComplete);
}
}

TEST(RedundancyTest, GetFailoversNotAllowedDescTest)
Expand Down