Skip to content

Conversation

@FxKu
Copy link
Member

@FxKu FxKu commented Nov 23, 2021

Change 1: When attempting a switchover, try it only on a replica that has no replication lag. In case synchronous replication is enabled there will be a replica with sync_standby role which should be chosen instead for switchover.

Change 2: In case the switchover is not executed or failing, return an error instead of emitting a Warning. Before we were always continuing by replacing the master pod anyway - producing downtime. Now, the master keeps running on an outdated pod. But, the Pod will have the rolling-update flag and the operator will notice it on the next sync and try another failover. It can be, that the failover just took longer, e.g. because taking a CHECKPOINT took longer than Patroni API waits to respond to the user. In this ideal case only a replica would be recreated. In worst case, a switchover might be retried on each sync. But this should be fixed by the admin anyway.

fixes #1686
fixes #109
fixes #600

@FxKu FxKu added this to the 1.8 milestone Nov 23, 2021
@FxKu FxKu added the zalando label Nov 25, 2021
@Jan-M
Copy link
Member

Jan-M commented Dec 2, 2021

Sounds good, lets make it the one with least lag, not with 0 lag.

return spec.NamespacedName{Namespace: master.Namespace, Name: candidates[0].Name}, nil
}

return spec.NamespacedName{}, fmt.Errorf("no switchover candidate found")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why does it return the empty spec.NamespacedName{} and not nil in case of error ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Go says: "cannot use nil (untyped nil value) as spec.NamespacedName value in return statement" 😃

Replica PostgresRole = "replica"
Leader PostgresRole = "leader"
StandbyLeader PostgresRole = "standby_leader"
SyncStandby PostgresRole = "sync_standby"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel there is a comment missing here:

  1. how does operator treat differently Master, StandbyLeader on the one hand and Leader on the other ? Both Master and StandbyLeader are leaders.
  2. SyncStanbdy is still a Replica

While it is still possible to figure it out from the code, a short comment about why these are not completely independent states is worthy.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

master and replica are Spilo roles. The other ones role names returned by Patroni's cluster endpoint. Added comments

Role string `json:"role"`
State string `json:"state"`
Timeline int `json:"timeline"`
LagInMB int `json:"lag"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does Patroni always return lag in MB and not simply in bytes ?
for example, the patronictl converts bytes to MB

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed to just Lag

Copy link
Member Author

@FxKu FxKu Dec 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm maybe lag is reserved for something else:

cannot unmarshal string into Go struct field ClusterMember.members.lag of type int

renamed it back, since I think it's always in MB.

@FxKu
Copy link
Member Author

FxKu commented Dec 13, 2021

👍

1 similar comment
@Jan-M
Copy link
Member

Jan-M commented Dec 14, 2021

👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Switchover fails randomly in synchronous replication Improve failover during rolling updates Manual failover: improve health check of replica

4 participants