
Override _id of replicaSet in automation config (complementary to additionalMongodConfig replication.replSetName) #1650

Closed
@1st8

Description

What did you do to encounter the bug?

Steps to reproduce the behavior:

  1. Have a legacy (deployed without the operator) MongoDB cluster with --replSet=rs0 in a StatefulSet named mongodb
  2. Migrate the cluster nodes to the operator using:
metadata:
  name: mongodb
spec:
  # ...
  additionalMongodConfig:
    replication.replSetName: rs0

What did you expect?

I expected the agent to receive rs0 as the _id of the replicaSet when replication.replSetName is set via additionalMongodConfig.

Alternatively, there could be an override for the replica set name directly under spec, so it is used both for replicaSets._id (agent) and replication.replSetName (mongod); a rough sketch of what I mean follows.
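
For illustration, something like this hypothetical field on the spec type is what I have in mind; the field name and placement are assumptions on my part, not part of the current CRD:

package v1

// Hypothetical sketch only: this field does not exist in the current
// MongoDBCommunity CRD; its name and placement are assumptions.
type MongoDBCommunitySpec struct {
	Members int    `json:"members"`
	Type    string `json:"type"`
	Version string `json:"version"`

	// ReplicaSetName, if set, would be used both as replicaSets[]._id in the
	// automation config (agent) and as replication.replSetName (mongod),
	// defaulting to metadata.name when empty.
	ReplicaSetName string `json:"replicaSetName,omitempty"`
}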

What happened instead?

It receives mongodb as the replicaSet _id instead.

Screenshots

Screenshot 2025-01-06 at 16 17 10

Screenshot 2025-01-06 at 15 39 50

Additional context

We are looking to migrate 100+ clusters to the operator, and this is the last piece of the puzzle.

I confirmed this by stopping the operator, manually changing the _id to rs0 in the mongodb-config secret, and then seeing the agent become ready.

After starting the operator again, it of course undoes the changes to the secret.

I had a thorough look through the relevant sources and didn't find a way to fix this with the current implementation.

The automation config builder's name is set to mdb.Name here:

and then Id is set to b.name here:

b.name is also used to set replication.replSetName here:

so this might suggest that the override solution would be preferable.
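
To make that flow concrete, here is a rough, self-contained sketch of how I understand the builder behaves; the type and field names are simplified assumptions for illustration, not the actual operator code:

package main

import "fmt"

// Simplified stand-ins for the automation config types (assumed shapes).
type ReplicaSet struct{ Id string }
type Process struct{ ReplSetName string }
type AutomationConfig struct {
	ReplicaSets []ReplicaSet
	Processes   []Process
}

type builder struct {
	name                   string // set from mdb.Name by the operator
	additionalMongodConfig map[string]interface{}
}

func (b *builder) build() AutomationConfig {
	rs := ReplicaSet{Id: b.name}         // always mdb.Name today ("mongodb")
	proc := Process{ReplSetName: b.name} // also starts as mdb.Name

	// additionalMongodConfig is merged into the mongod args, so
	// replication.replSetName can become "rs0" while rs.Id stays "mongodb".
	if v, ok := b.additionalMongodConfig["replication.replSetName"].(string); ok {
		proc.ReplSetName = v
		// Possible fix: apply the same override to the replica set _id,
		// e.g. rs.Id = v, or introduce an explicit spec-level override.
	}

	return AutomationConfig{ReplicaSets: []ReplicaSet{rs}, Processes: []Process{proc}}
}

func main() {
	b := builder{
		name:                   "mongodb",
		additionalMongodConfig: map[string]interface{}{"replication.replSetName": "rs0"},
	}
	ac := b.build()
	fmt.Println(ac.ReplicaSets[0].Id, ac.Processes[0].ReplSetName) // "mongodb rs0" -- the mismatch the agent rejects
}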

Did I miss something?

If I could contribute a fix for this, what should I look out for in my implementation to make it acceptable?


  • yaml definitions of your MongoDB Deployment(s):
apiVersion: mongodbcommunity.mongodb.com/v1
kind: MongoDBCommunity
metadata:
  name: mongodb
spec:
  members: 3
  type: ReplicaSet
  version: "5.0.28"
  # ...
  additionalMongodConfig:
    replication.replSetName: rs0
  • The agent clusterconfig of the faulty members:
{"version":2,"processes":[{"name":"mongodb-0","disabled":false,"hostname":"mongodb-0.service-mongodb.mongodb.svc.test.kubernetes.local.mixxt.net","args2_6":{"net":{"port":27017},"replication":{"replSetName":"rs0"},"security":{"transitionToAuth":true},"storage":{"dbPath":"/data","wiredTiger":{"engineConfig":{"cacheSizeGB":"0.5","journalCompressor":"zlib"}}}},"featureCompatibilityVersion":"5.0","processType":"mongod","version":"5.0.28","authSchemaVersion":5},{"name":"mongodb-1","disabled":false,"hostname":"mongodb-1.service-mongodb.mongodb.svc.test.kubernetes.local.mixxt.net","args2_6":{"net":{"port":27017},"replication":{"replSetName":"rs0"},"security":{"transitionToAuth":true},"storage":{"dbPath":"/data","wiredTiger":{"engineConfig":{"cacheSizeGB":"0.5","journalCompressor":"zlib"}}}},"featureCompatibilityVersion":"5.0","processType":"mongod","version":"5.0.28","authSchemaVersion":5},{"name":"mongodb-2","disabled":false,"hostname":"mongodb-2.service-mongodb.mongodb.svc.test.kubernetes.local.mixxt.net","args2_6":{"net":{"port":27017},"replication":{"replSetName":"rs0"},"security":{"transitionToAuth":true},"storage":{"dbPath":"/data","wiredTiger":{"engineConfig":{"cacheSizeGB":"0.5","journalCompressor":"zlib"}}}},"featureCompatibilityVersion":"5.0","processType":"mongod","version":"5.0.28","authSchemaVersion":5}],"replicaSets":[{"_id":"mongodb","members":[{"_id":0,"host":"mongodb-0","arbiterOnly":false,"votes":1,"priority":1},{"_id":1,"host":"mongodb-1","arbiterOnly":false,"votes":1,"priority":1},{"_id":2,"host":"mongodb-2","arbiterOnly":false,"votes":1,"priority":1}],"protocolVersion":"1","numberArbiters":0}],"auth":{"usersWanted":[{"mechanisms":[],"roles":[{"role":"clusterAdmin","db":"admin"},{"role":"userAdminAnyDatabase","db":"admin"}],"user":"tixxt","db":"admin","authenticationRestrictions":[],"scramSha256Creds":{"iterationCount":15000,"salt":"ntalm9D0Jj4Dt0zYRfBlihEwyc2U/FUfkBsMLQ==","serverKey":"hHycYJGOBRUWuJ/B92ZgY+6zDeyaLA7sVa/ILPADmOw=","storedKey":"pcWfo77fY5yIzGSB1YNsBKvqEHlWaaMCPE1OFc0iZic="},"scramSha1Creds":{"iterationCount":10000,"salt":"a0WKosyWCXTivtM1ywCw+Q==","serverKey":"W04zj/xVioHvjRNjaqH8Di02I8k=","storedKey":"0IjhW1a5s2flWcDoq/TUsWyO6Z4="}}],"disabled":false,"authoritativeSet":false,"autoAuthMechanisms":["SCRAM-SHA-256"],"autoAuthMechanism":"SCRAM-SHA-256","deploymentAuthMechanisms":["SCRAM-SHA-256"],"autoUser":"mms-automation","key":"qENSGM4WjMUAtkLPGBgSos3NqgTeRde9HMir6DZ+mil8M259JlcKJcEP33pIDOQXHrGommQrj7CzAnaRmFl6FvfXQgW+2dqo6yAtt3lIUFBr9fUP6vfqNLvBPD2QQL4s+LG3vwwud0G8Pnvr6ZksJUxHqdljXd8SYnootmEs6UtHyC8G1/8m0EHwNu+ez2Wg7+3naenpSxIxGaLR0ZWnTiCuejiFU3M21m3jJ91pBXSWmi6vzKEKQlAkyhy9Ur4z2X29U0wkpVcAwSHbhvNksWpaBo13ZuHRQSyaobKhyX3MlEL8pwyQAqSlrjEAP7oqCZvF5l3Cxkjp3ekmEG/w15q5gjZ/fhKad+2blJLw72F/dwMWQrI9gBPoXteaQ9qFh06E9a+qZrBFEULSuLKqXp2C8/p4fzsBo0Dp5Eg2Xbg0/I0wYKgu4uYVziKrurqJ72Ko7osK4FfPHqBDk9b5nq2DZ8IihIBgC8NNCNEDypbZi24nkvqgBUEmTb5I4ZkZo5W3ZK7kSZL1L0QnVY7xxL5suCPFyEsCzV2bZ+NNUDRMIJ/H9edblZCu4MLd0Mr/RHIG4hfCkFtAxO3bOEMpZR1gsTheS9LMMgK+mTZ5vDXKY8iJhOKp9CGdZt2dFqcNVbiW+3LUJa00ar2S1YllozOBzzU=","keyfile":"/var/lib/mongodb-mms-automation/authentication/keyfile","keyfileWindows":"%SystemDrive%\\MMSAutomation\\versions\\keyfile","autoPwd":"sG3uBaAQisy09xjuz8dG"},"tls":{"CAFilePath":"","clientCertificateMode":"OPTIONAL"},"mongoDbVersions":[{"name":"5.0.28","builds":[{"platform":"linux","url":"","gitVersion":"","architecture":"amd64","flavor":"rhel","minOsVersion":"","maxOsVersion":"","modules":[]},{"platform":"linux","url":"","gitVersion":"","architecture":"amd64","flavor":"
ubuntu","minOsVersion":"","maxOsVersion":"","modules":[]},{"platform":"linux","url":"","gitVersion":"","architecture":"aarch64","flavor":"ubuntu","minOsVersion":"","maxOsVersion":"","modules":[]},{"platform":"linux","url":"","gitVersion":"","architecture":"aarch64","flavor":"rhel","minOsVersion":"","maxOsVersion":"","modules":[]}]}],"backupVersions":[],"monitoringVersions":[],"options":{"downloadBase":"/var/lib/mongodb-mms-automation"}}
  • The agent health status of the faulty members:
{"statuses":{"mongodb-0":{"IsInGoalState":false,"LastMongoUpTime":1736175160,"ExpectedToBeUp":true,"ReplicationStatus":1}},"mmsStatus":{"mongodb-0":{"name":"mongodb-0","lastGoalVersionAchieved":1,"plans":[{"automationConfigVersion":1,"started":"2025-01-06T13:24:00.086578872Z","completed":null,"moves":[{"move":"Start","moveDoc":"Start the process","steps":[{"step":"StartFresh","stepDoc":"Start a mongo instance  (start fresh)","isWaitStep":false,"started":"2025-01-06T13:24:00.08660494Z","completed":"2025-01-06T13:24:09.865099778Z","result":"success"}]},{"move":"WaitAllRsMembersUp","moveDoc":"Wait until all members of this process' repl set are up","steps":[{"step":"WaitAllRsMembersUp","stepDoc":"Wait until all members of this process' repl set are up","isWaitStep":true,"started":"2025-01-06T13:24:09.865192101Z","completed":null,"result":"wait"}]},{"move":"RsInit","moveDoc":"Initialize a replica set including the current MongoDB process","steps":[{"step":"RsInit","stepDoc":"Initialize a replica set","isWaitStep":false,"started":null,"completed":null,"result":""}]},{"move":"WaitFeatureCompatibilityVersionCorrect","moveDoc":"Wait for featureCompatibilityVersion to be right","steps":[{"step":"WaitFeatureCompatibilityVersionCorrect","stepDoc":"Wait for featureCompatibilityVersion to be right","isWaitStep":true,"started":null,"completed":null,"result":""}]}]},{"automationConfigVersion":1,"started":"2025-01-06T13:35:35.257483563Z","completed":"2025-01-06T13:35:39.182499358Z","moves":[{"move":"EnsureAutomationCredentials","moveDoc":"Ensure the automation user exists","steps":[{"step":"EnsureAutomationCredentials","stepDoc":"Ensure the automation user exists","isWaitStep":false,"started":"2025-01-06T13:35:35.257505082Z","completed":"2025-01-06T13:35:39.022009499Z","result":"success"}]},{"move":"AdjustUsers","moveDoc":"Adjust Users","steps":[{"step":"AdjustUsers","stepDoc":"Adjust Users","isWaitStep":false,"started":"2025-01-06T13:35:39.022147572Z","completed":"2025-01-06T13:35:39.182354797Z","result":"success"}]}]}],"errorCode":104,"errorString":"\u003cmongodb-0\u003e [14:51:42.024] Failed to find a plan!","waitDetails":{"WaitAllRsMembersUp":"[]","WaitAuthSchemaCorrect":"auth schema will be updated by the primary","WaitCanStartFresh":"process not up","WaitCannotBecomePrimary":"Wait until the process is reconfigured with priority=0 by a different process","WaitDefaultRWConcernCorrect":"waiting for the primary to update defaultRWConcern","WaitForResyncPrimaryManualInterventionStep":"A resync was requested on a primary. This requires manual intervention","WaitHealthyMajority":"[]","WaitMultipleHealthyNonArbiters":"[]","WaitNecessaryRsMembersUpForReconfig":"[]","WaitPrimary":"This process is expected to be the primary member. Check that the replica set state allows a primary to be elected","WaitProcessUp":"The process is running, but not yet responding to agent calls","WaitResetPlacementHistory":"config servers  haven't seen the marker"}}}
  • The verbose agent logs of the faulty members:
[2025-01-06T14:53:42.436+0000] [.warn] [src/director/director.go:computePlan:297] <mongodb-0> [14:53:42.436] ... No plan could be found - not in goal state because of:
[All the following are false:
    ['desiredState.ReplSetConf' != <nil> ('desiredState.ReplSetConf' = ReplSetConfig{id=mongodb,version=0,commitmentStatus=false,configsvr=false,protocolVersion=1,forceProtocolVersion=false,writeConcernMajorityJournalDefault=,members={id:0,HostPort:mongodb-0.service-mongodb.mongodb.svc.test.kubernetes.local.mixxt.net:27017,ArbiterOnly:falsePriority:1,Hidden:false,SecondaryDelaySecs:0,Votes:1,Tags:map[]},{id:1,HostPort:mongodb-1.service-mongodb.mongodb.svc.test.kubernetes.local.mixxt.net:27017,ArbiterOnly:falsePriority:1,Hidden:false,SecondaryDelaySecs:0,Votes:1,Tags:map[]},{id:2,HostPort:mongodb-2.service-mongodb.mongodb.svc.test.kubernetes.local.mixxt.net:27017,ArbiterOnly:falsePriority:1,Hidden:false,SecondaryDelaySecs:0,Votes:1,Tags:map[]},settings=map[]})]
    ['currentState.ReplSetConf.Id' != 'desiredState.ReplSetConf.Id' : (rs0) vs. (mongodb)]

Also seen previously:

(InvalidReplicaSetConfig) Rejecting initiate with a set name that differs from command line set name, initiate set name: mongodb, command line set name: rs0
