Commit a1d2c80

doxiao and rjeberhard authored
OWLS-88569 add events for rolling restart (#2364)
* add scopes to compatibility check logic, and add three roll events and log messages

Co-authored-by: Ryan Eberhard <ryan.eberhard@oracle.com>
1 parent 9f237cd commit a1d2c80

24 files changed, +1727 -58 lines changed

documentation/staging/content/userguide/managing-domains/domain-events.md

Lines changed: 109 additions & 1 deletion
@@ -27,9 +27,22 @@ The operator generates these event types in a domain namespace, which indicate t
 * `DomainDeleted`: An existing domain has been deleted.
 * `DomainProcessingStarting`: The operator has started to process a new domain or to update an existing domain. This event may be a result of a `DomainCreate`, `DomainChanged`, or `DomainDeleted` event, or a result of a retry after a failed attempt.
 * `DomainProcessingFailed`: The operator has encountered a problem while it was processing the domain resource. The failure either could be a configuration error or a Kubernetes API error.
-* `DomainProcessingRetrying`: The operator is going to retry the processing of a domain after it encountered an failure.
+* `DomainProcessingRetrying`: The operator is going to retry the processing of a domain after it encountered a failure.
 * `DomainProcessingCompleted`: The operator successfully completed the processing of a domain resource.
 * `DomainProcessingAborted`: The operator stopped processing a domain when the operator encountered a fatal error or a failure that persisted after the specified maximum number of retries.
+* `DomainRollStarting`: The operator has detected domain resource or Model in Image model
+updates that require it to perform a rolling restart of the domain.
+If the domain roll is due to a change to domain resource fields
+`image`, `imagePullPolicy`, `livenessProbe`, `readinessProbe`, `restartVersion`,
+`domainHome`, `includeServerOutInPodLog`, or `logHome`, then
+the event message reports the field name plus its old and new values.
+If the domain roll is due to other domain resource changes that cause servers to be restarted
+(see [full list of fields that cause servers to be restarted]({{< relref "/userguide/managing-domains/domain-lifecycle/startup#fields-that-cause-servers-to-be-restarted" >}})),
+then the event message simply reports that the domain resource has changed.
+If the domain roll is due to a Model in Image model update,
+then the event message reports there has been a change in the WebLogic domain configuration without the details.
+* `DomainRollCompleted`: The operator has successfully completed a rolling restart of a domain.
+* `PodCycleStarting`: The operator has started to replace a server pod after it detects that the current pod does not conform to the current domain resource or WebLogic domain configuration.
 * `DomainValidationError`: A validation error or warning is found in a domain resource. Please refer to the event message for details.
 * `NamespaceWatchingStarted`: The operator has started watching for domains in a namespace.
 * `NamespaceWatchingStopped`: The operator has stopped watching for domains in a namespace. Note that the creation of this event in a domain namespace is the operator's best effort only; the event will not be generated if the required Kubernetes privilege is removed when a namespace is no longer managed by the operator.
@@ -285,3 +298,98 @@ Source:
 Type: Normal
 Events: <none>
 ```
+
+Example of the sequence of operator-generated events in a domain rolling restart after the domain resource's `image` and `logHomeEnabled` changed, which is the output of the command `kubectl get events -n sample-domain1-ns --selector=weblogic.domainUID=sample-domain1,weblogic.createdByOperator=true --sort-by=lastTimestamp`.
+
+```
+LAST SEEN   TYPE     REASON                      OBJECT                  MESSAGE
+2m58s       Normal   DomainChanged               domain/sample-domain1   Domain resource sample-domain1 was changed
+2m58s       Normal   DomainProcessingStarting    domain/sample-domain1   Creating or updating Kubernetes presence for WebLogic Domain with UID sample-domain1
+2m58s       Normal   DomainRollStarting          domain/sample-domain1   Rolling restart WebLogic server pods in domain sample-domain1 because: 'image' changed from 'oracle/weblogic' to 'oracle/weblogic:14.1.1.0',
+  'logHome' changed from 'null' to '/shared/logs/sample-domain1'
+2m58s       Normal   PodCycleStarting            domain/sample-domain1   Replacing pod sample-domain1-adminserver because: In container 'weblogic-server':
+  'image' changed from 'oracle/weblogic' to 'oracle/weblogic:14.1.1.0',
+  env 'LOG_HOME' changed from 'null' to '/shared/logs/sample-domain1'
+2m7s        Normal   PodCycleStarting            domain/sample-domain1   Replacing pod sample-domain1-managed-server1 because: In container 'weblogic-server':
+  'image' changed from 'oracle/weblogic' to 'oracle/weblogic:14.1.1.0',
+  env 'LOG_HOME' changed from 'null' to '/shared/logs/sample-domain1'
+71s         Normal   PodCycleStarting            domain/sample-domain1   Replacing pod sample-domain1-managed-server2 because: In container 'weblogic-server':
+  'image' changed from 'oracle/weblogic' to 'oracle/weblogic:14.1.1.0',
+  env 'LOG_HOME' changed from 'null' to '/shared/logs/sample-domain1'
+19s         Normal   DomainRollCompleted         domain/sample-domain1   Rolling restart of domain sample-domain1 completed
+19s         Normal   DomainProcessingCompleted   domain/sample-domain1   Successfully completed processing domain resource sample-domain1
+
+```
+
+Example of a `DomainRollStarting` event:
+
+```
+Name:             sample-domain1.DomainRollStarting.7d33e9b787e9c318
+Namespace:        sample-domain1-ns
+Labels:           weblogic.createdByOperator=true
+                  weblogic.domainUID=sample-domain1
+Annotations:      <none>
+API Version:      v1
+Count:            1
+Event Time:       <nil>
+First Timestamp:  2021-05-18T02:00:24Z
+Involved Object:
+  API Version:   weblogic.oracle/v8
+  Kind:          Domain
+  Name:          sample-domain1
+  Namespace:     sample-domain1-ns
+  UID:           5df7dcda-d606-4509-9a06-32f25e16e166
+Kind:            Event
+Last Timestamp:  2021-05-18T02:00:24Z
+Message:         Rolling restart WebLogic server pods in domain sample-domain1 because: 'image' changed from 'oracle/weblogic' to 'oracle/weblogic:14.1.1.0',
+  'logHome' changed from 'null' to '/shared/logs/sample-domain1'
+Metadata:
+  Creation Timestamp:  2021-05-18T02:00:24Z
+  Resource Version:    12842363
+  Self Link:           /api/v1/namespaces/sample-domain1-ns/events/sample-domain1.DomainRollStarting.7d33e9b787e9c318
+  UID:                 6ec92655-9d06-43b1-8b26-c01ebccadecf
+Reason:               DomainRollStarting
+Reporting Component:  weblogic.operator
+Reporting Instance:   weblogic-operator-fc4ccc8b5-rh4v6
+Source:
+Type:    Normal
+Events:  <none>
+
+```
+
+Example of a `PodCycleStarting` event:
+
+```
+Name:             sample-domain1.PodCycleStarting.7d34bc3232231f49
+Namespace:        sample-domain1-ns
+Labels:           weblogic.createdByOperator=true
+                  weblogic.domainUID=sample-domain1
+Annotations:      <none>
+API Version:      v1
+Count:            1
+Event Time:       <nil>
+First Timestamp:  2021-05-18T02:01:18Z
+Involved Object:
+  API Version:   weblogic.oracle/v8
+  Kind:          Domain
+  Name:          sample-domain1
+  Namespace:     sample-domain1-ns
+  UID:           5df7dcda-d606-4509-9a06-32f25e16e166
+Kind:            Event
+Last Timestamp:  2021-05-18T02:01:18Z
+Message:         Replacing pod sample-domain1-managed-server1 because: In container 'weblogic-server':
+  'image' changed from 'oracle/weblogic' to 'oracle/weblogic:14.1.1.0',
+  env 'LOG_HOME' changed from 'null' to '/shared/logs/sample-domain1'
+Metadata:
+  Creation Timestamp:  2021-05-18T02:01:18Z
+  Resource Version:    12842530
+  Self Link:           /api/v1/namespaces/sample-domain1-ns/events/sample-domain1.PodCycleStarting.7d34bc3232231f49
+  UID:                 4c6a203e-9b93-4b46-b9e3-1a448b52c7ca
+Reason:               PodCycleStarting
+Reporting Component:  weblogic.operator
+Reporting Instance:   weblogic-operator-fc4ccc8b5-rh4v6
+Source:
+Type:    Normal
+Events:  <none>
+
+```
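The `DomainRollStarting` description in the documentation change above distinguishes three message shapes: a detailed "field changed from old to new" reason for a fixed list of fields, a generic reason for other restart-causing fields, and a generic reason for Model in Image updates. The following is a minimal sketch of that selection logic, not the operator's actual implementation. The class `RollReasonSketch`, the method `describeFieldChange`, and the field name `someOtherField` are hypothetical; the two generic reason strings are copied from `EventConstants` in this commit.

```java
import java.util.Set;

public class RollReasonSketch {
  // Fields the documentation says are reported with their old and new values.
  static final Set<String> DETAILED_FIELDS = Set.of(
      "image", "imagePullPolicy", "livenessProbe", "readinessProbe",
      "restartVersion", "domainHome", "includeServerOutInPodLog", "logHome");

  // Generic reasons, copied from EventConstants in this commit.
  static final String ROLL_REASON_DOMAIN_RESOURCE_CHANGED = "domain resource changed";
  static final String ROLL_REASON_WEBLOGIC_CONFIGURATION_CHANGED
      = "WebLogic domain configuration changed due to a Model in Image model update";

  // Hypothetical helper: picks the reason text for one changed domain resource field.
  static String describeFieldChange(String field, String oldValue, String newValue) {
    if (DETAILED_FIELDS.contains(field)) {
      return String.format("'%s' changed from '%s' to '%s'", field, oldValue, newValue);
    }
    return ROLL_REASON_DOMAIN_RESOURCE_CHANGED;
  }

  public static void main(String[] args) {
    // Detailed reason, matching the wording in the sample event output.
    System.out.println(describeFieldChange("image", "oracle/weblogic", "oracle/weblogic:14.1.1.0"));
    // Any other restart-causing field yields only the generic reason.
    System.out.println(describeFieldChange("someOtherField", "a", "b"));
    // A Model in Image update is reported without details.
    System.out.println(ROLL_REASON_WEBLOGIC_CONFIGURATION_CHANGED);
  }
}
```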

operator/src/main/java/oracle/kubernetes/operator/EventConstants.java

Lines changed: 9 additions & 0 deletions
@@ -14,7 +14,10 @@ public interface EventConstants {
   String DOMAIN_PROCESSING_FAILED_EVENT = "DomainProcessingFailed";
   String DOMAIN_PROCESSING_RETRYING_EVENT = "DomainProcessingRetrying";
   String DOMAIN_PROCESSING_ABORTED_EVENT = "DomainProcessingAborted";
+  String DOMAIN_ROLL_COMPLETED_EVENT = "DomainRollCompleted";
+  String DOMAIN_ROLL_STARTING_EVENT = "DomainRollStarting";
   String DOMAIN_VALIDATION_ERROR_EVENT = "DomainValidationError";
+  String POD_CYCLE_STARTING_EVENT = "PodCycleStarting";
   String EVENT_NORMAL = "Normal";
   String EVENT_WARNING = "Warning";
   String WEBLOGIC_OPERATOR_COMPONENT = "weblogic.operator";
@@ -34,6 +37,7 @@ public interface EventConstants {
       = "Aborting the processing of domain resource %s permanently due to: %s";
   String DOMAIN_VALIDATION_ERROR_PATTERN
       = "Validation error in domain resource %s: %s";
+  String POD_CYCLE_STARTING_PATTERN = "Replacing pod %s because: %s";
   String NAMESPACE_WATCHING_STARTED_EVENT = "NamespaceWatchingStarted";
   String NAMESPACE_WATCHING_STARTED_PATTERN = "Started watching namespace %s";
   String NAMESPACE_WATCHING_STOPPED_EVENT = "NamespaceWatchingStopped";
@@ -47,4 +51,9 @@ public interface EventConstants {
   String STOP_MANAGING_NAMESPACE_PATTERN = "Stop managing namespace %s";
   String START_MANAGING_NAMESPACE_FAILED_EVENT = "StartManagingNamespaceFailed";
   String START_MANAGING_NAMESPACE_FAILED_PATTERN = "Start managing namespace %s failed due to an authorization error";
+  String DOMAIN_ROLL_STARTING_PATTERN = "Rolling restart WebLogic server pods in domain %s because: %s";
+  String DOMAIN_ROLL_COMPLETED_PATTERN = "Rolling restart of domain %s completed";
+  String ROLL_REASON_DOMAIN_RESOURCE_CHANGED = "domain resource changed";
+  String ROLL_REASON_WEBLOGIC_CONFIGURATION_CHANGED
+      = "WebLogic domain configuration changed due to a Model in Image model update";
 }
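The new `*_PATTERN` constants are `String.format` templates, and the sample event messages in the documentation change match what they produce. A minimal sketch of that relationship follows, assuming plain `String.format` is used to fill them in. The class `EventMessageSketch` and its helper methods are hypothetical, and the pattern strings are copied from the diff above.

```java
public class EventMessageSketch {
  // Patterns and reason text copied from the EventConstants diff.
  static final String DOMAIN_ROLL_STARTING_PATTERN
      = "Rolling restart WebLogic server pods in domain %s because: %s";
  static final String DOMAIN_ROLL_COMPLETED_PATTERN = "Rolling restart of domain %s completed";
  static final String POD_CYCLE_STARTING_PATTERN = "Replacing pod %s because: %s";
  static final String ROLL_REASON_DOMAIN_RESOURCE_CHANGED = "domain resource changed";

  // Hypothetical helpers that fill in each pattern.
  static String rollStarting(String domainUid, String reason) {
    return String.format(DOMAIN_ROLL_STARTING_PATTERN, domainUid, reason);
  }

  static String rollCompleted(String domainUid) {
    return String.format(DOMAIN_ROLL_COMPLETED_PATTERN, domainUid);
  }

  static String podCycleStarting(String podName, String reason) {
    return String.format(POD_CYCLE_STARTING_PATTERN, podName, reason);
  }

  public static void main(String[] args) {
    System.out.println(rollStarting("sample-domain1", ROLL_REASON_DOMAIN_RESOURCE_CHANGED));
    System.out.println(podCycleStarting("sample-domain1-adminserver",
        "In container 'weblogic-server':\n  'image' changed from 'oracle/weblogic' to 'oracle/weblogic:14.1.1.0'"));
    System.out.println(rollCompleted("sample-domain1"));
  }
}
```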

operator/src/main/java/oracle/kubernetes/operator/ProcessingConstants.java

Lines changed: 1 addition & 0 deletions
@@ -35,6 +35,7 @@ public interface ProcessingConstants {
   String MII_DYNAMIC_UPDATE_WDTROLLBACKFILE = "miiDynamicUpdateRollbackFile";
   String MII_DYNAMIC_UPDATE_SUCCESS = "0";
   String MII_DYNAMIC_UPDATE_RESTART_REQUIRED = "103";
+  String DOMAIN_ROLL_START_EVENT_GENERATED = "domainRollStartEventGenerated";
 
   String DOMAIN_VALIDATION_ERRORS = "domainValidationErrors";
   String INTROSPECTOR_JOB_FAILURE_LOGGED = "introspectorJobFailureLogged";

operator/src/main/java/oracle/kubernetes/operator/helpers/CollectiveCompatibility.java

Lines changed: 22 additions & 1 deletion
@@ -41,9 +41,30 @@ public String getIncompatibility() {
         reasons.add(getIndent() + check.getIncompatibility());
       }
     }
-    return reasons.isEmpty() ? null : getHeader() + String.join("\n", reasons);
+    return reasons.isEmpty() ? null : getHeader() + String.join(",\n", reasons);
   }
 
+  @Override
+  public String getScopedIncompatibility(CompatibilityScope scope) {
+    final List<String> reasons = new ArrayList<>();
+    for (CompatibilityCheck check : checks) {
+      if (!check.isCompatible() && check.getScopedIncompatibility(scope) != null) {
+        reasons.add(getIndent() + check.getScopedIncompatibility(scope));
+      }
+    }
+    return reasons.isEmpty()
+        ? null
+        : scope == CompatibilityScope.DOMAIN
+            ? String.join(",\n", reasons)
+            : getHeader() + String.join(",\n", reasons);
+  }
+
+  @Override
+  public CompatibilityScope getScope() {
+    return CompatibilityScope.MINIMUM;
+  }
+
+
   <T> void addSets(String description, List<T> expected, List<T> actual) {
     add(CheckFactory.create(description, expected, actual));
   }
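The nested ternary in `getScopedIncompatibility` has three outcomes: `null` when nothing is incompatible, the joined reasons alone for `DOMAIN` scope, and the header plus joined reasons otherwise. The sketch below isolates just that formatting rule so the three cases can be checked directly. The class `ScopedJoinSketch`, the two-value `Scope` enum, and the sample header and reason strings are illustrative assumptions, not code from the commit.

```java
import java.util.List;

public class ScopedJoinSketch {
  // Reduced stand-in for CompatibilityScope; only the DOMAIN/non-DOMAIN split matters here.
  enum Scope { DOMAIN, POD }

  // Mirrors the ternary in the diff: no reasons -> null,
  // DOMAIN scope -> reasons only, any other scope -> header + reasons.
  static String join(String header, List<String> reasons, Scope scope) {
    if (reasons.isEmpty()) {
      return null;
    }
    String joined = String.join(",\n", reasons);
    return scope == Scope.DOMAIN ? joined : header + joined;
  }

  public static void main(String[] args) {
    // Hypothetical header and reasons, shaped like the sample event messages.
    String header = "In container 'weblogic-server':\n";
    List<String> reasons = List.of(
        "  'image' changed from 'a' to 'b'",
        "  env 'LOG_HOME' changed from 'null' to '/logs'");

    System.out.println(join(header, reasons, Scope.POD));
    System.out.println("----");
    System.out.println(join(header, reasons, Scope.DOMAIN));
  }
}
```

Dropping the header for `DOMAIN` scope is consistent with the sample output, where the `DomainRollStarting` message lists changes directly while each `PodCycleStarting` message prefixes them with the container name.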

operator/src/main/java/oracle/kubernetes/operator/helpers/CompatibilityCheck.java

Lines changed: 64 additions & 0 deletions
@@ -4,10 +4,74 @@
 package oracle.kubernetes.operator.helpers;
 
 interface CompatibilityCheck {
+  enum CompatibilityScope {
+    DOMAIN {
+      @Override
+      public boolean contains(CompatibilityScope scope) {
+        switch (scope) {
+          case DOMAIN:
+          case MINIMUM:
+            return true;
+          case POD:
+          case UNKNOWN:
+          default:
+            return false;
+        }
+      }
+    },
+    POD {
+      @Override
+      public boolean contains(CompatibilityScope scope) {
+        switch (scope) {
+          case DOMAIN:
+          case POD:
+          case UNKNOWN:
+          case MINIMUM:
+            return true;
+          default:
+            return false;
+
+        }
+      }
+    },
+    UNKNOWN {
+      @Override
+      public boolean contains(CompatibilityScope scope) {
+        switch (scope) {
+          case DOMAIN:
+          case POD:
+          default:
+            return false;
+          case UNKNOWN:
+          case MINIMUM:
+            return true;
+        }
+      }
+    },
+    MINIMUM {
+      @Override
+      public boolean contains(CompatibilityScope scope) {
+        switch (scope) {
+          case DOMAIN:
+          case POD:
+          case UNKNOWN:
+          default:
+            return false;
+        }
+      }
+    };
+
+    public abstract boolean contains(CompatibilityScope scope);
+  }
+
   boolean isCompatible();
 
   String getIncompatibility();
 
+  String getScopedIncompatibility(CompatibilityScope scope);
+
+  CompatibilityScope getScope();
+
   default CompatibilityCheck ignoring(String... keys) {
     return this;
   }
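The four `contains` switch statements above are easier to compare as a single containment table. The sketch below restates the same results in one method and prints the full matrix. `ScopeMatrixSketch` and its nested `Scope` enum are a hypothetical condensed copy for illustration, not the operator's code. Note that, as written in the diff, `MINIMUM.contains(MINIMUM)` falls through to `default` and returns `false`.

```java
public class ScopeMatrixSketch {
  // Condensed restatement of CompatibilityScope.contains from the diff.
  enum Scope {
    DOMAIN, POD, UNKNOWN, MINIMUM;

    boolean contains(Scope other) {
      switch (this) {
        case DOMAIN:
          return other == DOMAIN || other == MINIMUM;
        case POD:
          return true; // POD lists all four scopes as contained
        case UNKNOWN:
          return other == UNKNOWN || other == MINIMUM;
        default:
          return false; // MINIMUM contains nothing, not even itself
      }
    }
  }

  public static void main(String[] args) {
    // Print one row per outer scope, one column per candidate scope.
    for (Scope outer : Scope.values()) {
      StringBuilder row = new StringBuilder(outer + " contains:");
      for (Scope inner : Scope.values()) {
        if (outer.contains(inner)) {
          row.append(' ').append(inner);
        }
      }
      System.out.println(row);
    }
  }
}
```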
