
Owls85582: Take all ALWAYS servers before considering the rest of the servers when meeting the cluster replicas requirement #2020

Merged: 6 commits merged on Nov 3, 2020

Changes from 1 commit
add all Always servers before consider IfNeeded servers
doxiao committed Oct 29, 2020
commit 599e5b5487e86a00bb2f4ecc06b3190700f6fb2e
@@ -118,23 +118,30 @@ public NextAction apply(Packet packet) {
private void addServersToFactory(@Nonnull ServersUpStepFactory factory, @Nonnull WlsDomainConfig wlsDomainConfig) {
Set<String> clusteredServers = new HashSet<>();

+ List<ServerConfig> pendingServers = new ArrayList<>();
wlsDomainConfig.getClusterConfigs().values()
- .forEach(wlsClusterConfig -> addClusteredServersToFactory(factory, clusteredServers, wlsClusterConfig));
+ .forEach(wlsClusterConfig -> addClusteredServersToFactory(
+ factory, clusteredServers, wlsClusterConfig, pendingServers));

wlsDomainConfig.getServerConfigs().values().stream()
.filter(wlsServerConfig -> !clusteredServers.contains(wlsServerConfig.getName()))
- .forEach(wlsServerConfig -> factory.addServerIfNeeded(wlsServerConfig, null));
+ .forEach(wlsServerConfig -> factory.addServerIfAlways(wlsServerConfig, null, pendingServers));
+
+ for (ServerConfig serverConfig : pendingServers) {
+ factory.addServerIfNeeded(serverConfig.wlsServerConfig, serverConfig.wlsClusterConfig);

Would this approach affect the guaranteed 'lexi-numeric' order in which we start or shut down servers? Or which servers are reported in status? (For example, when we're shutting down a cluster's servers one at a time, the goal is to shut down only the 'highest' server first, then the second highest, and so on.)

doxiao (Member Author):

The approach considers this. The pending list is in the original order. We have unit test cases covering this too.

doxiao (Member Author), Oct 30, 2020:

But the final list does not maintain the original order. For example, if server3 is ALWAYS and the replica count is 2, server1 and server3 will be started, and server3 will be started before server1.

doxiao (Member Author), Oct 30, 2020:

In the same example, if the cluster later scales down, server1 will be taken down, which is correct.

doxiao (Member Author):

The only behavior difference is the startup order among servers that need to be started in the same round of the make-right check; servers with the ALWAYS policy will be started before servers with the IF_NEEDED policy.
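For illustration, a minimal self-contained sketch of the two-pass selection described above; this is a simplified model, not the operator's code, and the class and method names (TwoPassSelectionSketch, serversToStart) are hypothetical:

import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class TwoPassSelectionSketch {

  // Pass 1 takes every ALWAYS server; pass 2 fills the remaining replica slots
  // from the pending IF_NEEDED servers, which are kept in their original order.
  static List<String> serversToStart(List<String> clusterServers,
                                     Map<String, String> startPolicy,
                                     int replicas) {
    Set<String> toStart = new LinkedHashSet<>();  // preserves selection order
    List<String> pending = new ArrayList<>();

    for (String server : clusterServers) {
      if ("ALWAYS".equals(startPolicy.getOrDefault(server, "IF_NEEDED"))) {
        toStart.add(server);
      } else {
        pending.add(server);
      }
    }

    for (String server : pending) {
      if (toStart.size() >= replicas) {
        break;
      }
      toStart.add(server);
    }
    return new ArrayList<>(toStart);
  }

  public static void main(String[] args) {
    // server3 is ALWAYS and the replica count is 2:
    // prints [server3, server1], i.e. server3 is selected before server1.
    System.out.println(serversToStart(
        List.of("server1", "server2", "server3"),
        Map.of("server3", "ALWAYS"),
        2));
  }
}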

We want the overall startup and shutdown order to be very intuitive, predictable, and the exact reverse of each other.

Question 1: I think it'd be better if server1 always starts first in a cluster, then server2, and so on, if they aren't configured to start concurrently, regardless of whether any of the servers are marked ALWAYS. So, based on your analysis of the cluster use case above, where server3 (marked ALWAYS) starts before server1 (marked IF_NEEDED), it sounds like this pull may need to be refined?

Question 2: As for shutdown, if the replica count is 3, and server1 and server3 are IF_NEEDED while server2 is ALWAYS, then reducing the replica count to 1 (or 0) should always shut down server3 first and then shut down server1 second. (Side note: when setting the entire cluster to NEVER, the entire cluster is expected to shut down concurrently, so no worries there.) Based on your analysis, it sounds like this will be honored?
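For reference, a minimal sketch of the shutdown ordering described in Question 2; it is a simplified model, not the operator's code, it assumes ALWAYS servers count toward the replica total, and the names ScaleDownOrderSketch and shutdownOrder are hypothetical:

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class ScaleDownOrderSketch {

  // Returns the order in which running servers should be shut down when the
  // cluster is scaled to newReplicas: ALWAYS servers are never removed, and the
  // surplus IF_NEEDED servers are stopped from the highest-named one downwards.
  static List<String> shutdownOrder(List<String> running,  // ascending name order
                                    Map<String, String> startPolicy,
                                    int newReplicas) {
    List<String> removable = new ArrayList<>();
    int kept = 0;
    for (String server : running) {
      if ("ALWAYS".equals(startPolicy.getOrDefault(server, "IF_NEEDED"))) {
        kept++;
      } else {
        removable.add(server);
      }
    }
    int toKeep = Math.max(0, newReplicas - kept);  // lowest-named IF_NEEDED servers survive
    List<String> order =
        new ArrayList<>(removable.subList(Math.min(toKeep, removable.size()), removable.size()));
    Collections.reverse(order);                    // highest-named shuts down first
    return order;
  }

  public static void main(String[] args) {
    // Replica count 3 -> 1 with server2 marked ALWAYS:
    // prints [server3, server1], i.e. server3 shuts down first, then server1.
    System.out.println(shutdownOrder(
        List.of("server1", "server2", "server3"),
        Map.of("server2", "ALWAYS"),
        1));
  }
}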

Member:

I understand why you started the ALWAYS servers first, but I agree that it's more important that servers be started (and stopped) in a predictable and consistent order. We've had a few customers ask about startup ordering, so I could see us doing something like that in a future release. I think that means we need to keep the order very, very simple now so that customers would understand what they are configuring later.

doxiao (Member Author), Oct 30, 2020:

"Yes" to both questions. Good catch on sequential startup order. I agree we need to resort the final list.

+ }
}

- private void addClusteredServersToFactory(@Nonnull ServersUpStepFactory factory, Set<String> clusteredServers,
- @Nonnull WlsClusterConfig wlsClusterConfig) {
+ private void addClusteredServersToFactory(
+ @Nonnull ServersUpStepFactory factory, Set<String> clusteredServers,
+ @Nonnull WlsClusterConfig wlsClusterConfig, List<ServerConfig> pendingServers) {
factory.logIfInvalidReplicaCount(wlsClusterConfig);
// We depend on 'getServerConfigs()' returning an ascending 'numero-lexi'
// sorted list so that a cluster's "lowest named" servers have precedence
// when the cluster's replica count is lower than the WL cluster size.
wlsClusterConfig.getServerConfigs()
.forEach(wlsServerConfig -> {
- factory.addServerIfNeeded(wlsServerConfig, wlsClusterConfig);
+ factory.addServerIfAlways(wlsServerConfig, wlsClusterConfig, pendingServers);
clusteredServers.add(wlsServerConfig.getName());
});
}
@@ -176,20 +183,15 @@ boolean shouldPrecreateServerService(ServerSpec server) {

private void addServerIfNeeded(@Nonnull WlsServerConfig serverConfig, WlsClusterConfig clusterConfig) {
String serverName = serverConfig.getName();
- if (servers.contains(serverName) || serverName.equals(domainTopology.getAdminServerName())) {
+ if (adminServerOrDone(serverName)) {
return;
}

- String clusterName = clusterConfig == null ? null : clusterConfig.getClusterName();
+ String clusterName = getClusterName(clusterConfig);
ServerSpec server = domain.getServer(serverName, clusterName);

if (server.shouldStart(getReplicaCount(clusterName))) {
- servers.add(serverName);
- if (shouldPrecreateServerService(server)) {
- preCreateServers.add(serverName);
- }
- addStartupInfo(new ServerStartupInfo(serverConfig, clusterName, server));
- addToCluster(clusterName);
+ addServerToStart(serverConfig, clusterName, server);
} else if (shouldPrecreateServerService(server)) {
preCreateServers.add(serverName);
addShutdownInfo(new ServerShutdownInfo(serverConfig, clusterName, server, true));
@@ -198,6 +200,15 @@ private void addServerIfNeeded(@Nonnull WlsServerConfig serverConfig, WlsCluster
}
}

private void addServerToStart(@Nonnull WlsServerConfig serverConfig, String clusterName, ServerSpec server) {
servers.add(serverConfig.getName());
if (shouldPrecreateServerService(server)) {
preCreateServers.add(serverConfig.getName());
}
addStartupInfo(new ServerStartupInfo(serverConfig, clusterName, server));
addToCluster(clusterName);
}

boolean exceedsMaxConfiguredClusterSize(WlsClusterConfig clusterConfig) {
if (clusterConfig != null) {
String clusterName = clusterConfig.getClusterName();
@@ -291,5 +302,41 @@ private void logIfInvalidReplicaCount(WlsClusterConfig clusterConfig) {
logIfReplicasExceedsClusterServersMax(clusterConfig);
logIfReplicasLessThanClusterServersMin(clusterConfig);
}

private void addServerIfAlways(
WlsServerConfig wlsServerConfig,
WlsClusterConfig wlsClusterConfig,
List<ServerConfig> pendingServers) {
String serverName = wlsServerConfig.getName();
if (adminServerOrDone(serverName)) {
return;
}
String clusterName = getClusterName(wlsClusterConfig);
ServerSpec server = domain.getServer(serverName, clusterName);
if (server.alwaysStart()) {
addServerToStart(wlsServerConfig, clusterName, server);
} else {
pendingServers.add(new ServerConfig(wlsClusterConfig, wlsServerConfig));
}
}

private boolean adminServerOrDone(String serverName) {
return servers.contains(serverName) || serverName.equals(domainTopology.getAdminServerName());
}

private static String getClusterName(WlsClusterConfig clusterConfig) {
return clusterConfig == null ? null : clusterConfig.getClusterName();
}

}

private static class ServerConfig {
protected WlsServerConfig wlsServerConfig;
protected WlsClusterConfig wlsClusterConfig;

ServerConfig(WlsClusterConfig cluster, WlsServerConfig server) {
this.wlsClusterConfig = cluster;
this.wlsServerConfig = server;
}
}
}
@@ -25,4 +25,12 @@ public boolean shouldStart(int currentReplicas) {
}
return super.shouldStart(currentReplicas);
}

@Override
public boolean alwaysStart() {
if (isStartAdminServerOnly()) {
return false;
}
return super.alwaysStart();
}
}
@@ -160,4 +160,6 @@ public interface ServerSpec {
String getClusterRestartVersion();

String getServerRestartVersion();

boolean alwaysStart();
}
@@ -142,6 +142,10 @@ private ServerStartPolicy getEffectiveServerStartPolicy() {
.orElse(ServerStartPolicy.getDefaultPolicy());
}

public boolean alwaysStart() {
return ServerStartPolicy.ALWAYS.equals(getEffectiveServerStartPolicy());
}

@Nonnull
@Override
public ProbeTuning getLivenessProbe() {
@@ -278,6 +278,136 @@ public void whenDomainShutDown_ignoreNonOperatorServices() {
assertThat(getRunningPods(), empty());
}

@Test
public void whenClusterReplicas2_server3WithAlwaysPolicy_establishMatchingPresence() {
domainConfigurator.configureCluster(CLUSTER).withReplicas(2);
domainConfigurator.configureServer(MS_PREFIX + 3).withServerStartPolicy("ALWAYS");

DomainPresenceInfo info = new DomainPresenceInfo(newDomain);
processor.createMakeRightOperation(info).execute();

assertServerPodAndServicePresent(info, ADMIN_NAME);
assertServerPodAndServicePresent(info, MS_PREFIX + 1);
assertServerPodAndServicePresent(info, MS_PREFIX + 3);
assertServerPodNotPresent(info, MS_PREFIX + 2);

assertThat(info.getClusterService(CLUSTER), notNullValue());
}

@Test
public void whenClusterScaleUpToReplicas3_fromReplicas2_server3WithAlwaysPolicy_establishMatchingPresence()
throws JsonProcessingException {
establishPreviousIntrospection(null, Arrays.asList(1, 3));
domainConfigurator.configureCluster(CLUSTER).withReplicas(3);
domainConfigurator.configureServer(MS_PREFIX + 3).withServerStartPolicy("ALWAYS");

DomainPresenceInfo info = new DomainPresenceInfo(newDomain);
processor.createMakeRightOperation(info).execute();

assertServerPodAndServicePresent(info, ADMIN_NAME);
for (Integer i : Arrays.asList(1,2,3)) {
assertServerPodAndServicePresent(info, MS_PREFIX + i);
}
assertThat(info.getClusterService(CLUSTER), notNullValue());

}

@Test
public void whenClusterReplicas3_server3And4WithAlwaysPolicy_establishMatchingPresence() {
domainConfigurator.configureCluster(CLUSTER).withReplicas(2);
domainConfigurator.configureServer(MS_PREFIX + 1).withServerStartPolicy("ALWAYS");
domainConfigurator.configureServer(MS_PREFIX + 3).withServerStartPolicy("ALWAYS");
domainConfigurator.configureServer(MS_PREFIX + 4).withServerStartPolicy("ALWAYS");
DomainPresenceInfo info = new DomainPresenceInfo(newDomain);
processor.createMakeRightOperation(info).execute();

assertServerPodAndServicePresent(info, ADMIN_NAME);
for (Integer i : Arrays.asList(1,3,4)) {
assertServerPodAndServicePresent(info, MS_PREFIX + i);
}
assertServerPodNotPresent(info, MS_PREFIX + 2);

assertThat(info.getClusterService(CLUSTER), notNullValue());
}

@Test
public void whenClusterScaleUpToReplicas4_fromReplicas2_server3And4WithAlwaysPolicy_establishMatchingPresence()
throws JsonProcessingException {
establishPreviousIntrospection(null, Arrays.asList(1, 3, 4));

domainConfigurator.configureServer(MS_PREFIX + 1).withServerStartPolicy("ALWAYS");
domainConfigurator.configureServer(MS_PREFIX + 3).withServerStartPolicy("ALWAYS");
domainConfigurator.configureServer(MS_PREFIX + 4).withServerStartPolicy("ALWAYS");
domainConfigurator.configureCluster(CLUSTER).withReplicas(4);

DomainPresenceInfo info = new DomainPresenceInfo(newDomain);

processor.createMakeRightOperation(info).execute();

assertServerPodAndServicePresent(info, ADMIN_NAME);
for (Integer i : Arrays.asList(1,2,3,4)) {
assertServerPodAndServicePresent(info, MS_PREFIX + i);
}

assertThat(info.getClusterService(CLUSTER), notNullValue());
}

@Test
public void whenClusterScaleDownToReplicas1_fromReplicas2_server3WithAlwaysPolicy_establishMatchingPresence()
throws JsonProcessingException {
establishPreviousIntrospection(null, Arrays.asList(1,3));

domainConfigurator.configureCluster(CLUSTER).withReplicas(1);
domainConfigurator.configureServer(MS_PREFIX + 3).withServerStartPolicy("ALWAYS");

DomainPresenceInfo info = new DomainPresenceInfo(newDomain);
processor.createMakeRightOperation(info).execute();

logRecords.clear();
assertServerPodAndServicePresent(info, ADMIN_NAME);
assertServerPodNotPresent(info, MS_PREFIX + 1);
assertServerPodNotPresent(info, MS_PREFIX + 2);
assertServerPodAndServicePresent(info, MS_PREFIX + 3);

assertThat(info.getClusterService(CLUSTER), notNullValue());
}

@Test
public void whenClusterReplicas2_server1And3And4WithAlwaysPolicy_establishMatchingPresence() {
domainConfigurator.configureCluster(CLUSTER).withReplicas(2);
domainConfigurator.configureServer(MS_PREFIX + 1).withServerStartPolicy("ALWAYS");
domainConfigurator.configureServer(MS_PREFIX + 2).withServerStartPolicy("ALWAYS");
domainConfigurator.configureServer(MS_PREFIX + 3).withServerStartPolicy("ALWAYS");
DomainPresenceInfo info = new DomainPresenceInfo(newDomain);
processor.createMakeRightOperation(info).execute();

assertServerPodAndServicePresent(info, ADMIN_NAME);
for (Integer i : Arrays.asList(1,2,3)) {
assertServerPodAndServicePresent(info, MS_PREFIX + i);
}

assertThat(info.getClusterService(CLUSTER), notNullValue());
}

@Test
public void whenClusterScaleDownToReplicas1_fromReplicas2_server1And2And3WithAlwaysPolicy_establishMatchingPresence()
throws JsonProcessingException {
establishPreviousIntrospection(null, Arrays.asList(1, 2, 3));

// now scale down the cluster
domainConfigurator.configureCluster(CLUSTER).withReplicas(1);
domainConfigurator.configureServer(MS_PREFIX + 1).withServerStartPolicy("ALWAYS");
domainConfigurator.configureServer(MS_PREFIX + 2).withServerStartPolicy("ALWAYS");
domainConfigurator.configureServer(MS_PREFIX + 3).withServerStartPolicy("ALWAYS");
DomainPresenceInfo info = new DomainPresenceInfo(newDomain);
processor.createMakeRightOperation(info).execute();
logRecords.clear();
assertServerPodAndServicePresent(info, ADMIN_NAME);
assertServerPodAndServicePresent(info, MS_PREFIX + 1);
assertServerPodAndServicePresent(info, MS_PREFIX + 2);
assertServerPodAndServicePresent(info, MS_PREFIX + 3);
}

private V1Service createNonOperatorService() {
return new V1Service()
.metadata(
@@ -383,14 +513,20 @@ public void whenIntrospectionJobNotComplete_waitForIt() throws Exception {
}

private void establishPreviousIntrospection(Consumer<Domain> domainSetup) throws JsonProcessingException {
establishPreviousIntrospection(domainSetup, Arrays.asList(1,2));
}

private void establishPreviousIntrospection(Consumer<Domain> domainSetup, List<Integer> msNumbers)
throws JsonProcessingException {
if (domainSetup != null) {
domainSetup.accept(domain);
domainSetup.accept(newDomain);
}
domainConfigurator.configureCluster(CLUSTER).withReplicas(MIN_REPLICAS);
defineServerResources(ADMIN_NAME);
- defineServerResources(getManagedServerName(1));
- defineServerResources(getManagedServerName(2));
+ for (Integer i : msNumbers) {
+ defineServerResources(getManagedServerName(i));
+ }
DomainProcessorImpl.registerDomainPresenceInfo(new DomainPresenceInfo(domain));
testSupport.defineResources(createIntrospectorConfigMap(OLD_INTROSPECTION_STATE));
testSupport.doOnCreate(KubernetesTestSupport.JOB, j -> recordJob((V1Job) j));
@@ -775,6 +911,14 @@ private void assertServerPodAndServicePresent(DomainPresenceInfo info, String se
assertThat(serverName + " pod", info.getServerPod(serverName), notNullValue());
}

private void assertServerPodNotPresent(DomainPresenceInfo info, String serverName) {
assertThat(serverName + " pod", isServerInactive(info, serverName), is(Boolean.TRUE));
}

private boolean isServerInactive(DomainPresenceInfo info, String serverName) {
return info.isServerPodBeingDeleted(serverName) || info.getServerPod(serverName) == null;
}

@Test
public void whenDomainIsNotValid_dontBringUpServers() {
defineDuplicateServerNames();