This repository was archived by the owner on Jan 18, 2023. It is now read-only.

Commit d4fbc7c

Repatch extended resource cmk.intel.com/exclusive-cores after kubelet restart.
Since Kubernetes 1.10, the kubelet sets extended resource capacity to zero after it restarts. To repatch the extended resource cmk.intel.com/exclusive-cores, a "rediscover" container, which reruns `discover`, is added to the reconcile-nodereport DaemonSet.
Signed-off-by: Liliia Butorina <l.butorina@partner.samsung.com>
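The capacity repatch described in the commit message matches the pattern Kubernetes documents for advertising an extended resource: a JSON Patch against the node's `status.capacity`, with the `/` in the resource name escaped as `~1`. Below is a minimal sketch of building such a patch body. The helper names and the core count of 4 are hypothetical, and the actual CMK `discover` code may build its request differently.

```python
import json


def escape_json_pointer(token):
    """Escape a JSON Pointer reference token (RFC 6901): '~' -> '~0', '/' -> '~1'."""
    return token.replace("~", "~0").replace("/", "~1")


def build_capacity_patch(resource_name, count):
    """Build a JSON Patch that sets an extended resource under status.capacity.

    Hypothetical helper for illustration; not taken from the CMK source.
    """
    path = "/status/capacity/" + escape_json_pointer(resource_name)
    return [{"op": "add", "path": path, "value": str(count)}]


# Assumed example value: 4 exclusive cores.
patch = build_capacity_patch("cmk.intel.com/exclusive-cores", 4)
print(json.dumps(patch))
```

In the documented Kubernetes procedure, a body like this is sent as an HTTP PATCH to the node's `status` subresource with the `application/json-patch+json` content type. Because the kubelet zeroes the capacity on restart, the patch has to be reapplied, which is what the long-running `rediscover` container is for.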
1 parent cc50f8f commit d4fbc7c

8 files changed, with 23 additions and 15 deletions.

cmk.py

Lines changed: 1 addition & 1 deletion
@@ -47,7 +47,7 @@
                          software.
   --cmk-cmd-list=<list>  Comma seperated list of CMK sub-commands to run
                          on each host
-                         [default: init,reconcile,install,discover,nodereport].
+                         [default: init,install,discover,rediscover,reconcile,nodereport].
   --cmk-img=<img>        CMK Docker image [default: cmk:v1.3.1].
   --cmk-img-pol=<pol>    Image pull policy for the CMK Docker image
                          [default: IfNotPresent].
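To illustrate how the new default interacts with the validation changed in `intel/clusterinit.py`, here is a minimal sketch that splits a `--cmk-cmd-list` value and checks each entry. The valid list and the error message are copied from this commit; the function name `parse_cmd_list` and the way the string is split are assumptions, since the parsing code itself is not part of the diff.

```python
# Valid sub-commands as listed in intel/clusterinit.py after this commit.
VALID_CMD_LIST = ["init", "discover", "install", "rediscover",
                  "reconcile", "nodereport"]


def parse_cmd_list(raw):
    """Split a comma-separated option value and validate every entry.

    Hypothetical helper; CMK's real option handling may differ.
    """
    cmds = [cmd.strip() for cmd in raw.split(",") if cmd.strip()]
    for cmd in cmds:
        if cmd not in VALID_CMD_LIST:
            raise RuntimeError("CMK command should be one of {}"
                               .format(VALID_CMD_LIST))
    return cmds


default = "init,install,discover,rediscover,reconcile,nodereport"
print(parse_cmd_list(default))
```

A value that was valid before the commit, such as `init,reconcile,install,discover,nodereport`, still passes this check; only the default gains `rediscover`.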

docs/cli.md

Lines changed: 1 addition & 1 deletion
@@ -1030,7 +1030,7 @@ $ docker run -it --volume=/etc/cmk:/etc/cmk:rw \
 ### `cmk uninstall`
 
 Removes `cmk` from a node. Uninstall process reverts `cmk cluster-init`:
-- deletes `cmk-reconcile-nodereport-pod-{node}` if present
+- deletes `cmk-rediscover-reconcile-nodereport-pod-{node}` if present
 - removes `NodeReport` from Kubernetes ThirdPartyResources if present
 - removes `ReconcileReport` from Kubernetes ThirdPartyResources if present
 - removes cmk node label if present

docs/html/docs/cli.html

Lines changed: 1 addition & 1 deletion
@@ -1027,7 +1027,7 @@ <h3>
 </h3>
 <p>Removes <code>cmk</code> from a node. Uninstall process reverts <code>cmk cluster-init</code>:</p>
 <ul>
-<li>deletes <code>cmk-reconcile-nodereport-pod-{node}</code> if present</li>
+<li>deletes <code>cmk-rediscover-reconcile-nodereport-pod-{node}</code> if present</li>
 <li>removes <code>NodeReport</code> from Kubernetes ThirdPartyResources if present</li>
 <li>removes <code>ReconcileReport</code> from Kubernetes ThirdPartyResources if present</li>
 <li>removes cmk node label if present</li>

docs/html/docs/operator.html

Lines changed: 1 addition & 1 deletion
@@ -617,7 +617,7 @@ <h2>
 the recommended way to start troubleshooting is to look at the logs using <code>kubectl logs POD_NAME [CONTAINER_NAME] -f</code>.</p>
 <p>For example, assuming you ran the <a href="../resources/pods/cmk-cluster-init-pod.yaml">cmk-cluster-init-pod template</a> with default options, it
 should create two pods on each node named <code>cmk-init-install-discover-pod-&lt;node-name&gt;</code> and
-<code>cmk-reconcile-nodereport-&lt;node-name&gt;</code>, where <code>&lt;node-name&gt;</code> should be replaced with the name of the node.</p>
+<code>cmk-rediscover-reconcile-nodereport-&lt;node-name&gt;</code>, where <code>&lt;node-name&gt;</code> should be replaced with the name of the node.</p>
 <p>If you want to look at the logs from the container which ran the <code>discover</code> subcommand in the pod, you can use
 <code>kubectl logs -f cmk-init-install-discover-pod-&lt;node-name&gt; discover</code></p>
 <p>If you want to look at the logs from the container which ran the <code>reconcile</code> subcommand in the pod, you can use

docs/operator.md

Lines changed: 2 additions & 2 deletions
@@ -567,13 +567,13 @@ the recommended way to start troubleshooting is to look at the logs using `kubec
 
 For example, assuming you ran the [cmk-cluster-init-pod template][cluster-init-template] with default options, it
 should create two pods on each node named `cmk-init-install-discover-pod-<node-name>` and
-`cmk-reconcile-nodereport-<node-name>`, where `<node-name>` should be replaced with the name of the node.
+`cmk-rediscover-reconcile-nodereport-<node-name>`, where `<node-name>` should be replaced with the name of the node.
 
 If you want to look at the logs from the container which ran the `discover` subcommand in the pod, you can use
 `kubectl logs -f cmk-init-install-discover-pod-<node-name> discover`
 
 If you want to look at the logs from the container which ran the `reconcile` subcommand in the pod, you can use
-`kubectl logs -f cmk-reconcile-nodereport-pod-<node-name> reconcile`
+`kubectl logs -f cmk-rediscover-reconcile-nodereport-pod-<node-name> reconcile`
 
 If you want to remove `cmk` use `cmk-uninstall-pod.yaml`. [nodeSelector](https://kubernetes.io/docs/user-guide/node-selection)
 can help to fine-grain the deletion for specific node.

intel/clusterinit.py

Lines changed: 13 additions & 7 deletions
@@ -36,7 +36,7 @@ def cluster_init(host_list, all_hosts, cmd_list, cmk_img, cmk_img_pol,
 
     # Check if all the flag values passed are valid.
     # Check if cmk_cmd_list is valid.
-    valid_cmd_list = ["init", "discover", "install", "reconcile", "nodereport"]
+    valid_cmd_list = ["init", "discover", "install", "rediscover", "reconcile", "nodereport"]
     for cmk_cmd in cmk_cmd_list:
         if cmk_cmd not in valid_cmd_list:
             raise RuntimeError("CMK command should be one of {}"
@@ -147,6 +147,8 @@ def run_cmd_pods(cmd_list, cmd_init_list, cmk_img, cmk_img_pol, conf_dir,
                 args = "/cmk/cmk.py isolate --pool=infra /cmk/cmk.py -- reconcile --interval=5 --publish"  # noqa: E501
             elif cmd == "nodereport":
                 args = "/cmk/cmk.py isolate --pool=infra /cmk/cmk.py -- node-report --interval=5 --publish"  # noqa: E501
+            elif cmd == "rediscover":
+                args = "/cmk/cmk.py isolate --pool=infra /cmk/cmk.py -- discover; sleep infinity"  # noqa: E501
 
             update_pod_with_container(pod, cmd, cmk_img, cmk_img_pol, args)
     elif cmd_init_list:
@@ -178,10 +180,10 @@ def run_cmd_pods(cmd_list, cmd_init_list, cmk_img, cmk_img_pol, conf_dir,
 
     for node_name in cmk_node_list:
         if cmd_list:
-            update_pod_with_node_details(pod, node_name, cmd_list)
+            update_pod_with_node_details(pod, node_name, cmd_list, "ds")
             daemon_set = k8s.ds_from(pod=pod)
         elif cmd_init_list:
-            update_pod_with_node_details(pod, node_name, cmd_init_list)
+            update_pod_with_node_details(pod, node_name, cmd_init_list, "pod")
 
         try:
             if cmd_list:
@@ -315,8 +317,9 @@ def wait_for_pod_phase(pod_name, phase_name):
             sys.exit(1)
 
         for pod in pod_list_resp["items"]:
-            if ("metadata" in pod) and ("name" in pod["metadata"]) \
-                    and pod_name in pod["metadata"]["name"]:
+            if ("metadata" in pod) and ("labels" in pod["metadata"]) \
+                    and ("podname" in pod["metadata"]["labels"]) \
+                    and (pod_name == pod["metadata"]["labels"]["podname"]):
                 if pod["status"]["phase"] == phase_name:
                     wait = False
                     break
@@ -333,10 +336,13 @@ def update_pod(pod, restart_pol, conf_dir, install_dir, serviceaccount):
     pod["spec"]["volumes"][2]["hostPath"]["path"] = install_dir
 
 
-def update_pod_with_node_details(pod, node_name, cmd_list):
+def update_pod_with_node_details(pod, node_name, cmd_list, res_type):
     pod["spec"]["nodeName"] = node_name
-    pod_name = "cmk-{}-pod-{}".format("-".join(cmd_list), node_name)
+    pod_name = "cmk-{}-{}-{}".format("-".join(cmd_list), res_type, node_name)
     pod["metadata"]["name"] = pod_name
+    # name max length is 63, so move to labels key-value
+    pod["metadata"]["labels"] = {"podname": pod_name}
+    logging.info("Created pod name: {}".format(pod_name))
 
 
 def update_pod_with_pull_secret(pod, pull_secret):
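The hunks above change both how the pod name is built and how `wait_for_pod_phase` finds a pod: a substring test on `metadata.name` becomes an exact comparison against the `podname` label. The sketch below reproduces the name format from `update_pod_with_node_details` and contrasts the two matching strategies. The matcher function names, the sample node names, and the `-abcde` suffixes (standing in for the generated part of a DaemonSet-managed pod name) are assumptions for illustration.

```python
def build_pod_name(cmd_list, res_type, node_name):
    """Mirror the name format used in update_pod_with_node_details."""
    return "cmk-{}-{}-{}".format("-".join(cmd_list), res_type, node_name)


def match_by_name_substring(pod_items, pod_name):
    """Pre-commit behaviour: substring match on metadata.name."""
    return [p for p in pod_items
            if pod_name in p.get("metadata", {}).get("name", "")]


def match_by_label(pod_items, pod_name):
    """Post-commit behaviour: exact match on the 'podname' label."""
    return [p for p in pod_items
            if p.get("metadata", {}).get("labels", {}).get("podname")
            == pod_name]


cmds = ["rediscover", "reconcile", "nodereport"]
target = build_pod_name(cmds, "ds", "node1")
other = build_pod_name(cmds, "ds", "node10")

# Hypothetical pod list, shaped like the "items" of a pod list response.
pods = [
    {"metadata": {"name": target + "-abcde", "labels": {"podname": target}}},
    {"metadata": {"name": other + "-abcde", "labels": {"podname": other}}},
]

print(target)
print(len(match_by_name_substring(pods, target)))
print(len(match_by_label(pods, target)))
```

With these sample nodes, the substring test also picks up the pod on `node10` when asked for `node1`, while the label comparison returns only the intended pod.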

intel/k8s.py

Lines changed: 3 additions & 1 deletion
@@ -69,7 +69,9 @@ def ds_from(pod):
             "metadata": {
                 "labels": {
                     "app":
-                        pod["metadata"]["name"].replace("pod", "ds")
+                        pod["metadata"]["name"].replace("pod", "ds"),
+                    "podname":
+                        pod["metadata"]["labels"]["podname"]
                 }
             },
             "spec": pod["spec"]
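As a rough illustration of the two labels set in this hunk, the sketch below builds them from a pod dictionary the same way the diff does. The function name `ds_template_labels` and the sample node names are hypothetical; only the two label expressions come from the commit.

```python
def ds_template_labels(pod):
    """Build the labels shown in the ds_from hunk.

    'app' is the pod name with every "pod" replaced by "ds";
    'podname' carries the pod name unchanged.
    """
    return {
        "app": pod["metadata"]["name"].replace("pod", "ds"),
        "podname": pod["metadata"]["labels"]["podname"],
    }


def make_pod(name):
    """Hypothetical minimal pod dict with the fields ds_from reads."""
    return {"metadata": {"name": name, "labels": {"podname": name}}}


plain = ds_template_labels(
    make_pod("cmk-rediscover-reconcile-nodereport-ds-node1"))
print(plain["app"] == plain["podname"])

# str.replace substitutes every occurrence, including one in a node name.
tricky = ds_template_labels(
    make_pod("cmk-rediscover-reconcile-nodereport-ds-podhost1"))
print(tricky["app"])
print(tricky["podname"])
```

Since the name passed in for a DaemonSet now already contains `ds`, the `replace` call only has an effect when some other part of the name, such as the node name, contains `pod`. The `podname` label is unaffected either way, which is what the exact-match lookup in `wait_for_pod_phase` relies on.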

intel/uninstall.py

Lines changed: 1 addition & 1 deletion
@@ -32,7 +32,7 @@
 def uninstall(install_dir, conf_dir, namespace):
     delete_cmk_pod("cmk-init-install-discover-pod", namespace,
                    postfix=os.getenv("NODE_NAME"))
-    delete_cmk_pod("cmk-reconcile-nodereport-ds", namespace,
+    delete_cmk_pod("cmk-rediscover-reconcile-nodereport-ds", namespace,
                    postfix=os.getenv("NODE_NAME"))
 
     delete_cmk_pod("cmk-node-report-ds-all", namespace)
