
Add drop cache when changing node state #1947

Closed
wants to merge 1 commit

Conversation

bkopilov
Contributor

The current iptables fixture tears down nodes in order to modify iptables. When the machines are started again, the cached info must be refreshed.

When a machine's state changes, e.g. stop/reboot, cached info may hide bugs in the node controllers or fail tests.
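The staleness described above can be illustrated with a minimal, self-contained sketch. The `Node` and `FakeController` classes here are hypothetical stand-ins, not the actual test-infra code: a cached IP lookup survives a simulated reboot until the cache is explicitly dropped.

```python
import functools


class FakeController:
    """Hypothetical stand-in for the virtualization layer (e.g. libvirt)."""

    def __init__(self):
        self.current_ips = ["192.168.128.10"]

    def get_ips(self):
        return list(self.current_ips)


class Node:
    """Minimal sketch of a node whose IP lookup is cached."""

    def __init__(self, controller):
        self._controller = controller

    @functools.cached_property
    def ips(self):
        # Expensive lookup against the virtualization layer, memoized.
        return self._controller.get_ips()

    def drop_cache(self):
        # Forget the memoized value so the next access refetches.
        self.__dict__.pop("ips", None)


controller = FakeController()
node = Node(controller)
first = node.ips                              # fetched and cached
controller.current_ips = ["192.168.128.99"]   # reboot: new DHCP lease
stale = node.ips                              # cache hides the change
node.drop_cache()
fresh = node.ips                              # refetched after drop_cache
```

Without the `drop_cache()` call, the test would keep operating on the pre-reboot IP, which is exactly the kind of bug the PR description warns about.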

@openshift-ci openshift-ci bot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Dec 12, 2022
@openshift-ci

openshift-ci bot commented Dec 12, 2022

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: bkopilov
Once this PR has been reviewed and has the lgtm label, please assign eranco74 for approval by writing /assign @eranco74 in a comment. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Dec 12, 2022
@openshift-ci

openshift-ci bot commented Dec 12, 2022

Hi @bkopilov. Thanks for your PR.

I'm waiting for an openshift member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@bkopilov
Contributor Author

@osherdp @lalon4

Please review

@adriengentil
Contributor

/ok-to-test

Would it make sense to drop the cache directly in shutdown_given and shutdown_all? That way, every time shutdown is called, the cache would be dropped, preventing potential discrepancies in other parts of test-infra.

@openshift-ci openshift-ci bot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Dec 12, 2022
@bkopilov
Contributor Author

bkopilov commented Dec 12, 2022

/ok-to-test

Would it make sense to drop the cache directly in shutdown_given and shutdown_all? That way, every time shutdown is called, the cache would be dropped, preventing potential discrepancies in other parts of test-infra.

[bkopilov] Yes, the cache is a real risk if something goes wrong during tests or events.
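Adrien's suggestion can be sketched roughly as follows. The `Nodes` and `FakeNode` classes below are hypothetical simplifications, not the real test-infra API: the point is that dropping the cache inside the shutdown helpers means no caller can observe pre-shutdown state.

```python
class FakeNode:
    """Hypothetical node with a single IP and a running flag."""

    def __init__(self, ip):
        self.ip = ip
        self.running = True


class Nodes:
    """Sketch of embedding the cache-drop in the shutdown helpers."""

    def __init__(self, nodes):
        self.nodes = nodes
        self._ips_cache = None

    def get_ips(self):
        # Cached lookup: cheap on repeat calls, stale after state changes.
        if self._ips_cache is None:
            self._ips_cache = [n.ip for n in self.nodes]
        return self._ips_cache

    def drop_cache(self):
        self._ips_cache = None

    def shutdown_given(self, given_nodes):
        for node in given_nodes:
            node.running = False
        # Drop the cache here so every caller gets fresh data afterwards.
        self.drop_cache()

    def shutdown_all(self):
        self.shutdown_given(self.nodes)


nodes = Nodes([FakeNode("192.168.128.10"), FakeNode("192.168.128.13")])
before = nodes.get_ips()
nodes.shutdown_all()
cache_cleared = nodes._ips_cache is None
```

With this shape, the fixture in the diff below would not need its own explicit `drop_cache()` call, though as noted later in the thread, a direct `Node().shutdown()` would still bypass it.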

@@ -644,7 +644,8 @@ def set_iptables_rules_for_nodes(
         for node in given_nodes:
             given_node_ips.append(node.ips[0])
         cluster.nodes.shutdown_given(given_nodes)
 
+        # Clear cached info about nodes after shutdown
+        cluster.nodes.drop_cache()
Contributor

Why do we need to do the shutdown in the first place? Any chance you know?

Contributor Author

The iptables fixture from base_test.py runs shutdown; no idea if we really need it.
We have failures in CI in tests that use this fixture, and I suspect that when libvirt comes back we don't have the latest IP.

Contributor

So let's try to remove the shutdown if it serves no purpose.
There's no reason to do extra actions and increase our chances of failure.

Contributor Author

How do you recommend fixing all the other cases where we stop/reboot nodes during the test lifetime?

Contributor

Maybe we can drop the cache for those other cases (which are outside this code's scope).
Can you send me some links? I'll take a look at those tests.

Contributor

No, let's not remove the shutdown. It was implemented a long time ago and there was a reason; if I am not mistaken, it is needed for the rules to apply. Proxy tests are highly dependent on it.

Contributor

Let me check this a bit.
Either way, Adrien's suggestion to embed the cache-drop in the shutdown function is a better way to construct this solution.

Contributor Author

Ack, yes, Adrien's change will cover our needs, but please note that we may still have an issue if a user calls Node().shutdown() directly. It is good enough, though; if there is no objection, I am going to drop_cache for: shutdown and reboot.

Contributor

Seems like a good compromise

Iptables check does not work when given a chain of sources.
When we insert a rule with multiple sources, iptables splits the list
and adds each source as a separate rule with the same conditions.

On the check action, iptables won't catch the case of a mix of sources
where one is not in the table; it looks like iptables checks only one.
If the check action returns a wrong result, add/delete rules change and
may trigger failures due to unexpected results.

Also on deletion, iptables deletes only the first match when there are
duplicate rules.

Example:
iptables --insert FORWARD -p tcp -j DROP -s 192.168.128.10,192.168.128.13
results in:
-A FORWARD -s 192.168.128.13/32 -p tcp -m tcp --dport 80 -j DROP
-A FORWARD -s 192.168.128.10/32 -p tcp -m tcp --dport 80 -j DROP

We will get different results from iptables for:
iptables --check FORWARD -p tcp -j DROP -s 7.7.7.7,192.168.128.10
iptables --check FORWARD -p tcp -j DROP -s 192.168.128.10,7.7.7.7

This means that a single source per rule is the best approach here.
Run delete on the rule until all duplicate rules are gone.
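The "delete until gone" step in the commit message above can be sketched as a small loop. The function name and the injectable `run` parameter are hypothetical illustrations, not test-infra code; the loop is needed because `iptables -D` removes only the first matching rule, so duplicates survive a single delete.

```python
import subprocess


def delete_rule_until_gone(rule_args, run=subprocess.run):
    """Delete an iptables rule repeatedly until --check reports it gone.

    `rule_args` is the rule spec without the command verb, e.g.
    ["FORWARD", "-p", "tcp", "-j", "DROP", "-s", "192.168.128.10/32"].
    `run` is injectable so the loop can be exercised without root or
    a real iptables binary; `iptables --check` exits 0 while a matching
    rule still exists.
    """
    while run(["iptables", "--check", *rule_args]).returncode == 0:
        run(["iptables", "--delete", *rule_args])
```

Because each iteration re-checks before deleting, the loop terminates exactly when no duplicate of the rule remains in the chain.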
@bkopilov bkopilov force-pushed the fix_iptables_fixture branch from 40036ce to 562ebc0 Compare December 15, 2022 08:33
@openshift-ci openshift-ci bot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Dec 15, 2022
@openshift-ci

openshift-ci bot commented Dec 15, 2022

@bkopilov: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name: ci/prow/lint
Commit: 562ebc0
Required: true
Rerun command: /test lint



@osherdp
Contributor

osherdp commented Dec 15, 2022

The code has changed into a different change.
For now I'm waiting for it to go back to the originally requested change.
/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 15, 2022
bkopilov added a commit to bkopilov/assisted-test-infra that referenced this pull request Dec 15, 2022
When the machines are started again, the cached info must be refreshed.

When a machine's state changes, e.g. stop/reboot, cached info may
hide bugs in the node controllers or fail tests.

This patch is a continuation of openshift#1947.
Due to some mismatch I created a new patch.
@osherdp
Contributor

osherdp commented Dec 18, 2022

@bkopilov can we close this PR?
I see there's a replacement

@bkopilov
Contributor Author

@bkopilov can we close this PR? I see there's a replacement

Yes, please.

@osherdp
Contributor

osherdp commented Dec 18, 2022

/close

@openshift-ci openshift-ci bot closed this Dec 18, 2022
@openshift-ci

openshift-ci bot commented Dec 18, 2022

@osherdp: Closed this PR.

In response to this:

/close


openshift-merge-robot pushed a commit that referenced this pull request Dec 18, 2022
…1954)

When the machines are started again, the cached info must be refreshed.

When a machine's state changes, e.g. stop/reboot, cached info may
hide bugs in the node controllers or fail tests.

This patch is a continuation of #1947.
Due to some mismatch I created a new patch.