-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
zebra: handle renamed interfaces at netlink level #8310
Conversation
@polychaeta rereview |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your contribution to FRR!
Click for style suggestions
To apply these suggestions:
curl -s https://gist.githubusercontent.com/polychaeta/1e285fcedee7a15a19bc8d83269ad6cf/raw/a0ab94368a891b2ec4f7626e3b4840670b205e8c/cr_8310_1616446940.diff | git apply
diff --git a/zebra/if_netlink.c b/zebra/if_netlink.c
index a7e32397d..3b3961d6a 100644
--- a/zebra/if_netlink.c
+++ b/zebra/if_netlink.c
@@ -107,14 +107,13 @@ static void set_if_change(struct interface *ifp, ifindex_t ifi_index,
if (((oifp = if_lookup_by_name_per_ns(zns, if_name)) != NULL)
&& (oifp != ifp)) {
if (IS_ZEBRA_DEBUG_KERNEL)
- zlog_debug(
- "interface name %s renamed to %s",
+ zlog_debug("interface name %s renamed to %s",
oifp->name, ifp->name);
if (if_is_up(oifp))
flog_err(
- EC_LIB_INTERFACE,
- "interface rename detected on up interface from %s to %s, results are uncertain!",
- oifp->name, ifp->name);
+ EC_LIB_INTERFACE,
+ "interface rename detected on up interface from %s to %s, results are uncertain!",
+ oifp->name, ifp->name);
update = true;
}
if_set_name(ifp, if_name);
If you are a new contributor to FRR, please see our contributing guidelines.
After making changes, you do not need to create a new PR. You should perform an amend or interactive rebase followed by a force push.
Continuous Integration Result: SUCCESSFULContinuous Integration Result: SUCCESSFULCongratulations, this patch passed basic tests Tested-by: NetDEF / OpenSourceRouting.org CI System CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-17894/ This is a comment from an automated CI system. Warnings Generated during build:Checkout code: Successful with additional warnings
|
Outdated results 💚Basic BGPD CI results: SUCCESS, 0 tests failed
For details, please contact louberger |
3133551
to
2c6b911
Compare
Continuous Integration Result: SUCCESSFULContinuous Integration Result: SUCCESSFULCongratulations, this patch passed basic tests Tested-by: NetDEF / OpenSourceRouting.org CI System CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-17915/ This is a comment from an automated CI system. Warnings Generated during build:Checkout code: Successful with additional warnings
|
Outdated results 💚Basic BGPD CI results: SUCCESS, 0 tests failed
For details, please contact louberger |
2c6b911
to
cf2de59
Compare
Outdated results 💚Basic BGPD CI results: SUCCESS, 0 tests failed
For details, please contact louberger |
Continuous Integration Result: SUCCESSFULContinuous Integration Result: SUCCESSFULCongratulations, this patch passed basic tests Tested-by: NetDEF / OpenSourceRouting.org CI System CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-17951/ This is a comment from an automated CI system. |
LGTM, |
Hello @vjardin, could you merge this PR please ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a couple of comments, but I also don't understand what is wrong with the current code.
Here is the current code flow in netlink_link_change
when the interface is renamed:
ifp = if_lookup_by_name_per_ns(zns, name);
Lookup for an existing interface by name. It either returns NULL or a pointer to an existing inactive interface with an unknown ifindex.if (ifp == NULL) ifp = if_get_by_name(name, vrf_id);
If there's no inactive interface found on the first step, the new one is created.set_ifindex(ifp, ifi->ifi_index, zns);
- First, this function tries to find an existing interface with the same ifindex. If this is a rename, then
oifp
is a pointer to an old interface. - Then it makes it inactive (resetting the ifindex) by calling
if_delete_update
. - And then it assigns the ifindex to a newly created interface by calling to
if_set_index
.
- First, this function tries to find an existing interface with the same ifindex. If this is a rename, then
The current flow looks completely correct to me. Please, explain what is wrong with it.
zebra/if_netlink.c
Outdated
} | ||
} | ||
if_set_index(ifp, ifi_index); | ||
|
||
oifp = if_lookup_by_name_per_ns(zns, if_name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here you rewrite the oifp pointer that was obtained by if_lookup_by_index_per_ns
earlier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this change of code was useless. youre right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@idryzhov , is it ok for you ?
cf2de59
to
931aa95
Compare
I kept only the code that searches for original interface . |
Outdated results 💚Basic BGPD CI results: SUCCESS, 0 tests failed
For details, please contact louberger |
Continuous Integration Result: SUCCESSFULContinuous Integration Result: SUCCESSFULCongratulations, this patch passed basic tests Tested-by: NetDEF / OpenSourceRouting.org CI System CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-19848/ This is a comment from an automated CI system. |
@idryzhov , I simplfied the code change. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't think it's the right fix.
You don't change the code path for the case when we already have a config for the new name (!CHECK_FLAG(ifp->status, ZEBRA_INTERFACE_ACTIVE)
). So whatever you're trying to fix won't be fixed in this case.
For the case when we don't have a config for the new name, you don't update the name of the old interface object. This is also not correct.
Overall this change makes the code behavior inconsistent. Sometimes we create a new object for a new name and sometimes we don't. The current code is consistent and always creates a new object.
931aa95
to
3d6f254
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your contribution to FRR!
Click for style suggestions
To apply these suggestions:
curl -s https://gist.githubusercontent.com/polychaeta/0dd38667e025df3f9410fb9eb16f04b8/raw/4e4e8b4772f7c71a54048a2d791a768127e8ee84/cr_8310_1626102668.diff | git apply
diff --git a/lib/if.c b/lib/if.c
index 5707cf238..7db14ec19 100644
--- a/lib/if.c
+++ b/lib/if.c
@@ -232,8 +232,7 @@ struct interface *if_create_ifindex(ifindex_t ifindex, vrf_id_t vrf_id)
}
/* Create new interface structure. */
-extern void if_update_to_new_name(struct interface *ifp,
- const char *name)
+extern void if_update_to_new_name(struct interface *ifp, const char *name)
{
struct vrf *vrf = vrf_get(ifp->vrf_id, NULL);
assert(vrf);
diff --git a/lib/if.h b/lib/if.h
index 647150ae0..f2e751c36 100644
--- a/lib/if.h
+++ b/lib/if.h
@@ -509,8 +509,7 @@ extern int if_cmp_name_func(const char *p1, const char *p2);
* else think before you use VRF_UNKNOWN
*/
extern void if_update_to_new_vrf(struct interface *, vrf_id_t vrf_id);
-extern void if_update_to_new_name(struct interface *ifp,
- const char *name);
+extern void if_update_to_new_name(struct interface *ifp, const char *name);
/* Create new interface, adds to name list only */
extern struct interface *if_create_name(const char *name, vrf_id_t vrf_id);
diff --git a/zebra/if_netlink.c b/zebra/if_netlink.c
index 8b50c1aa9..4fe0094c0 100644
--- a/zebra/if_netlink.c
+++ b/zebra/if_netlink.c
@@ -1592,12 +1592,14 @@ int netlink_link_change(struct nlmsghdr *h, ns_id_t ns_id, int startup)
if (ifp == NULL) {
/* check if interface with same index exists */
- ifp = if_lookup_by_index_per_ns(zns, ifi->ifi_index);
+ ifp = if_lookup_by_index_per_ns(zns,
+ ifi->ifi_index);
if (ifp && strcmp(ifp->name, name)) {
if (IS_ZEBRA_DEBUG_KERNEL)
zlog_debug(
- "interface index %d was renamed from %s to %s",
- ifp->ifindex, ifp->name, name);
+ "interface index %d was renamed from %s to %s",
+ ifp->ifindex, ifp->name,
+ name);
if_update_to_new_name(ifp, name);
} else
/* unknown interface */
If you are a new contributor to FRR, please see our contributing guidelines.
After making changes, you do not need to create a new PR. You should perform an amend or interactive rebase followed by a force push.
sorry for delay. I was away from keyboard for a few days. lets continue discussion on this ticket. |
3d6f254
to
9a1233a
Compare
Outdated results 💚Basic BGPD CI results: SUCCESS, 0 tests failed
For details, please contact louberger |
Continuous Integration Result: FAILEDContinuous Integration Result: FAILEDSee below for issues. This is a comment from an automated CI system. Get source / Pull Request: SuccessfulBuilding Stage: SuccessfulBasic Tests: FailedTopotests Ubuntu 18.04 i386 part 1: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-FRRPULLREQ-TOPO1U18I386-20197/test Topology Tests failed for Topotests Ubuntu 18.04 i386 part 1:
see full log at https://ci1.netdef.org/browse/FRR-FRRPULLREQ-20197/artifact/TOPO1U18I386/ErrorLog/log_topotests.txt Successful on other platforms/tests
Warnings Generated during build:Checkout code: Successful with additional warningsTopotests Ubuntu 18.04 i386 part 1: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-FRRPULLREQ-TOPO1U18I386-20197/test Topology Tests failed for Topotests Ubuntu 18.04 i386 part 1:
see full log at https://ci1.netdef.org/browse/FRR-FRRPULLREQ-20197/artifact/TOPO1U18I386/ErrorLog/log_topotests.txt
|
9a1233a
to
d0b43e9
Compare
Outdated results 💚Basic BGPD CI results: SUCCESS, 0 tests failed
For details, please contact louberger |
Continuous Integration Result: SUCCESSFULContinuous Integration Result: SUCCESSFULCongratulations, this patch passed basic tests Tested-by: NetDEF / OpenSourceRouting.org CI System CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-20218/ This is a comment from an automated CI system. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given more thought about this, I don't think that we should handle the interface rename any different from how it's done now. From the FRR perspective, the interface rename is the deletion of the old interface and the addition of a new one. We may already have a config for the new name – and in this case, we apply this config. If we don't have a config then it's just a new clean interface and this is also correct.
We can not change NB config in case of rename – the config must not depend on any operational changes.
lib/if.c
Outdated
snprintf(newpath, sizeof(newpath), | ||
"/frr-interface:lib/interface[name='%s'][vrf='%s']", | ||
name, vrf->name); | ||
if_dnode = yang_dnode_getf(running_config->dnode, "%s/vrf", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're not changing the name here, you're changing the VRF. More than that, oldpath and newpath are the same because you're using ifp->name
and name
to distinguish and they are the same after the call to if_set_name
at the beginning of the function. Is this code tested?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested it like below:
- vtysh
config terminal
password zebra
interface dum3
ip address 4.4.4.4/24
end
- check yang context: ( telnet 0 2601, then enable)
#show yang operational-context /frr-interface:lib
- shell creation
ip link add dum3 type dummy
ip link set dev dum3 up
- interface rename
ip link set dev dum3 down
ip link set dev dum3 name dum4
- check: show yang operational
dut-vm# show yang operational-data /frr-interface:lib
{
"frr-interface:lib": {
"interface": [
{
"name": "dum4",
"vrf": "default",
"state": {
"if-index": 6,
"mtu": 1500,
"mtu6": 1500,
"speed": 0,
"metric": 0,
"phy-address": "5e:37:28:23:c0:d6"
},
"frr-zebra:zebra": {
"state": {
"up-count": 1,
"down-count": 1,
"ptm-status": "disabled"
}
}
},
I agree with the code done, there seems to be something wrong. however, I can see the operation context ok.
is there something else I should use ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Operational context is just a representation of the daemon's internal state.
You should use show conf run json zebra
(or other daemon) to check the daemon's NB config.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated with appropriate code ( modifying interface name instead of vrf name).
and checked that with traces I enter in the change_leaf code.
This is where I try to fix the ipv6 address lost on frr config. What also pushes me to use rename is that we know we have problems with "rename" ( see https://github.com/opensourcerouting/frr/wiki/Advanced-topics#configuration-changes-coming-from-the-kernel ). If you say we should continue create a new copy, then you should explain me how to resync the kernel config with the new interface config. If you dont know, I would like to share this with a second pair of eyes. |
I definitely agree we should discuss this with a wider audience because we're going in circles here :) |
d0b43e9
to
f072139
Compare
I want to solve issue described in #8256. |
Outdated results 💚Basic BGPD CI results: SUCCESS, 0 tests failed
For details, please contact louberger |
Continuous Integration Result: FAILEDContinuous Integration Result: FAILEDSee below for issues. This is a comment from an automated CI system. Get source / Pull Request: SuccessfulBuilding Stage: SuccessfulBasic Tests: FailedTopotests Ubuntu 18.04 amd64 part 6: Failed (click for details)Topology Test Results are at https://ci1.netdef.org/browse/FRR-FRRPULLREQ-TOPO6U18AMD64-20295/test Topology Tests failed for Topotests Ubuntu 18.04 amd64 part 6:
see full log at https://ci1.netdef.org/browse/FRR-FRRPULLREQ-20295/artifact/TOPO6U18AMD64/ErrorLog/log_topotests.txt Successful on other platforms/tests
|
ci:rerun |
Continuous Integration Result: SUCCESSFULContinuous Integration Result: SUCCESSFULCongratulations, this patch passed basic tests Tested-by: NetDEF / OpenSourceRouting.org CI System CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-20310/ This is a comment from an automated CI system. |
Ok, I understand the problem, but this is definitely not the right way to solve it.
You should find the real reason why we don't receive the information about IPv6 addresses for the new name. |
note that I fixes the
ok, I managed to run the test and use 'show conf run json zebra', and I could see that the configuration aligns with the new interface name.
From netlink perspective, the ipv6 address initially set on interface is kept on the new interface (thanks to keep_addr_on_down settings). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code doesn't work correctly when we already have the config for the new name:
zebra[25679]: [J1C1S-S285Q][EC 100663305] if_set_name(after): corruption detected -- interface with this name exists already in VRF 0!
We end up having two interface contexts with the same interface name.
when an interface is renamed in a given namespace, the yang operational context is not refreshed with the new interface name. For instance, following config can show the problem: -step 0 : setup zebra & ip link add dum1 type dummy ip link set dev dum1 up vtysh configure terminal interface dum1 ip address 4.4.4.4/24 exit exit -step 1 : rename ip link set dev dum1 down ip link set dev dum1 name dum2 ip link set dev dum2 up This scenario, from netlink perspective is processed as NETLINK_ADD event. For each new interface, a new interface context is created. Two consequences: - the interface attributes of the 1st interface may be lost. this is the case for ipv6 link local addresses. - the yang operational context does not track the interface context and sticks to the 1st interface configuration. The proposal consists in reusing the previous interface, and renaming the interface name. Also, update the yang operational context by using the new interface name. For operational contexts of daemons other than zebra, two events are seen: delete old interface, and create new interface. so operational contexts from all daemons are synced. Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
f072139
to
bf17db3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your contribution to FRR!
Click for style suggestions
To apply these suggestions:
curl -s https://gist.githubusercontent.com/polychaeta/dc70002904ec658203ead53bcc63dc14/raw/6d3cc0fc2fb25ffb06ee33c53ea202f5c039631d/cr_8310_1626697058.diff | git apply
diff --git a/zebra/if_netlink.c b/zebra/if_netlink.c
index 1a3279168..c72cbf006 100644
--- a/zebra/if_netlink.c
+++ b/zebra/if_netlink.c
@@ -1598,13 +1598,15 @@ int netlink_link_change(struct nlmsghdr *h, ns_id_t ns_id, int startup)
if (strcmp(ifp->name, name)) {
if (IS_ZEBRA_DEBUG_KERNEL)
zlog_debug(
- "interface index %d was renamed from %s to %s",
- ifp->ifindex, ifp->name,
- name);
- if_update_to_new_name(ifp, name);
+ "interface index %d was renamed from %s to %s",
+ ifp->ifindex,
+ ifp->name,
+ name);
+ if_update_to_new_name(ifp,
+ name);
}
- /* else interface with same name already present
- * reuse ifp pointer
+ /* else interface with same name already
+ * present reuse ifp pointer
*/
} else
/* unknown interface */
If you are a new contributor to FRR, please see our contributing guidelines.
After making changes, you do not need to create a new PR. You should perform an amend or interactive rebase followed by a force push.
💚 Basic BGPD CI results: SUCCESS, 0 tests failedResults table
For details, please contact louberger |
Continuous Integration Result: SUCCESSFULCongratulations, this patch passed basic tests Tested-by: NetDEF / OpenSourceRouting.org CI System CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-20353/ This is a comment from an automated CI system. Warnings Generated during build:Checkout code: Successful with additional warnings
|
with last commit, there should not be corruption in list:
|
The log I provided is from the actual testing of your latest code. I am really concerned that I have to spend so much time on this. Please, thoroughly test your code for both situations – when we have and don't have the config for the new name. The first case doesn't work right now. If you manage to fix this case when we already have a fake interface context, this whole absolutely incorrect thing with renaming the NB config won't be needed at all. |
This pull request has conflicts, please resolve those before we can evaluate the pull request. |
Closing PR hasn't been touched in 2 years |
when an interface is renamed in a given namespace, the ipv6
addresses are maintained in the new interface. The fix consists
in using the same interface context, instead of creating a new
one.
Signed-off-by: Philippe Guibert philippe.guibert@6wind.com