Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zebra: handle renamed interfaces at netlink level #8310

Closed

Conversation

pguibert6WIND
Copy link
Member

when an interface is renamed in a given namespace, the ipv6
addresses are maintained in the new interface. The fix consists
in using the same interface context, instead of creating a new
one.

Signed-off-by: Philippe Guibert philippe.guibert@6wind.com

@pguibert6WIND
Copy link
Member Author

#8256

@qlyoung
Copy link
Member

qlyoung commented Mar 22, 2021

@polychaeta rereview

Copy link

@polychaeta polychaeta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution to FRR!

Click for style suggestions

To apply these suggestions:

curl -s https://gist.githubusercontent.com/polychaeta/1e285fcedee7a15a19bc8d83269ad6cf/raw/a0ab94368a891b2ec4f7626e3b4840670b205e8c/cr_8310_1616446940.diff | git apply

diff --git a/zebra/if_netlink.c b/zebra/if_netlink.c
index a7e32397d..3b3961d6a 100644
--- a/zebra/if_netlink.c
+++ b/zebra/if_netlink.c
@@ -107,14 +107,13 @@ static void set_if_change(struct interface *ifp, ifindex_t ifi_index,
 	if (((oifp = if_lookup_by_name_per_ns(zns, if_name)) != NULL)
 	    && (oifp != ifp)) {
 		if (IS_ZEBRA_DEBUG_KERNEL)
-			zlog_debug(
-				   "interface name %s renamed to %s",
+			zlog_debug("interface name %s renamed to %s",
 				   oifp->name, ifp->name);
 		if (if_is_up(oifp))
 			flog_err(
-				 EC_LIB_INTERFACE,
-				 "interface rename detected on up interface from %s to %s, results are uncertain!",
-				 oifp->name, ifp->name);
+				EC_LIB_INTERFACE,
+				"interface rename detected on up interface from %s to %s, results are uncertain!",
+				oifp->name, ifp->name);
 		update = true;
 	}
 	if_set_name(ifp, if_name);

If you are a new contributor to FRR, please see our contributing guidelines.

After making changes, you do not need to create a new PR. You should perform an amend or interactive rebase followed by a force push.

@qlyoung qlyoung added the zebra label Mar 22, 2021
@NetDEF-CI
Copy link
Collaborator

NetDEF-CI commented Mar 23, 2021

Continuous Integration Result: SUCCESSFUL

Continuous Integration Result: SUCCESSFUL

Congratulations, this patch passed basic tests

Tested-by: NetDEF / OpenSourceRouting.org CI System

CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-17894/

This is a comment from an automated CI system.
For questions and feedback in regards to this CI system, please feel free to email
Martin Winter - mwinter (at) opensourcerouting.org.

Warnings Generated during build:

Checkout code: Successful with additional warnings
Report for if_netlink.c | 2 issues
===============================================
< ERROR: do not use assignment in if condition
< #107: FILE: /tmp/f1-20854/if_netlink.c:107:

@LabN-CI
Copy link
Collaborator

LabN-CI commented Mar 23, 2021

Outdated results 💚

Basic BGPD CI results: SUCCESS, 0 tests failed

_ _
Result 0
Date 0
Start 0
Finish vncregress-2021-03-22-21:29:15.txt
Run-Time autoscript-2021-03-22-21:30:27.log.bz2
Total 499 490 428
Pass Complete
Fail 03/16/2021
Valgrind-Errors 21:29:10
Valgrind-Loss 22:06:48
Details 37:38
Log 1815
Memory 1815
SUCCESS git merge/8310 3133551 0
03/22/2021 0
21:29:15 0
22:08:32 autoscript-2021-03-16-21:29:10.txt
39:17 autoscript-2021-03-16-21:29:10.log.bz2
1815 501 505 419
1815

For details, please contact louberger

@NetDEF-CI
Copy link
Collaborator

NetDEF-CI commented Mar 23, 2021

Continuous Integration Result: SUCCESSFUL

Continuous Integration Result: SUCCESSFUL

Congratulations, this patch passed basic tests

Tested-by: NetDEF / OpenSourceRouting.org CI System

CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-17915/

This is a comment from an automated CI system.
For questions and feedback in regards to this CI system, please feel free to email
Martin Winter - mwinter (at) opensourcerouting.org.

Warnings Generated during build:

Checkout code: Successful with additional warnings
Report for if_netlink.c | 2 issues
===============================================
< ERROR: do not use assignment in if condition
< #107: FILE: /tmp/f1-32314/if_netlink.c:107:

@LabN-CI
Copy link
Collaborator

LabN-CI commented Mar 23, 2021

Outdated results 💚

Basic BGPD CI results: SUCCESS, 0 tests failed

_ _
Result SUCCESS git merge/8310 2c6b911
Date 03/23/2021
Start 14:24:18
Finish 15:03:46
Run-Time 39:28
Total 1815
Pass 1815
Fail 0
Valgrind-Errors 0
Valgrind-Loss 0
Details vncregress-2021-03-23-14:24:18.txt
Log autoscript-2021-03-23-14:25:27.log.bz2
Memory 506 508 431

For details, please contact louberger

@LabN-CI
Copy link
Collaborator

LabN-CI commented Mar 24, 2021

Outdated results 💚

Basic BGPD CI results: SUCCESS, 0 tests failed

_ _
Result SUCCESS git merge/8310 cf2de59
Date 03/24/2021
Start 08:45:43
Finish 09:25:17
Run-Time 39:34
Total 1815
Pass 1815
Fail 0
Valgrind-Errors 0
Valgrind-Loss 0
Details vncregress-2021-03-24-08:45:43.txt
Log autoscript-2021-03-24-08:46:54.log.bz2
Memory 493 486 430

For details, please contact louberger

@NetDEF-CI
Copy link
Collaborator

NetDEF-CI commented Mar 24, 2021

Continuous Integration Result: SUCCESSFUL

Continuous Integration Result: SUCCESSFUL

Congratulations, this patch passed basic tests

Tested-by: NetDEF / OpenSourceRouting.org CI System

CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-17951/

This is a comment from an automated CI system.
For questions and feedback in regards to this CI system, please feel free to email
Martin Winter - mwinter (at) opensourcerouting.org.

@vjardin
Copy link
Contributor

vjardin commented May 26, 2021

LGTM,

@louis-6wind
Copy link
Contributor

Hello @vjardin, could you merge this PR please ?

Copy link
Contributor

@idryzhov idryzhov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a couple of comments, but I also don't understand what is wrong with the current code.

Here is the current code flow in netlink_link_change when the interface is renamed:

  • ifp = if_lookup_by_name_per_ns(zns, name);
    Lookup for an existing interface by name. It either returns NULL or a pointer to an existing inactive interface with an unknown ifindex.
  • if (ifp == NULL) ifp = if_get_by_name(name, vrf_id);
    If there's no inactive interface found on the first step, the new one is created.
  • set_ifindex(ifp, ifi->ifi_index, zns);
    • First, this function tries to find an existing interface with the same ifindex. If this is a rename, then oifp is a pointer to an old interface.
    • Then it makes it inactive (resetting the ifindex) by calling if_delete_update.
    • And then it assigns the ifindex to a newly created interface by calling to if_set_index.

The current flow looks completely correct to me. Please, explain what is wrong with it.

}
}
if_set_index(ifp, ifi_index);

oifp = if_lookup_by_name_per_ns(zns, if_name);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here you rewrite the oifp pointer that was obtained by if_lookup_by_index_per_ns earlier.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this change of code was useless. youre right.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@idryzhov , is it ok for you ?

@pguibert6WIND pguibert6WIND force-pushed the if_rename_index_ipv6 branch from cf2de59 to 931aa95 Compare June 25, 2021 15:44
@pguibert6WIND
Copy link
Member Author

I kept only the code that searches for original interface .

@LabN-CI
Copy link
Collaborator

LabN-CI commented Jun 25, 2021

Outdated results 💚

Basic BGPD CI results: SUCCESS, 0 tests failed

_ _
Result SUCCESS git merge/8310 931aa95
Date 06/25/2021
Start 11:45:37
Finish 12:11:06
Run-Time 25:29
Total 1815
Pass 1815
Fail 0
Valgrind-Errors
Valgrind-Loss
Details vncregress-2021-06-25-11:45:37.txt
Log autoscript-2021-06-25-11:46:53.log.bz2
Memory 514 510 429

For details, please contact louberger

@NetDEF-CI
Copy link
Collaborator

NetDEF-CI commented Jun 25, 2021

Continuous Integration Result: SUCCESSFUL

Continuous Integration Result: SUCCESSFUL

Congratulations, this patch passed basic tests

Tested-by: NetDEF / OpenSourceRouting.org CI System

CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-19848/

This is a comment from an automated CI system.
For questions and feedback in regards to this CI system, please feel free to email
Martin Winter - mwinter (at) opensourcerouting.org.

@pguibert6WIND
Copy link
Member Author

@idryzhov , I simplfied the code change.
are you okay with that change?

Copy link
Contributor

@idryzhov idryzhov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't think it's the right fix.

You don't change the code path for the case when we already have a config for the new name (!CHECK_FLAG(ifp->status, ZEBRA_INTERFACE_ACTIVE)). So whatever you're trying to fix won't be fixed in this case.
For the case when we don't have a config for the new name, you don't update the name of the old interface object. This is also not correct.

Overall this change makes the code behavior inconsistent. Sometimes we create a new object for a new name and sometimes we don't. The current code is consistent and always creates a new object.

@pguibert6WIND pguibert6WIND force-pushed the if_rename_index_ipv6 branch from 931aa95 to 3d6f254 Compare July 12, 2021 15:11
Copy link

@polychaeta polychaeta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution to FRR!

Click for style suggestions

To apply these suggestions:

curl -s https://gist.githubusercontent.com/polychaeta/0dd38667e025df3f9410fb9eb16f04b8/raw/4e4e8b4772f7c71a54048a2d791a768127e8ee84/cr_8310_1626102668.diff | git apply

diff --git a/lib/if.c b/lib/if.c
index 5707cf238..7db14ec19 100644
--- a/lib/if.c
+++ b/lib/if.c
@@ -232,8 +232,7 @@ struct interface *if_create_ifindex(ifindex_t ifindex, vrf_id_t vrf_id)
 }
 
 /* Create new interface structure. */
-extern void if_update_to_new_name(struct interface *ifp,
-				  const char *name)
+extern void if_update_to_new_name(struct interface *ifp, const char *name)
 {
 	struct vrf *vrf = vrf_get(ifp->vrf_id, NULL);
 	assert(vrf);
diff --git a/lib/if.h b/lib/if.h
index 647150ae0..f2e751c36 100644
--- a/lib/if.h
+++ b/lib/if.h
@@ -509,8 +509,7 @@ extern int if_cmp_name_func(const char *p1, const char *p2);
  * else think before you use VRF_UNKNOWN
  */
 extern void if_update_to_new_vrf(struct interface *, vrf_id_t vrf_id);
-extern void if_update_to_new_name(struct interface *ifp,
-				  const char *name);
+extern void if_update_to_new_name(struct interface *ifp, const char *name);
 
 /* Create new interface, adds to name list only */
 extern struct interface *if_create_name(const char *name, vrf_id_t vrf_id);
diff --git a/zebra/if_netlink.c b/zebra/if_netlink.c
index 8b50c1aa9..4fe0094c0 100644
--- a/zebra/if_netlink.c
+++ b/zebra/if_netlink.c
@@ -1592,12 +1592,14 @@ int netlink_link_change(struct nlmsghdr *h, ns_id_t ns_id, int startup)
 
 			if (ifp == NULL) {
 				/* check if interface with same index exists */
-				ifp = if_lookup_by_index_per_ns(zns, ifi->ifi_index);
+				ifp = if_lookup_by_index_per_ns(zns,
+								ifi->ifi_index);
 				if (ifp && strcmp(ifp->name, name)) {
 					if (IS_ZEBRA_DEBUG_KERNEL)
 						zlog_debug(
-							   "interface index %d was renamed from %s to %s",
-							   ifp->ifindex, ifp->name, name);
+							"interface index %d was renamed from %s to %s",
+							ifp->ifindex, ifp->name,
+							name);
 					if_update_to_new_name(ifp, name);
 				} else
 					/* unknown interface */

If you are a new contributor to FRR, please see our contributing guidelines.

After making changes, you do not need to create a new PR. You should perform an amend or interactive rebase followed by a force push.

@pguibert6WIND
Copy link
Member Author

I still don't think it's the right fix.

You don't change the code path for the case when we already have a config for the new name (!CHECK_FLAG(ifp->status, ZEBRA_INTERFACE_ACTIVE)). So whatever you're trying to fix won't be fixed in this case.
For the case when we don't have a config for the new name, you don't update the name of the old interface object. This is also not correct.

Overall this change makes the code behavior inconsistent. Sometimes we create a new object for a new name and sometimes we don't. The current code is consistent and always creates a new object.

sorry for delay. I was away from keyboard for a few days.
I tried to address the issue of the code path change in nb api.
I try to reuse some code from 763725c

lets continue discussion on this ticket.

@pguibert6WIND pguibert6WIND force-pushed the if_rename_index_ipv6 branch from 3d6f254 to 9a1233a Compare July 12, 2021 15:15
@LabN-CI
Copy link
Collaborator

LabN-CI commented Jul 12, 2021

Outdated results 💚

Basic BGPD CI results: SUCCESS, 0 tests failed

_ _
Result SUCCESS git merge/8310 9a1233a
Date 07/12/2021
Start 11:15:42
Finish 11:41:06
Run-Time 25:24
Total 1815
Pass 1815
Fail 0
Valgrind-Errors
Valgrind-Loss
Details vncregress-2021-07-12-11:15:42.txt
Log autoscript-2021-07-12-11:16:56.log.bz2
Memory 486 513 426

For details, please contact louberger

@NetDEF-CI
Copy link
Collaborator

NetDEF-CI commented Jul 13, 2021

Continuous Integration Result: FAILED

Continuous Integration Result: FAILED

See below for issues.
CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-20197/

This is a comment from an automated CI system.
For questions and feedback in regards to this CI system, please feel free to email
Martin Winter - mwinter (at) opensourcerouting.org.

Get source / Pull Request: Successful

Building Stage: Successful

Basic Tests: Failed

Topotests Ubuntu 18.04 i386 part 1: Failed (click for details)

Topology Test Results are at https://ci1.netdef.org/browse/FRR-FRRPULLREQ-TOPO1U18I386-20197/test

Topology Tests failed for Topotests Ubuntu 18.04 i386 part 1:

2021-07-13 03:13:49,289 ERROR: 'router_json_cmp' failed after 133.29 seconds
2021-07-13 03:13:49,291 ERROR: assert failed at "bfd_topo2.test_bfd_topo2/test_protocols_convergence": "r4" JSON output mismatches
assert Generated JSON diff error report:
  
  > $: d2 has key '2001:db8:1::/64' which is not present in d1
2021-07-13 03:44:08,166 ERROR: Traceback (most recent call last):
  File "/root/bamboo-agent-home/xml-data/build-dir/FRR-FRRPULLREQ-TOPO1U18I386/topotests/lib/bgp.py", line 203, in create_router_bgp
    tgen, router, data_all_bgp, "bgp", build, load_config
  File "/root/bamboo-agent-home/xml-data/build-dir/FRR-FRRPULLREQ-TOPO1U18I386/topotests/lib/common_config.py", line 363, in create_common_configuration
    load_config_to_router(tgen, router)
  File "/root/bamboo-agent-home/xml-data/build-dir/FRR-FRRPULLREQ-TOPO1U18I386/topotests/lib/common_config.py", line 617, in load_config_to_router
    raise InvalidCLIError("%s" % output)
InvalidCLIError: line 6: % Unknown command[27]: neighbor 10.0.0.13 remote-as 0 
% Specify remote-as or peer-group commands first
line 7: Failure to communicate[13] to bgpd, line: neighbor 10.0.0.13 timers 3 10 

line 9: % Unknown command[30]: neighbor fd00:0:0:3::1 remote-as 0 
% Specify remote-as or peer-group commands first
line 11: Failure to communicate[13] to bgpd, line: neighbor fd00:0:0:3::1 activate 

% Specify remote-as or peer-group commands first
line 12: Failure to communicate[13] to bgpd, line: neighbor fd00:0:0:3::1 timers 3 10 

% Specify remote-as or peer-group commands first
line 14: Failure to communicate[13] to bgpd, line: no neighbor fd00:0:0:3::1 activate 



2021-07-13 03:44:08,794 ERROR: Traceback (most recent call last):
  File "/root/bamboo-agent-home/xml-data/build-dir/FRR-FRRPULLREQ-TOPO1U18I386/topotests/lib/bgp.py", line 203, in create_router_bgp
    tgen, router, data_all_bgp, "bgp", build, load_config
  File "/root/bamboo-agent-home/xml-data/build-dir/FRR-FRRPULLREQ-TOPO1U18I386/topotests/lib/common_config.py", line 363, in create_common_configuration
    load_config_to_router(tgen, router)
  File "/root/bamboo-agent-home/xml-data/build-dir/FRR-FRRPULLREQ-TOPO1U18I386/topotests/lib/common_config.py", line 617, in load_config_to_router
    raise InvalidCLIError("%s" % output)
InvalidCLIError: % No BGP process is configured
line 2: Failure to communicate[13] to bgpd, line: no router bgp  

see full log at https://ci1.netdef.org/browse/FRR-FRRPULLREQ-20197/artifact/TOPO1U18I386/ErrorLog/log_topotests.txt

Successful on other platforms/tests
  • Topotests Ubuntu 18.04 i386 part 2
  • Ubuntu 18.04 deb pkg check
  • Ubuntu 20.04 deb pkg check
  • Topotests Ubuntu 18.04 i386 part 7
  • Topotests debian 10 amd64 part 1
  • Debian 9 deb pkg check
  • Addresssanitizer topotests part 4
  • Topotests Ubuntu 18.04 arm8 part 7
  • Topotests Ubuntu 18.04 amd64 part 1
  • Topotests debian 10 amd64 part 6
  • Debian 10 deb pkg check
  • CentOS 7 rpm pkg check
  • Topotests Ubuntu 18.04 amd64 part 8
  • Topotests Ubuntu 18.04 arm8 part 2
  • Topotests Ubuntu 18.04 amd64 part 2
  • Addresssanitizer topotests part 2
  • Topotests Ubuntu 18.04 arm8 part 3
  • Topotests debian 10 amd64 part 0
  • Topotests Ubuntu 18.04 i386 part 8
  • Topotests Ubuntu 18.04 i386 part 3
  • Addresssanitizer topotests part 9
  • Addresssanitizer topotests part 3
  • Topotests debian 10 amd64 part 3
  • Ubuntu 16.04 deb pkg check
  • Addresssanitizer topotests part 5
  • Addresssanitizer topotests part 7
  • Topotests Ubuntu 18.04 i386 part 6
  • Topotests Ubuntu 18.04 arm8 part 8
  • Topotests debian 10 amd64 part 2
  • Topotests Ubuntu 18.04 amd64 part 6
  • Topotests Ubuntu 18.04 amd64 part 4
  • Topotests Ubuntu 18.04 arm8 part 5
  • Topotests Ubuntu 18.04 arm8 part 0
  • Addresssanitizer topotests part 0
  • IPv4 ldp protocol on Ubuntu 18.04
  • Topotests debian 10 amd64 part 4
  • Static analyzer (clang)
  • Topotests Ubuntu 18.04 i386 part 0
  • Topotests debian 10 amd64 part 9
  • Topotests Ubuntu 18.04 i386 part 5
  • IPv6 protocols on Ubuntu 18.04
  • Topotests Ubuntu 18.04 amd64 part 0
  • Addresssanitizer topotests part 1
  • Topotests Ubuntu 18.04 amd64 part 3
  • Topotests Ubuntu 18.04 i386 part 9
  • Topotests Ubuntu 18.04 i386 part 4
  • Addresssanitizer topotests part 8
  • Topotests debian 10 amd64 part 5
  • Topotests Ubuntu 18.04 arm8 part 6
  • IPv4 protocols on Ubuntu 18.04
  • Topotests Ubuntu 18.04 arm8 part 1
  • Topotests Ubuntu 18.04 arm8 part 4
  • Topotests Ubuntu 18.04 arm8 part 9
  • Topotests Ubuntu 18.04 amd64 part 7
  • Topotests debian 10 amd64 part 8
  • Topotests debian 10 amd64 part 7
  • Topotests Ubuntu 18.04 amd64 part 9
  • Topotests Ubuntu 18.04 amd64 part 5
  • Fedora 29 rpm pkg check
  • Addresssanitizer topotests part 6

Warnings Generated during build:

Checkout code: Successful with additional warnings
Topotests Ubuntu 18.04 i386 part 1: Failed (click for details)

Topology Test Results are at https://ci1.netdef.org/browse/FRR-FRRPULLREQ-TOPO1U18I386-20197/test

Topology Tests failed for Topotests Ubuntu 18.04 i386 part 1:

2021-07-13 03:13:49,289 ERROR: 'router_json_cmp' failed after 133.29 seconds
2021-07-13 03:13:49,291 ERROR: assert failed at "bfd_topo2.test_bfd_topo2/test_protocols_convergence": "r4" JSON output mismatches
assert Generated JSON diff error report:
  
  > $: d2 has key '2001:db8:1::/64' which is not present in d1
2021-07-13 03:44:08,166 ERROR: Traceback (most recent call last):
  File "/root/bamboo-agent-home/xml-data/build-dir/FRR-FRRPULLREQ-TOPO1U18I386/topotests/lib/bgp.py", line 203, in create_router_bgp
    tgen, router, data_all_bgp, "bgp", build, load_config
  File "/root/bamboo-agent-home/xml-data/build-dir/FRR-FRRPULLREQ-TOPO1U18I386/topotests/lib/common_config.py", line 363, in create_common_configuration
    load_config_to_router(tgen, router)
  File "/root/bamboo-agent-home/xml-data/build-dir/FRR-FRRPULLREQ-TOPO1U18I386/topotests/lib/common_config.py", line 617, in load_config_to_router
    raise InvalidCLIError("%s" % output)
InvalidCLIError: line 6: % Unknown command[27]: neighbor 10.0.0.13 remote-as 0 
% Specify remote-as or peer-group commands first
line 7: Failure to communicate[13] to bgpd, line: neighbor 10.0.0.13 timers 3 10 

line 9: % Unknown command[30]: neighbor fd00:0:0:3::1 remote-as 0 
% Specify remote-as or peer-group commands first
line 11: Failure to communicate[13] to bgpd, line: neighbor fd00:0:0:3::1 activate 

% Specify remote-as or peer-group commands first
line 12: Failure to communicate[13] to bgpd, line: neighbor fd00:0:0:3::1 timers 3 10 

% Specify remote-as or peer-group commands first
line 14: Failure to communicate[13] to bgpd, line: no neighbor fd00:0:0:3::1 activate 



2021-07-13 03:44:08,794 ERROR: Traceback (most recent call last):
  File "/root/bamboo-agent-home/xml-data/build-dir/FRR-FRRPULLREQ-TOPO1U18I386/topotests/lib/bgp.py", line 203, in create_router_bgp
    tgen, router, data_all_bgp, "bgp", build, load_config
  File "/root/bamboo-agent-home/xml-data/build-dir/FRR-FRRPULLREQ-TOPO1U18I386/topotests/lib/common_config.py", line 363, in create_common_configuration
    load_config_to_router(tgen, router)
  File "/root/bamboo-agent-home/xml-data/build-dir/FRR-FRRPULLREQ-TOPO1U18I386/topotests/lib/common_config.py", line 617, in load_config_to_router
    raise InvalidCLIError("%s" % output)
InvalidCLIError: % No BGP process is configured
line 2: Failure to communicate[13] to bgpd, line: no router bgp  

see full log at https://ci1.netdef.org/browse/FRR-FRRPULLREQ-20197/artifact/TOPO1U18I386/ErrorLog/log_topotests.txt

Report for if.c | 2 issues
===============================================
< WARNING: Missing a blank line after declarations
< #238: FILE: /tmp/f1-15858/if.c:238:

@pguibert6WIND pguibert6WIND force-pushed the if_rename_index_ipv6 branch from 9a1233a to d0b43e9 Compare July 13, 2021 06:58
@LabN-CI
Copy link
Collaborator

LabN-CI commented Jul 13, 2021

Outdated results 💚

Basic BGPD CI results: SUCCESS, 0 tests failed

_ _
Result SUCCESS git merge/8310 d0b43e9
Date 07/13/2021
Start 03:00:44
Finish 03:26:19
Run-Time 25:35
Total 1815
Pass 1815
Fail 0
Valgrind-Errors
Valgrind-Loss
Details vncregress-2021-07-13-03:00:44.txt
Log autoscript-2021-07-13-03:02:02.log.bz2
Memory 459 502 417

For details, please contact louberger

@NetDEF-CI
Copy link
Collaborator

NetDEF-CI commented Jul 13, 2021

Continuous Integration Result: SUCCESSFUL

Continuous Integration Result: SUCCESSFUL

Congratulations, this patch passed basic tests

Tested-by: NetDEF / OpenSourceRouting.org CI System

CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-20218/

This is a comment from an automated CI system.
For questions and feedback in regards to this CI system, please feel free to email
Martin Winter - mwinter (at) opensourcerouting.org.

Copy link
Contributor

@idryzhov idryzhov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given more thought about this, I don't think that we should handle the interface rename any different from how it's done now. From the FRR perspective, the interface rename is the deletion of the old interface and the addition of a new one. We may already have a config for the new name – and in this case, we apply this config. If we don't have a config then it's just a new clean interface and this is also correct.
We can not change NB config in case of rename – the config must not depend on any operational changes.

lib/if.c Outdated
snprintf(newpath, sizeof(newpath),
"/frr-interface:lib/interface[name='%s'][vrf='%s']",
name, vrf->name);
if_dnode = yang_dnode_getf(running_config->dnode, "%s/vrf",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're not changing the name here, you're changing the VRF. More than that, oldpath and newpath are the same because you're using ifp->name and name to distinguish and they are the same after the call to if_set_name at the beginning of the function. Is this code tested?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested it like below:

  1. vtysh
config terminal
password zebra
interface dum3
ip address 4.4.4.4/24
end
  1. check yang context: ( telnet 0 2601, then enable)
#show yang operational-context /frr-interface:lib
  1. shell creation
ip link add dum3 type dummy
ip link set dev dum3 up
  1. interface rename
ip link set dev dum3 down
ip link set dev dum3 name dum4
  1. check: show yang operational
dut-vm# show yang operational-data /frr-interface:lib
{
  "frr-interface:lib": {
    "interface": [
      { 
        "name": "dum4",
        "vrf": "default",
        "state": {
          "if-index": 6,
          "mtu": 1500,
          "mtu6": 1500,
          "speed": 0,
          "metric": 0,
          "phy-address": "5e:37:28:23:c0:d6"
        },
        "frr-zebra:zebra": {
          "state": {
            "up-count": 1,
            "down-count": 1,
            "ptm-status": "disabled"
          }
        }
      },

I agree with the code done, there seems to be something wrong. however, I can see the operation context ok.
is there something else I should use ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Operational context is just a representation of the daemon's internal state.
You should use show conf run json zebra (or other daemon) to check the daemon's NB config.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated with appropriate code ( modifying interface name instead of vrf name).
and checked that with traces I enter in the change_leaf code.

@pguibert6WIND
Copy link
Member Author

Given more thought about this, I don't think that we should handle the interface rename any different from how it's done now. From the FRR perspective, the interface rename is the deletion of the old interface and the addition of a new one. We may already have a config for the new name – and in this case, we apply this config. If we don't have a config then it's just a new clean interface and this is also correct.
We can not change NB config in case of rename – the config must not depend on any operational changes.

This is where I try to fix the ipv6 address lost on frr config.
On this situation If I continue removing old interface, on zebra, then the new interface is not aware of the presence of the ipv6 address. This is what pushes me to use rename.

What also pushes me to use rename is that we know we have problems with "rename" ( see https://github.com/opensourcerouting/frr/wiki/Advanced-topics#configuration-changes-coming-from-the-kernel ).

If you say we should continue create a new copy, then you should explain me how to resync the kernel config with the new interface config. If you dont know, I would like to share this with a second pair of eyes.

@idryzhov
Copy link
Contributor

idryzhov commented Jul 15, 2021

I definitely agree we should discuss this with a wider audience because we're going in circles here :)
Could you, please, explain a little bit more about the config being lost? Are there are IPv6 addresses configured outside of FRR and zebra doesn't add them to the new interface context? Or are there addresses configured inside FRR for the new interface name and this config is not being applied?

@pguibert6WIND pguibert6WIND force-pushed the if_rename_index_ipv6 branch from d0b43e9 to f072139 Compare July 15, 2021 19:35
@pguibert6WIND
Copy link
Member Author

I definitely agree we should discuss this with a wider audience because we're going in circles here :)
Could you, please, explain a little bit more about the config being lost? Are there are IPv6 addresses configured outside of FRR and zebra doesn't add them to the new interface context? Or are there addresses configured inside FRR for the new interface name and this config is not being applied?

I want to solve issue described in #8256.
This is not a FRR configuration issue. IPv6 config takes place outside of zebra.
IPv6 global addresses are not retransmitted to zebra once interface is renamed and moved to up.
The address was on the interface that has been deleted, and should be somehow copied to the new interface context.

@LabN-CI
Copy link
Collaborator

LabN-CI commented Jul 15, 2021

Outdated results 💚

Basic BGPD CI results: SUCCESS, 0 tests failed

_ _
Result SUCCESS git merge/8310 f072139
Date 07/15/2021
Start 15:40:42
Finish 16:06:15
Run-Time 25:33
Total 1813
Pass 1813
Fail 0
Valgrind-Errors
Valgrind-Loss
Details vncregress-2021-07-15-15:40:42.txt
Log autoscript-2021-07-15-15:41:59.log.bz2
Memory 493 478 424

For details, please contact louberger

@NetDEF-CI
Copy link
Collaborator

NetDEF-CI commented Jul 15, 2021

Continuous Integration Result: FAILED

Continuous Integration Result: FAILED

See below for issues.
CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-20295/

This is a comment from an automated CI system.
For questions and feedback in regards to this CI system, please feel free to email
Martin Winter - mwinter (at) opensourcerouting.org.

Get source / Pull Request: Successful

Building Stage: Successful

Basic Tests: Failed

Topotests Ubuntu 18.04 amd64 part 6: Failed (click for details)

Topology Test Results are at https://ci1.netdef.org/browse/FRR-FRRPULLREQ-TOPO6U18AMD64-20295/test

Topology Tests failed for Topotests Ubuntu 18.04 amd64 part 6:

2021-07-15 20:30:10,774 ERROR: ce2: bgpd left a dead pidfile (pid=22432)
2021-07-15 20:42:23,637 ERROR: 'router_json_cmp' failed after 139.08 seconds
2021-07-15 20:42:23,639 ERROR: assert failed at "bgp_vrf_lite_ipv6_rtadv.test_bgp_vrf_lite_ipv6_rtadv/test_protocols_convergence": "r1" JSON output mismatches
assert Generated JSON diff error report:
  
  > $->2001:db8:1::/64: d1 has Array of length 1 but in d2 it is of length 2
2021-07-15 20:42:35,509 ERROR: r1: bgpd left a dead pidfile (pid=28503)
*** defaultIntf: warning: r1 has no interfaces

see full log at https://ci1.netdef.org/browse/FRR-FRRPULLREQ-20295/artifact/TOPO6U18AMD64/ErrorLog/log_topotests.txt

Successful on other platforms/tests
  • Topotests Ubuntu 18.04 arm8 part 8
  • IPv4 protocols on Ubuntu 18.04
  • Topotests Ubuntu 18.04 arm8 part 6
  • Topotests Ubuntu 18.04 arm8 part 1
  • Addresssanitizer topotests part 8
  • Topotests debian 10 amd64 part 3
  • Topotests debian 10 amd64 part 8
  • Topotests debian 10 amd64 part 7
  • Ubuntu 16.04 deb pkg check
  • Topotests Ubuntu 18.04 i386 part 6
  • Addresssanitizer topotests part 6
  • Topotests debian 10 amd64 part 2
  • Topotests Ubuntu 18.04 amd64 part 5
  • Topotests Ubuntu 18.04 i386 part 1
  • IPv4 ldp protocol on Ubuntu 18.04
  • Ubuntu 20.04 deb pkg check
  • Topotests debian 10 amd64 part 1
  • Debian 9 deb pkg check
  • Topotests debian 10 amd64 part 9
  • Topotests Ubuntu 18.04 i386 part 0
  • Topotests Ubuntu 18.04 i386 part 5
  • Addresssanitizer topotests part 3
  • Debian 10 deb pkg check
  • Topotests Ubuntu 18.04 amd64 part 8
  • Topotests Ubuntu 18.04 arm8 part 7
  • Topotests Ubuntu 18.04 arm8 part 2
  • Addresssanitizer topotests part 2
  • Topotests Ubuntu 18.04 i386 part 3
  • Topotests Ubuntu 18.04 i386 part 8
  • Addresssanitizer topotests part 7
  • Topotests debian 10 amd64 part 0
  • Addresssanitizer topotests part 9
  • Topotests Ubuntu 18.04 arm8 part 9
  • Topotests Ubuntu 18.04 arm8 part 4
  • Topotests Ubuntu 18.04 amd64 part 7
  • Addresssanitizer topotests part 5
  • Topotests Ubuntu 18.04 amd64 part 9
  • Topotests Ubuntu 18.04 arm8 part 0
  • Fedora 29 rpm pkg check
  • Topotests Ubuntu 18.04 arm8 part 5
  • Ubuntu 18.04 deb pkg check
  • Topotests Ubuntu 18.04 i386 part 7
  • Topotests Ubuntu 18.04 i386 part 2
  • Topotests Ubuntu 18.04 amd64 part 4
  • Addresssanitizer topotests part 0
  • Addresssanitizer topotests part 4
  • Topotests Ubuntu 18.04 amd64 part 1
  • Topotests debian 10 amd64 part 4
  • Topotests debian 10 amd64 part 6
  • Static analyzer (clang)
  • Topotests Ubuntu 18.04 amd64 part 2
  • Topotests Ubuntu 18.04 amd64 part 0
  • Addresssanitizer topotests part 1
  • CentOS 7 rpm pkg check
  • Topotests Ubuntu 18.04 i386 part 4
  • IPv6 protocols on Ubuntu 18.04
  • Topotests Ubuntu 18.04 i386 part 9
  • Topotests Ubuntu 18.04 arm8 part 3
  • Topotests debian 10 amd64 part 5
  • Topotests Ubuntu 18.04 amd64 part 3

@pguibert6WIND
Copy link
Member Author

ci:rerun

@NetDEF-CI
Copy link
Collaborator

NetDEF-CI commented Jul 16, 2021

Continuous Integration Result: SUCCESSFUL

Continuous Integration Result: SUCCESSFUL

Congratulations, this patch passed basic tests

Tested-by: NetDEF / OpenSourceRouting.org CI System

CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-20310/

This is a comment from an automated CI system.
For questions and feedback in regards to this CI system, please feel free to email
Martin Winter - mwinter (at) opensourcerouting.org.

@idryzhov
Copy link
Contributor

I definitely agree we should discuss this with a wider audience because we're going in circles here :)
Could you, please, explain a little bit more about the config being lost? Are there are IPv6 addresses configured outside of FRR and zebra doesn't add them to the new interface context? Or are there addresses configured inside FRR for the new interface name and this config is not being applied?

I want to solve issue described in #8256.
This is not a FRR configuration issue. IPv6 config takes place outside of zebra.
IPv6 global addresses are not retransmitted to zebra once interface is renamed and moved to up.
The address was on the interface that has been deleted, and should be somehow copied to the new interface context.

Ok, I understand the problem, but this is definitely not the right way to solve it.
The code you wrote will work differently when we have or don't have the config for the new name:

  • when we don't have the config, your code fixes the issue by preserving the interface context
  • but when we have the config - the issue stays because nothing is changed

You should find the real reason why we don't receive the information about IPv6 addresses for the new name.
As you wrote in the linked issue – IPv4 works correctly, so this shouldn't be the overall issue, right? This should be something related only to IPv6. We can discuss this on a Tuesday meeting if you want.

@pguibert6WIND
Copy link
Member Author

I definitely agree we should discuss this with a wider audience because we're going in circles here :)
Could you, please, explain a little bit more about the config being lost? Are there are IPv6 addresses configured outside of FRR and zebra doesn't add them to the new interface context? Or are there addresses configured inside FRR for the new interface name and this config is not being applied?

I want to solve issue described in #8256.
This is not a FRR configuration issue. IPv6 config takes place outside of zebra.
IPv6 global addresses are not retransmitted to zebra once interface is renamed and moved to up.
The address was on the interface that has been deleted, and should be somehow copied to the new interface context.

Ok, I understand the problem, but this is definitely not the right way to solve it.
The code you wrote will work differently when we have or don't have the config for the new name:

* when we don't have the config, your code fixes the issue by preserving the interface context

* but when we have the config - the issue stays because nothing is changed

You should find the real reason why we don't receive the information about IPv6 addresses for the new name.
As you wrote in the linked issue – IPv4 works correctly, so this shouldn't be the overall issue, right? This should be something related only to IPv6. We can discuss this on a Tuesday meeting if you want.

note that I fixes the

I definitely agree we should discuss this with a wider audience because we're going in circles here :)
Could you, please, explain a little bit more about the config being lost? Are there are IPv6 addresses configured outside of FRR and zebra doesn't add them to the new interface context? Or are there addresses configured inside FRR for the new interface name and this config is not being applied?

I want to solve issue described in #8256.
This is not a FRR configuration issue. IPv6 config takes place outside of zebra.
IPv6 global addresses are not retransmitted to zebra once interface is renamed and moved to up.
The address was on the interface that has been deleted, and should be somehow copied to the new interface context.

Ok, I understand the problem, but this is definitely not the right way to solve it.
The code you wrote will work differently when we have or don't have the config for the new name:

* when we don't have the config, your code fixes the issue by preserving the interface context

* but when we have the config - the issue stays because nothing is changed

You should find the real reason why we don't receive the information about IPv6 addresses for the new name.
As you wrote in the linked issue – IPv4 works correctly, so this shouldn't be the overall issue, right? This should be something related only to IPv6. We can discuss this on a Tuesday meeting if you want.

ok, I managed to run the test and use 'show conf run json zebra', and I could see that the configuration aligns with the new interface name.

dut-vm# show conf run json zebra
{
  "frr-interface:lib": {
    "interface": [
      {
        "name": "dum4",
        "vrf": "default",
        "description": "bla bla bla"
      }
    ]
  }
}

From netlink perspective, the ipv6 address initially set on interface is kept on the new interface (thanks to keep_addr_on_down settings).
The problem that is being solved, then, is just a configuration consistency, and what is achieved is that, by renaming config old interface to config new interface, the information stored in the yang context ( description) is kept across the new interface.

Copy link
Contributor

@idryzhov idryzhov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code doesn't work correctly when we already have the config for the new name:

zebra[25679]: [J1C1S-S285Q][EC 100663305] if_set_name(after): corruption detected -- interface with this name exists already in VRF 0!

We end up having two interface contexts with the same interface name.

when an interface is renamed in a given namespace, the yang
operational context is not refreshed with the new interface
name. For instance, following config can show the problem:

-step 0 : setup
zebra &
ip link add dum1 type dummy
ip link set dev dum1 up
vtysh
configure terminal
interface dum1
ip address 4.4.4.4/24
exit
exit

-step 1 : rename
ip link set dev dum1 down
ip link set dev dum1 name dum2
ip link set dev dum2 up

This scenario, from netlink perspective is processed as NETLINK_ADD
event. For each new interface, a new interface context is created.
Two consequences:
- the interface attributes of the 1st interface may be lost. this
is the case for ipv6 link local addresses.
- the yang operational context does not track the interface context
and sticks to the 1st interface configuration.

The proposal consists in reusing the previous interface, and renaming
the interface name. Also, update the yang operational context by
using the new interface name.

For operational contexts of daemons other than zebra, two events are
seen: delete old interface, and create new interface. so operational
contexts from all daemons are synced.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
@pguibert6WIND pguibert6WIND force-pushed the if_rename_index_ipv6 branch from f072139 to bf17db3 Compare July 19, 2021 12:17
Copy link

@polychaeta polychaeta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution to FRR!

Click for style suggestions

To apply these suggestions:

curl -s https://gist.githubusercontent.com/polychaeta/dc70002904ec658203ead53bcc63dc14/raw/6d3cc0fc2fb25ffb06ee33c53ea202f5c039631d/cr_8310_1626697058.diff | git apply

diff --git a/zebra/if_netlink.c b/zebra/if_netlink.c
index 1a3279168..c72cbf006 100644
--- a/zebra/if_netlink.c
+++ b/zebra/if_netlink.c
@@ -1598,13 +1598,15 @@ int netlink_link_change(struct nlmsghdr *h, ns_id_t ns_id, int startup)
 					if (strcmp(ifp->name, name)) {
 						if (IS_ZEBRA_DEBUG_KERNEL)
 							zlog_debug(
-								   "interface index %d was renamed from %s to %s",
-								   ifp->ifindex, ifp->name,
-								   name);
-						if_update_to_new_name(ifp, name);
+								"interface index %d was renamed from %s to %s",
+								ifp->ifindex,
+								ifp->name,
+								name);
+						if_update_to_new_name(ifp,
+								      name);
 					}
-					/* else interface with same name already present
-					 * reuse ifp pointer
+					/* else interface with same name already
+					 * present reuse ifp pointer
 					 */
 				} else
 					/* unknown interface */

If you are a new contributor to FRR, please see our contributing guidelines.

After making changes, you do not need to create a new PR. You should perform an amend or interactive rebase followed by a force push.

@LabN-CI
Copy link
Collaborator

LabN-CI commented Jul 19, 2021

💚 Basic BGPD CI results: SUCCESS, 0 tests failed

Results table
_ _
Result SUCCESS git merge/8310 bf17db3
Date 07/19/2021
Start 09:05:43
Finish 09:31:15
Run-Time 25:32
Total 1813
Pass 1813
Fail 0
Valgrind-Errors
Valgrind-Loss
Details vncregress-2021-07-19-09:05:43.txt
Log autoscript-2021-07-19-09:06:56.log.bz2
Memory 514 516 425

For details, please contact louberger

@NetDEF-CI
Copy link
Collaborator

Continuous Integration Result: SUCCESSFUL

Congratulations, this patch passed basic tests

Tested-by: NetDEF / OpenSourceRouting.org CI System

CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-20353/

This is a comment from an automated CI system.
For questions and feedback in regards to this CI system, please feel free to email
Martin Winter - mwinter (at) opensourcerouting.org.

Warnings Generated during build:

Checkout code: Successful with additional warnings
Report for if_netlink.c | 8 issues
===============================================
< WARNING: Too many leading tabs - consider code refactoring
< #1599: FILE: /tmp/f1-18650/if_netlink.c:1599:
< WARNING: line over 80 characters
< #1602: FILE: /tmp/f1-18650/if_netlink.c:1602:
< WARNING: line over 80 characters
< #1604: FILE: /tmp/f1-18650/if_netlink.c:1604:
< WARNING: line over 80 characters
< #1606: FILE: /tmp/f1-18650/if_netlink.c:1606:

@pguibert6WIND
Copy link
Member Author

with last commit, there should not be corruption in list:

ifp = if_lookup_by_name(name)
if (ifp == NULL || !CHECK_FLAG(ifp->status, ACTIVE)) {
    if (ifp == NULL) {
      ifp = if_lookup_by_index(ifi->ifi_index)
      if (ifp) {
          if (strcmp(ifp->name, name)) {
             if_update_to_new_name(ifp, name)                                 <--- if rename, then move NB config if name
          }
      } else {
         ifp = if_get_by_name(name)                                               <--- ifp not found in config, allocate it
     }
 } else {
      /* pre-configured interface, iface already present */                     <--- if iface with name already present, then reuse it
                                                                                                           <--- if vrf changed, move NB config vrf name
}

@idryzhov
Copy link
Contributor

idryzhov commented Jul 21, 2021

The log I provided is from the actual testing of your latest code. if_lookup_by_index doesn't return you the existing fake-interface context and you corrupt the list when you call if_update_to_new_name.

I am really concerned that I have to spend so much time on this. Please, thoroughly test your code for both situations – when we have and don't have the config for the new name. The first case doesn't work right now. If you manage to fix this case when we already have a fake interface context, this whole absolutely incorrect thing with renaming the NB config won't be needed at all.

@github-actions
Copy link

This pull request has conflicts, please resolve those before we can evaluate the pull request.

@donaldsharp
Copy link
Member

Closing PR hasn't been touched in 2 years

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants