Skip to content

Conversation

@hedrok
Copy link
Contributor

@hedrok hedrok commented Oct 9, 2025

Make failover support dhcp-interface

  • Refactor XML a little: move common dhcp-interface properties to separate include file
  • Add failover support for dhcp-interface
  • Add test for DHCP in failover protocol
  • Change vyos.template.get_dhcp_router to use '*.lease' file instead of '.leases' file - in 'leases' there can be several last leases and current one may not be the first one in file

Command sample:

    set protocols failover route 203.0.113.0/24 dhcp-interface eth1 interface eth1
    set protocols failover route 203.0.113.0/24 dhcp-interface eth1 metric 30
    set protocols failover route 203.0.113.0/24 dhcp-interface eth1 check target 10.25.0.47
    set protocols failover route 203.0.113.0/24 dhcp-interface eth1 check timeout 1

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Code style update (formatting, renaming)
  • Refactoring (no functional changes)
  • Migration from an old Vyatta component to vyos-1x, please link to related PR inside obsoleted component
  • Other (please describe):

Related Task(s)

Related PR(s)

How to test / Smoketest result

I've added test case:

$ /usr/libexec/vyos/tests/smoke/cli/test_protocols_failover.py
test_01_basic (__main__.TestProtocolsFailover.test_01_basic) ... ok
test_02_vrf (__main__.TestProtocolsFailover.test_02_vrf) ... ok
test_03_config (__main__.TestProtocolsFailover.test_03_config) ... ok
test_04_dhcp (__main__.TestProtocolsFailover.test_04_dhcp) ... ok

-->

Checklist:

@github-actions
Copy link

github-actions bot commented Oct 9, 2025

👍
No issues in PR Title / Commit Title

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds DHCP interface support to the failover protocol, allowing routes to use DHCP-obtained gateway addresses instead of only static next-hops. The implementation includes XML configuration refactoring, enhanced route processing logic, and comprehensive test coverage.

  • Refactored XML configuration to extract common DHCP interface properties into reusable include files
  • Extended failover route configuration to support dhcp-interface alongside existing next-hop options
  • Modified DHCP router detection to use .lease files instead of .leases for better accuracy

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
src/helpers/vyos-failover.py Added DHCP interface processing logic and route management for dynamic gateways
src/conf_mode/protocols_failover.py Updated validation to support both next-hop and dhcp-interface configurations
smoketest/scripts/cli/test_protocols_failover.py Added comprehensive test coverage for DHCP interface functionality
python/vyos/template.py Modified get_dhcp_router to use .lease files and parse new_ip_address field
interface-definitions/include/failover/protocol-common-config.xml.i Refactored to use common failover configuration include
interface-definitions/include/failover/common-failover.xml.i New common configuration include for failover settings
interface-definitions/include/dhcp-interface.xml.i Updated to use common DHCP interface properties
interface-definitions/include/dhcp-interface-properties.xml.i New include file for common DHCP interface properties
interface-definitions/include/dhcp-interface-multi.xml.i Updated to use common DHCP interface properties

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@hedrok hedrok marked this pull request as draft October 10, 2025 14:42
@hedrok hedrok force-pushed the T5942-failover-dhcp-gateway branch 3 times, most recently from 6927d12 to cabfb21 Compare October 10, 2025 20:33
@hedrok hedrok marked this pull request as ready for review October 11, 2025 06:17
@sever-sever
Copy link
Member

sever-sever commented Oct 13, 2025

  1. If you do not receive a default route for the dhcp-interface, it does not create any route:
set interfaces ethernet eth1 vif 11 address 'dhcp'
set interfaces ethernet eth1 vif 12 address 'dhcp'
set protocols failover route 203.0.113.1/32 dhcp-interface eth1.11 check target 192.168.122.16
set protocols failover route 203.0.113.1/32 dhcp-interface eth1.12 check target 192.168.122.16

Expected route 203.0.113.1 does not exist

vyos@r14:~$ ping 192.168.122.16 interface eth1.11 count 2
PING 192.168.122.16 (192.168.122.16) from 10.0.11.10 eth1.11: 56(84) bytes of data.
64 bytes from 192.168.122.16: icmp_seq=1 ttl=64 time=0.311 ms
64 bytes from 192.168.122.16: icmp_seq=2 ttl=64 time=0.296 ms

--- 192.168.122.16 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1025ms
rtt min/avg/max/mdev = 0.296/0.303/0.311/0.007 ms
vyos@r14:~$ 
vyos@r14:~$ 
vyos@r14:~$ ping 192.168.122.16 interface eth1.12 count 2
PING 192.168.122.16 (192.168.122.16) from 100.64.12.10 eth1.12: 56(84) bytes of data.
64 bytes from 192.168.122.16: icmp_seq=1 ttl=64 time=0.417 ms
64 bytes from 192.168.122.16: icmp_seq=2 ttl=64 time=0.342 ms

--- 192.168.122.16 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1015ms
rtt min/avg/max/mdev = 0.342/0.379/0.417/0.037 ms
vyos@r14:~$ 
vyos@r14:~$ 
vyos@r14:~$ show ip route 203.0.113.1
Routing entry for 0.0.0.0/0
  Known via "static", distance 1, metric 0, best
  Last update 00:45:10 ago
  * 192.168.122.1, via eth0, weight 1

vyos@r14:~$ 
vyos@r14:~$ show int
Codes: S - State, L - Link, u - Up, D - Down, A - Admin Down
Interface    IP Address         MAC                VRF        MTU  S/L    Description
-----------  -----------------  -----------------  -------  -----  -----  -------------
eth0         192.168.122.14/24  52:54:00:77:fa:36  default   1500  u/u    WAN
eth1         -                  52:54:00:28:23:f1  default   1500  u/u    LAN
eth1.11      10.0.11.10/24      52:54:00:28:23:f1  default   1500  u/u
eth1.12      100.64.12.10/24    52:54:00:28:23:f1  default   1500  u/u

  1. Nice to have information about interface in the log:
Oct 13 12:01:13 r14 vyos-failover[6859]: Check fail for route 203.0.113.1/32 target 192.168.122.16 proto icmp 
Oct 13 12:01:36 r14 vyos-failover[6859]: Check fail for route 203.0.113.1/32 target 192.168.122.16 proto icmp 
Oct 13 12:01:59 r14 vyos-failover[6859]: Check fail for route 203.0.113.1/32 target 192.168.122.16 proto icmp 
Oct 13 12:02:22 r14 vyos-failover[6859]: Check fail for route 203.0.113.1/32 target 192.168.122.16 proto icmp 

It is not clear via which interface it fails

  1. Adding onlink drops the default route in the VRF
set interfaces ethernet eth1 vif 11 address 'dhcp'
set interfaces ethernet eth1 vif 11 vrf 'red'
set interfaces ethernet eth1 vif 12 address 'dhcp'
set protocols failover route 203.0.113.1/32 dhcp-interface eth1.12 check target 192.168.122.16
set vrf name red protocols failover route 203.0.113.1/32 dhcp-interface eth1.11 check target 192.168.122.16
set vrf name red protocols failover route 203.0.113.1/32 dhcp-interface eth1.11 onlink
set vrf name red table '1010'
commit

We see the default route:

vyos@r14# run show ip route vrf red 
Codes: K - kernel route, C - connected, L - local, S - static,
       R - RIP, O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
       f - OpenFabric, t - Table-Direct,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

VRF red:
S>* 0.0.0.0/0 [210/0] via 10.0.11.1, eth1.11, weight 1, 00:02:12
C>* 10.0.11.0/24 is directly connected, eth1.11, weight 1, 00:02:12
K * 10.0.11.0/24 [0/0] is directly connected, eth1.11, weight 1, 00:02:12
L>* 10.0.11.10/32 is directly connected, eth1.11, weight 1, 00:02:12
K>* 203.0.113.1/32 [0/1] via 10.0.11.1, eth1.11, weight 1, 00:00:43
[edit]
vyos@r14#

Add onlink and 0.0.0.0 disappears

vyos@r14# set vrf name red protocols failover route 203.0.113.1/32 dhcp-interface eth1.11 onlink 
[edit]
vyos@r14# commit
[edit]
vyos@r14# run show ip route vrf red 
Codes: K - kernel route, C - connected, L - local, S - static,
       R - RIP, O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
       f - OpenFabric, t - Table-Direct,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

VRF red:
C>* 10.0.11.0/24 is directly connected, eth1.11, weight 1, 00:03:06
K * 10.0.11.0/24 [0/0] is directly connected, eth1.11, weight 1, 00:03:06
L>* 10.0.11.10/32 is directly connected, eth1.11, weight 1, 00:03:06
K>* 203.0.113.1/32 [0/1] via 10.0.11.1, eth1.11, weight 1, 00:01:37
[edit]
vyos@r14# 

  1. Changing metric also deletes the default route (in VRF)
vyos@r14# set vrf name red protocols failover route 203.0.113.1/32 dhcp-interface eth1.11 metric 23
[edit]
vyos@r14# run show ip route vrf red 0.0.0.0
Routing entry for 0.0.0.0/0
  Known via "static", distance 210, metric 0, tag 210, vrf red, best
  Last update 00:00:46 ago
  * 10.0.11.1, via eth1.11, weight 1

[edit]
vyos@r14# compare 
[vrf name red protocols failover route 203.0.113.1/32 dhcp-interface eth1.11]
+ metric "23"

[edit]
vyos@r14# commit
[edit]
vyos@r14# 
[edit]
vyos@r14# run show ip route vrf red 0.0.0.0
[edit]
vyos@r14# 

@hedrok hedrok force-pushed the T5942-failover-dhcp-gateway branch from cabfb21 to 923b8ab Compare October 13, 2025 10:09
@hedrok
Copy link
Contributor Author

hedrok commented Oct 13, 2025

  1. This is expected behaviour: I believe we shouldn't create route to interface without gateway when there is no DHCP gateway.
  2. Done.
    3, 4. Three issues lead to such behaviour: https://vyos.dev/T7927; Any change in vrf is considered by FRRender enough trigger to restart FRR. And when there is route of any protocol code in FRRender changes cached config, and again restarts FRR for ANY change in config (because on next check there is no 'vrf' field in route):
        if 'vrf' in config_dict and 'name' in config_dict['vrf']:
            output += render_to_string('frr/zebra.vrf.route-map.frr.j2', config_dict['vrf'])
            for vrf, vrf_config in config_dict['vrf']['name'].items():
                if 'protocols' not in vrf_config:
                    continue
                for protocol in vrf_config['protocols']:
                    vrf_config['protocols'][protocol]['vrf'] = vrf

                output += inline_helper(vrf_config['protocols'])

(python/vyos/frrender.py)


if not os.path.exists(dhclient_base_dir):
self.need_dhcp_dir_cleanup = True
run(['sudo', 'mkdir', '-p', dhclient_base_dir])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not see real tests for the DHCP server or client. This test seems fake.
You can use a veth pair to configure a real DHCP server/client for a more realistic test.
Other things work in my case. Therefore, deleting the DHCP default gateway when changing something in the failover router appears to be another bug and is not related to this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, updated test as recommended, with real DHCP server + DHCP client via virtual ethernet.

@hedrok hedrok force-pushed the T5942-failover-dhcp-gateway branch from 923b8ab to 52b958a Compare October 14, 2025 11:52
* Refactor XML a little: move common dhcp-interface properties to
  separate include file
* Add failover support for dhcp-interface
* Add test for DHCP in failover protocol
Change vyos.template.get_dhcp_router to use '*.lease' file instead
of '.leases' file - in 'leases' there can be several last leases
and current one may not be the first one in file. '*.lease' file
contains current lease.
@hedrok hedrok force-pushed the T5942-failover-dhcp-gateway branch from 52b958a to d290b26 Compare October 14, 2025 12:07
@github-actions
Copy link

CI integration 👍 passed!

Details

CI logs

  • CLI Smoketests (no interfaces) 👍 passed
  • CLI Smoketests VPP 👍 passed
  • CLI Smoketests (interfaces only) 👍 passed
  • Config tests 👍 passed
  • Config tests VPP 👍 passed
  • RAID1 tests 👍 passed
  • TPM tests 👍 passed

Copy link
Member

@sever-sever sever-sever left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Manually checked and works fine

@sever-sever sever-sever requested a review from dmbaturin October 15, 2025 11:27
Copy link
Member

@dmbaturin dmbaturin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks good and I trust @sever-sever's testing.

@dmbaturin dmbaturin merged commit 33099a9 into vyos:current Oct 21, 2025
18 checks passed
@vyosbot vyosbot added mirror-initiated This PR initiated for mirror sync workflow mirror-completed and removed mirror-initiated This PR initiated for mirror sync workflow labels Oct 21, 2025
@pasikarkkainen
Copy link

Thanks a lot! Please backport to 1.4 aswell

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

5 participants