Empty OSPF routing table (Network Type Null issue?) #4178
Description
I noticed that the OSPF daemon sometimes does not generate a routing table, despite having a populated link-state database, fully established neighbors, etc.
The bug appears randomly and not with all topologies. For example, with the nobel_eu topology it occurs in approximately 1 out of 20 network startups, on a random node. It can also be triggered by link cost changes during network operation. With the polska topology, on the other hand, it does not happen at all.
I have observed it at least since January. I am using master on Linux 4.19 in Mininet.
I have dumped some information from an ospfd instance that was hit by the bug: Zagreb.txt
First, ospfd has all the LSAs and fully adjacent neighbors:
Area ID: 0.0.0.0 (Backbone)
Number of interfaces in this area: Total: 3, Active: 3
Number of fully adjacent neighbors in this area: 3
Area has no authentication
SPF algorithm executed 16 times
Number of LSA 69
Number of router LSA 28. Checksum Sum 0x000dff75
Number of network LSA 41. Checksum Sum 0x0012f00c
However, the routing tables are empty:
Zagreb# show ip ospf route
============ OSPF network routing table ============
============ OSPF router routing table =============
============ OSPF external routing table ===========
The link count in the database for this router's own entry is zero:
Zagreb# show ip ospf database
OSPF Router with ID (10.127.0.38)
Router Link States (Area 0.0.0.0)
Link ID ADV Router Age Seq# CkSum Link count
10.127.0.1 10.127.0.1 355 0x80000012 0x92bd 4
10.127.0.2 10.127.0.2 366 0x8000000e 0xa7ef 3
10.127.0.6 10.127.0.6 407 0x8000000b 0x50d1 2
10.127.0.10 10.127.0.10 394 0x8000000d 0x3444 3
10.127.0.14 10.127.0.14 376 0x8000000e 0xba34 3
10.127.0.17 10.127.0.17 392 0x8000000c 0xdf9c 2
10.127.0.18 10.127.0.18 361 0x8000000e 0x37b0 3
10.127.0.22 10.127.0.22 350 0x8000000f 0x9b2e 4
10.127.0.25 10.127.0.25 405 0x8000000b 0xa3a9 2
10.127.0.26 10.127.0.26 303 0x8000000f 0x8ba1 3
10.127.0.30 10.127.0.30 385 0x8000000a 0xcd2d 2
10.127.0.34 10.127.0.34 362 0x8000000e 0x6c8a 3
10.127.0.38 10.127.0.38 419 0x8000000e 0x30c6 0 <----- HERE
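Spotting the affected node by eye is tedious across many Mininet hosts. Below is a minimal sketch of an automated check, assuming the `show ip ospf database` output has been captured as text in the column layout shown above. The function name `zero_link_routers` is made up for illustration and is not part of FRR.

```python
import re

# One row of the "Router Link States" table:
# Link ID, ADV Router, Age, Seq#, CkSum, Link count.
# Rows from other LSA tables lack a trailing integer column and do not match.
ROW = re.compile(
    r"^\s*(\d+\.\d+\.\d+\.\d+)\s+(\d+\.\d+\.\d+\.\d+)\s+(\d+)\s+"
    r"(0x[0-9a-fA-F]+)\s+(0x[0-9a-fA-F]+)\s+(\d+)(?:\s|$)"
)

def zero_link_routers(database_output):
    """Return advertising router IDs whose router LSA lists zero links."""
    empty = []
    for line in database_output.splitlines():
        m = ROW.match(line)
        if m and int(m.group(6)) == 0:
            empty.append(m.group(2))
    return empty

# Two rows taken from the output above.
sample = """\
Link ID ADV Router Age Seq# CkSum Link count
10.127.0.34 10.127.0.34 362 0x8000000e 0x6c8a 3
10.127.0.38 10.127.0.38 419 0x8000000e 0x30c6 0
"""
print(zero_link_routers(sample))
```

A router that appears in the returned list and is also the local Router ID is a candidate for this bug.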
Examining the interface details reveals something alarming:
Zagreb# show ip ospf interface
eth1 is up
ifindex 2, MTU 1500 bytes, BW 10 Mbit <UP,BROADCAST,RUNNING,PROMISC,MULTICAST>
Internet Address 10.127.0.38/30, Broadcast 10.127.0.39, Area 0.0.0.0
MTU mismatch detection: enabled
Router ID 10.127.0.38, Network Type Null, Cost: 10
Transmit Delay is 1 sec, State DR, Priority 1
Backup Designated Router (ID) 10.127.0.18, Interface Address 10.127.0.37
Saved Network-LSA sequence number 0x80000005
Multicast group memberships: OSPFAllRouters
Timer intervals configured, Hello 10s, Dead 40s, Wait 40s, Retransmit 5
Hello due in 8.153s
Neighbor Count is 1, Adjacent neighbor count is 1
eth2 is up
ifindex 4, MTU 1500 bytes, BW 10 Mbit <UP,BROADCAST,RUNNING,PROMISC,MULTICAST>
Internet Address 10.127.0.146/30, Broadcast 10.127.0.147, Area 0.0.0.0
MTU mismatch detection: enabled
Router ID 10.127.0.38, Network Type Null, Cost: 10
Transmit Delay is 1 sec, State DR, Priority 1
Backup Designated Router (ID) 10.127.0.22, Interface Address 10.127.0.145
Saved Network-LSA sequence number 0x80000005
Multicast group memberships: OSPFAllRouters
Timer intervals configured, Hello 10s, Dead 40s, Wait 40s, Retransmit 5
Hello due in 8.153s
Neighbor Count is 1, Adjacent neighbor count is 1
eth3 is up
ifindex 6, MTU 1500 bytes, BW 10 Mbit <UP,BROADCAST,RUNNING,PROMISC,MULTICAST>
Internet Address 10.127.0.158/30, Broadcast 10.127.0.159, Area 0.0.0.0
MTU mismatch detection: enabled
Router ID 10.127.0.38, Network Type Null, Cost: 10
Transmit Delay is 1 sec, State Backup, Priority 1
Backup Designated Router (ID) 10.127.0.38, Interface Address 10.127.0.158
Multicast group memberships: OSPFAllRouters
Timer intervals configured, Hello 10s, Dead 40s, Wait 40s, Retransmit 5
Hello due in 8.153s
Neighbor Count is 1, Adjacent neighbor count is 1
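The network type reported for each interface can be pulled out of output like this with a short script. This is a minimal sketch that assumes the `show ip ospf interface` layout shown above. The helper name `network_types` is hypothetical, and the `BROADCAST` spelling in the sample is an assumption about what a healthy interface prints. Only the `Null` value is taken from this report.

```python
import re

IFACE = re.compile(r"^(\S+) is (?:up|down)\b")       # e.g. "eth1 is up"
NET_TYPE = re.compile(r"Network Type (\S+?),")       # e.g. "Network Type Null,"

def network_types(interface_output):
    """Map each interface name to the network type string printed for it."""
    types = {}
    current = None
    for line in interface_output.splitlines():
        line = line.strip()
        m = IFACE.match(line)
        if m:
            current = m.group(1)
            continue
        m = NET_TYPE.search(line)
        if m and current is not None:
            types[current] = m.group(1)
    return types

sample = """\
eth1 is up
Router ID 10.127.0.38, Network Type Null, Cost: 10
eth2 is up
Router ID 10.127.0.38, Network Type BROADCAST, Cost: 10
"""
types = network_types(sample)
# Interfaces whose network type was never initialised.
suspicious = [name for name, kind in types.items() if kind == "Null"]
print(suspicious)
```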
All interfaces have Network Type Null. I wondered what would happen if I hard-coded the network type in the configuration, so I changed ospfd.conf
from:
interface eth1
ip ospf cost 10
ip ospf area 0
!
interface eth2
ip ospf cost 10
ip ospf area 0
!
interface eth3
ip ospf cost 10
ip ospf area 0
!
router ospf
ospf router-id 10.127.0.38
to:
interface eth1
ip ospf cost 10
ip ospf area 0
ip ospf network broadcast
!
interface eth2
ip ospf cost 10
ip ospf area 0
ip ospf network broadcast
!
interface eth3
ip ospf cost 10
ip ospf area 0
ip ospf network broadcast
!
router ospf
ospf router-id 10.127.0.38
Surprisingly, this solved the problem! The routing table is now always generated, both on startup and after cost changes. However, specifying the network type should not be required: ospfd should assume the BROADCAST network type by default when none is specified. Apparently there is a problem with initialization: the default value is not being set, so the network type field stays zeroed, which is an invalid (Null) value.
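Until the default is fixed, the workaround can be applied mechanically to generated configurations. The following is a minimal sketch, assuming ospfd.conf uses the flat layout shown above, with interface stanzas terminated by `!`. The function name `add_broadcast_network_type` is hypothetical. The script only edits text and does not talk to ospfd.

```python
def add_broadcast_network_type(config_text):
    """Add 'ip ospf network broadcast' to interface stanzas lacking a network type."""
    out = []
    in_interface = False   # currently inside an "interface ..." stanza
    has_network = False    # stanza already contains "ip ospf network ..."
    for line in config_text.splitlines():
        stripped = line.strip()
        starts_block = stripped.startswith(("interface ", "router "))
        # A "!" separator or the start of another block closes the stanza.
        if in_interface and (stripped == "!" or starts_block):
            if not has_network:
                out.append("ip ospf network broadcast")
            in_interface = False
        if stripped.startswith("interface "):
            in_interface, has_network = True, False
        elif in_interface and stripped.startswith("ip ospf network "):
            has_network = True
        out.append(line)
    # Handle a stanza that runs to the end of the file.
    if in_interface and not has_network:
        out.append("ip ospf network broadcast")
    return "\n".join(out) + "\n"

original = """\
interface eth1
ip ospf cost 10
ip ospf area 0
!
router ospf
ospf router-id 10.127.0.38
"""
patched = add_broadcast_network_type(original)
print(patched)
```

Running the function on an already patched file leaves it unchanged, so it should be safe to apply on every network startup.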