OCP TFTP install fails with 100 GbitE Mellanox adapter #226
Description
This is not a new issue in OCP 4.9; we have seen it since OCP 4.6. It is very likely not specific to the OCP install, given that it happens when the first node (the OCP bootstrap node) is installed.
The OCP bare-metal install with a 100 GbitE Mellanox adapter is likely just the usage scenario that exposes the problem. We used three S922 systems, each with a 100 GbitE Mellanox adapter and a 1 GbitE adapter. The OCP cluster's private network is defined on the SR-IOV shared 100 GbitE network interface.
After setting up DHCP, dnsmasq, httpd, haproxy and the firewall rules, I activated the OCP node (bootstrap) from the HMC's System Management Services (SMS) shell and could see that the node can't reach the TFTP server. The install on that node ends with error "!BA017021". I have tried different layouts, with the bootstrap node on the same server as the bastion (the node running DHCP, dnsmasq, httpd and haproxy) and on a different server - neither case works. When I switched from TFTP boot to virtual media on a VIOS server (for the ISO image), the install worked. In that case the node was able to pull the other files from the httpd server over the private network (without any firewall, DNS or httpd changes).
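
To help separate a TFTP server problem from an adapter or firmware problem, here is a minimal sketch of a TFTP probe that could be run from another Linux host or LPAR on the same private network. It is not part of our setup: it only sends a standard read request (RRQ, as defined in RFC 1350) to UDP port 69 and reports the first reply. The function names, the example addresses and the example file name are all hypothetical.

```python
import socket
import struct

TFTP_PORT = 69                        # well-known TFTP port
OP_RRQ, OP_DATA, OP_ERROR = 1, 3, 5   # opcodes from RFC 1350


def build_rrq(filename: str, mode: str = "octet") -> bytes:
    """Build a TFTP read request: opcode, filename, NUL, mode, NUL."""
    return (
        struct.pack("!H", OP_RRQ)
        + filename.encode("ascii") + b"\x00"
        + mode.encode("ascii") + b"\x00"
    )


def describe_reply(packet: bytes) -> str:
    """Summarize the first packet sent back by the TFTP server."""
    if len(packet) < 4:
        return "malformed reply"
    opcode, value = struct.unpack("!HH", packet[:4])
    if opcode == OP_DATA:
        return f"DATA block {value}, {len(packet) - 4} bytes"
    if opcode == OP_ERROR:
        message = packet[4:].split(b"\x00", 1)[0].decode("ascii", "replace")
        return f"ERROR {value}: {message}"
    return f"unexpected opcode {opcode}"


def probe_tftp(server: str, filename: str, source_ip: str = "",
               timeout: float = 5.0) -> str:
    """Send one RRQ and report whether the server answers at all.

    source_ip (hypothetical parameter) lets the probe leave through a
    specific interface, e.g. the one on the 100 GbitE private network.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.bind((source_ip, 0))
        sock.sendto(build_rrq(filename), (server, TFTP_PORT))
        try:
            reply, _addr = sock.recvfrom(2048)
        except socket.timeout:
            return "no reply (timeout)"
    return describe_reply(reply)


# Example call (addresses and file name are placeholders, not from our setup):
#   print(probe_tftp("192.0.2.10", "boot/grub2/grub.cfg", source_ip="192.0.2.20"))
```

If a probe like this gets a DATA or ERROR reply over the 100 GbitE interface while the SMS network boot still fails, that would point away from the bastion's TFTP and firewall configuration and toward the boot firmware path on the adapter. A timeout would instead suggest checking the bastion side first.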