[5.9] intermittently semi-broken network after boot on RPi4/kernel 5.9 #3850

@HiassofT

Describe the bug

Sometimes, after boot, I get huge ping times on Ethernet, network packets leave the RPi4 very slowly (up to several seconds between TCP packets), and NFS is (almost) non-functional.

Most of the time network performance is fine, though (about 0.1ms ping time, and the Linux part of the NFS boot completes in about 20 seconds).

To reproduce
So far I've only seen the issue when netbooting the RPi4, though I wouldn't rule out that it can happen with SD card boot as well. I usually hit the issue in about 1-3 out of 10 boots, but sometimes it takes more than 10 reboots.
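
Because the failure is intermittent, a run of clean boots does not say much on its own. As a rough sketch, assuming each boot fails independently with a fixed probability (an assumption for illustration, not something I measured), the chance of seeing 10 good boots in a row at the rates quoted above is:

```python
def chance_of_clean_run(p_fail, boots):
    """Probability that `boots` consecutive boots all come up fine,
    assuming independent boots with a fixed failure probability."""
    return (1.0 - p_fail) ** boots

# 1 in 10 and 3 in 10 are the rough failure rates quoted above.
for p_fail in (0.1, 0.3):
    print(f"p_fail={p_fail}: {chance_of_clean_run(p_fail, 10):.3f}")
# prints:
# p_fail=0.1: 0.349
# p_fail=0.3: 0.028
```

At the low end (1 in 10), roughly a third of 10-boot runs would show no failure at all under this assumption, which fits with sometimes needing more than 10 reboots to trigger it.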

Expected behaviour
Network performs normally (low ping, working NFS mounts), like with 5.4 kernel.

Actual behaviour
Kernel-level DHCP configuration and the root NFS mount sometimes take very long; NFS performance is sometimes so bad that the userspace system fails to start up.

System
4GB RPi4 with B1 stepping, Sep 03 boot eeprom, Sep 11 firmware, kernel 5.9-rc4 githash 4ba2756

Tested with current 32bit RaspiOS (bcm2711_defconfig) and LibreELEC master (custom kernel config).

Logs
Working network, as a reference: http://ix.io/2xzu , ping time is about 0.1ms

One try with semi-working NFS, systemd somewhat came up http://ix.io/2xzw , ping was about 300-500ms:

64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=1 ttl=64 time=0.494 ms
64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=2 ttl=64 time=0.442 ms
64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=3 ttl=64 time=0.360 ms
64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=4 ttl=64 time=0.464 ms
64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=5 ttl=64 time=0.425 ms
64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=6 ttl=64 time=0.344 ms
64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=7 ttl=64 time=0.404 ms
64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=8 ttl=64 time=0.345 ms
64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=9 ttl=64 time=0.425 ms
64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=10 ttl=64 time=0.387 ms
64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=11 ttl=64 time=0.330 ms
64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=12 ttl=64 time=0.435 ms

Another try, NFS pretty much unusable: http://ix.io/2xzx , ping up to 1024ms

64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=1 ttl=64 time=0.333 ms
64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=2 ttl=64 time=0.348 ms
64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=3 ttl=64 time=0.267 ms
64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=4 ttl=64 time=1024 ms
64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=5 ttl=64 time=0.312 ms
64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=6 ttl=64 time=1024 ms
64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=7 ttl=64 time=1024 ms
64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=8 ttl=64 time=0.293 ms
64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=9 ttl=64 time=0.334 ms
64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=10 ttl=64 time=0.317 ms
64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=11 ttl=64 time=0.342 ms
64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=12 ttl=64 time=0.336 ms
64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=13 ttl=64 time=0.322 ms
64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=14 ttl=64 time=1024 ms
64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=15 ttl=64 time=0.291 ms
64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=16 ttl=64 time=0.337 ms
64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=17 ttl=64 time=0.250 ms
64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=18 ttl=64 time=0.349 ms
64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=19 ttl=64 time=0.321 ms
64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=20 ttl=64 time=0.270 ms
64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=21 ttl=64 time=0.382 ms
64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=22 ttl=64 time=0.305 ms
64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=23 ttl=64 time=0.239 ms
64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=24 ttl=64 time=0.409 ms

Additional context

In other tests I had seen rather funny ping times of 1024, 2048, 3072 and 4096ms.
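
To pick such replies out of a longer ping log, something like the following could be used. It is only a sketch: the function name `stalled_replies`, the 1024 ms step and the 5 ms tolerance are assumptions chosen for illustration, and it expects the Linux `ping` reply format shown in the logs above.

```python
import re

# Matches the icmp_seq and "time=<value> ms" fields of a Linux ping reply line.
TIME_RE = re.compile(r"icmp_seq=(\d+) .*time=([\d.]+) ms")

def stalled_replies(lines, step_ms=1024.0, tolerance_ms=5.0):
    """Return (icmp_seq, time_ms, k) for replies within tolerance of k * step_ms, k >= 1."""
    hits = []
    for line in lines:
        m = TIME_RE.search(line)
        if not m:
            continue  # not a reply line (header, summary, timeout, ...)
        seq, time_ms = int(m.group(1)), float(m.group(2))
        k = round(time_ms / step_ms)  # nearest multiple of the step
        if k >= 1 and abs(time_ms - k * step_ms) <= tolerance_ms:
            hits.append((seq, time_ms, k))
    return hits

sample = [
    "64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=3 ttl=64 time=0.267 ms",
    "64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=4 ttl=64 time=1024 ms",
    "64 bytes from rpi4b1.lan (192.168.1.70): icmp_seq=5 ttl=64 time=0.312 ms",
]
print(stalled_replies(sample))  # prints [(4, 1024.0, 1)]
```

Replies that are merely slow (say 500 ms) are not reported; only those sitting on a multiple of the step are, which is the pattern that stands out here.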

I had seen this issue with the previous rpi-eeprom version as well, but only on kernel 5.9. I used that eeprom version very often to netboot kernel 5.4 systems (mainly LibreELEC) and never saw it there.
