Description
The below Call Trace is noticed in Fedora24 guest after suspend/resume operations.
[ 302.603321] INFO: rcu_sched self-detected stall on CPU
[ 302.603433] 0-...: (91 GPs behind) idle=60d/1/0 softirq=2818/2818 fqs=0
[ 302.603524](t=15081 jiffies g=4573 c=4572 q=18)
[ 302.603899] rcu_sched kthread starved for 15081 jiffies! g4573 c4572 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
[ 302.604030] rcu_sched S 0000000000000000 0 7 2 0x00000800
[ 302.604179] Call Trace:
[ 302.604215] [c00000009eb1b8c0] [c000000001338d08] sysctl_sched_migration_cost+0x0/0x4 (unreliable)
[ 302.604375] [c00000009eb1ba90] [c000000000016274] __switch_to+0x2e4/0x410
[ 302.604464] [c00000009eb1baf0] [c0000000009ae828] __schedule+0x328/0x9d0
[ 302.604553] [c00000009eb1bb80] [c0000000009aef18] schedule+0x48/0xc0
[ 302.604640] [c00000009eb1bbb0] [c0000000009b313c] schedule_timeout+0x16c/0x340
[ 302.604743] [c00000009eb1bca0] [c000000000140eac] rcu_gp_kthread+0x8ec/0xc00
[ 302.604846] [c00000009eb1bd80] [c0000000000e5f90] kthread+0x110/0x130
[ 302.604993] [c00000009eb1be30] [c0000000000095b0] ret_from_kernel_thread+0x5c/0xac
[ 302.605145] Task dump for CPU 0:
[ 302.605191] swapper/0 R running task 0 0 0 0x00000004
[ 302.605356] Call Trace:
[ 302.605429] [c0000000013075a0] [c0000000000fb720] sched_show_task+0xe0/0x180
[ 302.605472] INFO: rcu_sched self-detected stall on CPU
[ 302.605475] 2-...: (1 ticks this GP) idle=909/1/0 softirq=3974/3974 fqs=0
[ 302.605475]
[ 302.605477](t=15081 jiffies g=4573 c=4572 q=18)
_GP_WAIT_FQS(3) ->state=0x1
[ 302.605479] rcu_sched S
[ 302.605480] 0000000000000000
[ 302.605481] 0 7 2 0x00000800
[ 302.605481] Call Trace:
+0x0/0x4
302.605484
[ 302.605487] INFO: rcu_sched self-detected stall on CPU
[ 302.605490] [c00000009eb1ba90] [c000000000016274] __switch_to+0x2e4/0x410
[ 302.605492] [c00000009eb1baf0] [c0000000009ae828] __schedule+0x328/0x9d0
[ 302.605495] [c00000009eb1bb80] [c0000000009aef18] schedule+0x48/0xc0
0
[ 302.605499] [c00000009eb1bca0] [c000000000140eac] rcu_gp_kthread+0x8ec/0xc00
[ 302.605502] [c00000009eb1bd80] [c0000000000e5f90] kthread+0x110/0x130
[ 302.605505] 7-...: (49 GPs behind) idle=e19/1/0 softirq=1338/1338 fqs=0
[ 302.605506]
/0xac
[ 302.605510](t=15081 jiffies g=4573 c=4572 q=18)
_GP_WAIT_FQS(3) ->state=0x1
[ 302.605514] rcu_sched S
[ 302.605514] 0000000000000000
[ 302.605516] 0 7 2 0x00000800
The guest has an multifunction NIC adapter hot-plugged, and after suspend/resume, the Call Trace is noticed.
Without the NIC Hot-plugged, the issue isn't noticed in the guest after suspend/resume.
Apart from the Call Trace being noticed, the guest resumes fine without any further errors being noticed.
Guest Details
uname -a
Linux localhost.localdomain 4.5.5-300.fc24.ppc64le #1 SMP Tue May 24 12:23:26 UTC 2016 ppc64le ppc64le ppc64le GNU/Linux
Steps to Reproduce
- Hot-plug an multifunction adapter to Fedora24 guest using virsh:
virsh attach-device fedora24-san hotadd-nic.xml --live
Device attached successfully
Inside the guest, the device is listed as:
lspci -nn
00:01.0 Ethernet controller [0200]: Red Hat, Inc Virtio network device [1af4:1000]
00:02.0 USB controller [0c03]: Apple Inc. KeyLargo/Intrepid USB [106b:003f]
00:03.0 SCSI storage controller [0100]: Red Hat, Inc Virtio block device [1af4:1001]
00:04.0 Unclassified device [00ff]: Red Hat, Inc Virtio memory balloon [1af4:1002]
00:05.0 Ethernet controller [0200]: Broadcom Corporation NetXtreme II BCM57800 1/10 Gigabit Ethernet [14e4:168a](rev 10)
00:05.1 Ethernet controller [0200]: Broadcom Corporation NetXtreme II BCM57800 1/10 Gigabit Ethernet [14e4:168a](rev 10)
00:05.2 Ethernet controller [0200]: Broadcom Corporation NetXtreme II BCM57800 1/10 Gigabit Ethernet [14e4:168a](rev 10)
00:05.3 Ethernet controller [0200]: Broadcom Corporation NetXtreme II BCM57800 1/10 Gigabit Ethernet [14e4:168a](rev 10)
00:0f.0 USB controller [0c03]: NEC Corporation uPD720200 USB 3.0 Host Controller [1033:0194](rev 03)
[00:05.0 till 00:05.3] are the hot-plugged device.
-
Once the device is hot-plugged to the guest, suspend the guest:
virsh suspend fedora24-san
Domain fedora24-san suspended
-
After resuming the guest, the Call Trace is noticed.
virsh resume fedora24-san
Domain fedora24-san resumed
-
Without the NIC Hot-plugged, the CallTrace is unreproducible with suspend/resume.
dmesg.txt