Description
I will explain via an example. Apologies if I misuse some terminology; I am not an expert in PCIe.
We are using AVED + QDMA, with one PCIe device (.0) for the AVED interface and another (.1) for QDMA:
86:00.0 Processing accelerators: Xilinx Corporation Device 50b4
        Subsystem: Xilinx Corporation Device 000e
        [...]
        Kernel driver in use: ami
        Kernel modules: ami
86:00.1 Processing accelerators: Xilinx Corporation Device 50b5
        Subsystem: Xilinx Corporation Device 000e
        [...]
        Kernel driver in use: qdma-pf
        Kernel modules: qdma_pf, ami
Sadly, when we swap our FPGA images using the AMI tool device boot command
ami_tool device_boot -d 86 -p 0
the QDMA interface stops working, because the QDMA kernel module does not see its magic number (it sees 0xFFFF instead).
We eventually found an AMD forum post with a simple fix: remove the QDMA device and rescan the bus.
We can use this fix (it boils down to writing a 1 to /sys/bus/pci/devices/0000:86:00.1/remove before running ami_tool device_boot).
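For anyone scripting this workaround, here is a minimal Python sketch of the removal step. The helper name `remove_pci_function` and its `sysfs_root` parameter are my own illustrative choices (the parameter is only there so the helper can be tried against a scratch directory); the sysfs path is the one quoted above. On a real system it needs root.

```python
from pathlib import Path


def remove_pci_function(bdf, sysfs_root="/sys/bus/pci/devices"):
    """Detach one PCI function by writing 1 to its sysfs 'remove' file.

    bdf is the full sysfs name, e.g. "0000:86:00.1".
    """
    remove_file = Path(sysfs_root) / bdf / "remove"
    if not remove_file.exists():
        # Nothing to remove: the function is not currently enumerated.
        raise FileNotFoundError(f"{bdf} is not present under {sysfs_root}")
    remove_file.write_text("1")
    return remove_file
```

Calling `remove_pci_function("0000:86:00.1")` and then running `ami_tool device_boot -d 86 -p 0` corresponds to the manual fix described above.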
However, I feel that it would be best if ami_tool called remove for all of the card's PCIe devices prior to rescanning the bus.
ami_dev_hot_reset (in sw/AMI/api/src/ami_device.c) follows these steps:
find the PCI port
read some config
set PMC GPIO
pci_remove(0x8600) // <-- only device 86:00.0
// pci_remove(0x8601) // <-- this does not happen, but I think it should!
read bridge control
set SBR
reset SBR
small sleep
pci_rescan()
Could it be changed so that pci_remove is called for each BDF produced by the card?
I am guessing that the problem is not specific to QDMA, but that any PCIe device other than the AMI one will be broken by AMI resetting the card... Does AMI have knowledge of all the BDFs that a card is producing? Could it discover them?
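One way the functions could be discovered is to list the sysfs entries that share the AMI function's domain, bus and device numbers. The Python sketch below only illustrates the idea; it is not AMI code, and `sibling_functions` and `sysfs_root` are hypothetical names. It assumes all of the card's functions sit on the same bus and device number, as in the lspci output above.

```python
from pathlib import Path


def sibling_functions(bdf, sysfs_root="/sys/bus/pci/devices"):
    """Return every enumerated function sharing bdf's domain:bus:device.

    bdf is a sysfs name such as "0000:86:00.0"; the result includes bdf
    itself if it is present.
    """
    # "0000:86:00.0" -> "0000:86:00." so that only the function digit varies.
    prefix = bdf.rsplit(".", 1)[0] + "."
    return sorted(
        entry.name
        for entry in Path(sysfs_root).iterdir()
        if entry.name.startswith(prefix)
    )
```

On the system above, `sibling_functions("0000:86:00.0")` would be expected to list both the `.0` and `.1` entries. Since the reset is a secondary bus reset on the upstream bridge, a more general variant would probably need to consider everything below that bridge rather than matching on one device number.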
I'm tempted to just hack in a line like pci_remove(bdf | 1), but I am aware that is not a good fix (e.g. it will not be happy in FPGA designs that do not have a '.1' PCIe device...)
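To illustrate why `bdf | 1` is fragile: assuming the 16-bit value passed to `pci_remove` uses the conventional layout (bus in bits 15:8, device in bits 7:3, function in bits 2:0), which is consistent with 0x8600 meaning 86:00.0, OR-ing in 1 always names function 1 whether or not it exists. A less brittle approach would mask off the function bits and check all eight possible functions, keeping only those that are actually present. The Python sketch below shows that logic; the helper names are hypothetical and the layout is an assumption on my part.

```python
from pathlib import Path


def format_bdf(bdf, domain=0):
    """Render a 16-bit bus/device/function value as a sysfs-style name."""
    bus = (bdf >> 8) & 0xFF
    dev = (bdf >> 3) & 0x1F
    fn = bdf & 0x7
    return f"{domain:04x}:{bus:02x}:{dev:02x}.{fn:x}"


def present_functions(bdf, sysfs_root="/sys/bus/pci/devices"):
    """Return the BDF values of all functions 0..7 on bdf's device that exist."""
    base = bdf & ~0x7  # clear the 3 function bits
    candidates = [base | fn for fn in range(8)]
    return [c for c in candidates if (Path(sysfs_root) / format_bdf(c)).exists()]
```

With this, the reset path could loop over `present_functions(0x8600)` and remove each entry, so a design without a '.1' function would simply yield a shorter list.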