Commit 6d2c8944 authored by Kuppuswamy Sathyanarayanan's avatar Kuppuswamy Sathyanarayanan Committed by Bjorn Helgaas
Browse files

PCI/ERR: Update error status after reset_link()

Commit bdb5ac85 ("PCI/ERR: Handle fatal error recovery") uses
reset_link() to recover from fatal errors.  But during fatal error
recovery, if the initial value of error status is PCI_ERS_RESULT_DISCONNECT
or PCI_ERS_RESULT_NO_AER_DRIVER then even after successful recovery (using
reset_link()) pcie_do_recovery() will report the recovery result as
failure.  Update the status of error after reset_link().

You can reproduce this issue by triggering a SW DPC using "DPC Software
Trigger" bit in "DPC Control Register".  You should see recovery failed
dmesg log as below:

  pcieport 0000:00:16.0: DPC: containment event, status:0x1f27 source:0x0000
  pcieport 0000:00:16.0: DPC: software trigger detected
  pci 0000:04:00.0: AER: can't recover (no error_detected callback)
  pcieport 0000:00:16.0: AER: device recovery failed

Fixes: bdb5ac85 ("PCI/ERR: Handle fatal error recovery")
Link: https://lore.kernel.org/r/a255fcb3a3fdebcd90f84e08b555f1786eb8eba2.1585000084.git.sathyanarayanan.kuppuswamy@linux.intel.com


[bhelgaas: split pci_channel_io_frozen simplification to separate patch]
Signed-off-by: default avatarKuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
Acked-by: default avatarKeith Busch <keith.busch@intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
parent b5dfbeac
Loading
Loading
Loading
Loading
+2 −1
Original line number Diff line number Diff line
@@ -205,7 +205,8 @@ void pcie_do_recovery(struct pci_dev *dev, enum pci_channel_state state,
	pci_dbg(dev, "broadcast error_detected message\n");
	if (state == pci_channel_io_frozen) {
		pci_walk_bus(bus, report_frozen_detected, &status);
		if (reset_link(dev, service) != PCI_ERS_RESULT_RECOVERED)
		status = reset_link(dev, service);
		if (status != PCI_ERS_RESULT_RECOVERED)
			goto failed;
	} else {
		pci_walk_bus(bus, report_normal_detected, &status);