History log of /openbmc/linux/drivers/pci/pcie/dpc.c (Results 1 – 25 of 68)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v6.6.30, v6.6.29, v6.6.28, v6.6.27, v6.6.26, v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8
# d9a28916 18-Oct-2023 Bjorn Helgaas <bhelgaas@google.com>

PCI/DPC: Use FIELD_GET()

[ Upstream commit 9a9eec4765737b9b2a8d6ae03de6480a5f12dd5c ]

Use FIELD_GET() to remove dependencies on the field position, i.e., the
shift value. No functional change inten

PCI/DPC: Use FIELD_GET()

[ Upstream commit 9a9eec4765737b9b2a8d6ae03de6480a5f12dd5c ]

Use FIELD_GET() to remove dependencies on the field position, i.e., the
shift value. No functional change intended.

Link: https://lore.kernel.org/r/20231018113254.17616-5-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

show more ...


Revision tags: v6.6.30, v6.6.29, v6.6.28, v6.6.27, v6.6.26, v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8
# d9a28916 18-Oct-2023 Bjorn Helgaas <bhelgaas@google.com>

PCI/DPC: Use FIELD_GET()

[ Upstream commit 9a9eec4765737b9b2a8d6ae03de6480a5f12dd5c ]

Use FIELD_GET() to remove dependencies on the field position, i.e., the
shift value. No functional change inten

PCI/DPC: Use FIELD_GET()

[ Upstream commit 9a9eec4765737b9b2a8d6ae03de6480a5f12dd5c ]

Use FIELD_GET() to remove dependencies on the field position, i.e., the
shift value. No functional change intended.

Link: https://lore.kernel.org/r/20231018113254.17616-5-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

show more ...


Revision tags: v6.6.30, v6.6.29, v6.6.28, v6.6.27, v6.6.26, v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8
# d9a28916 18-Oct-2023 Bjorn Helgaas <bhelgaas@google.com>

PCI/DPC: Use FIELD_GET()

[ Upstream commit 9a9eec4765737b9b2a8d6ae03de6480a5f12dd5c ]

Use FIELD_GET() to remove dependencies on the field position, i.e., the
shift value. No functional change inten

PCI/DPC: Use FIELD_GET()

[ Upstream commit 9a9eec4765737b9b2a8d6ae03de6480a5f12dd5c ]

Use FIELD_GET() to remove dependencies on the field position, i.e., the
shift value. No functional change intended.

Link: https://lore.kernel.org/r/20231018113254.17616-5-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

show more ...


Revision tags: v6.6.30, v6.6.29, v6.6.28, v6.6.27, v6.6.26, v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8
# d9a28916 18-Oct-2023 Bjorn Helgaas <bhelgaas@google.com>

PCI/DPC: Use FIELD_GET()

[ Upstream commit 9a9eec4765737b9b2a8d6ae03de6480a5f12dd5c ]

Use FIELD_GET() to remove dependencies on the field position, i.e., the
shift value. No functional change inten

PCI/DPC: Use FIELD_GET()

[ Upstream commit 9a9eec4765737b9b2a8d6ae03de6480a5f12dd5c ]

Use FIELD_GET() to remove dependencies on the field position, i.e., the
shift value. No functional change intended.

Link: https://lore.kernel.org/r/20231018113254.17616-5-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

show more ...


Revision tags: v6.6.30, v6.6.29, v6.6.28, v6.6.27, v6.6.26, v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8
# d9a28916 18-Oct-2023 Bjorn Helgaas <bhelgaas@google.com>

PCI/DPC: Use FIELD_GET()

[ Upstream commit 9a9eec4765737b9b2a8d6ae03de6480a5f12dd5c ]

Use FIELD_GET() to remove dependencies on the field position, i.e., the
shift value. No functional change inten

PCI/DPC: Use FIELD_GET()

[ Upstream commit 9a9eec4765737b9b2a8d6ae03de6480a5f12dd5c ]

Use FIELD_GET() to remove dependencies on the field position, i.e., the
shift value. No functional change intended.

Link: https://lore.kernel.org/r/20231018113254.17616-5-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

show more ...


# ef8a156c 18-Jan-2024 Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

PCI/DPC: Print all TLP Prefixes, not just the first

[ Upstream commit 6568d82512b0a64809acff3d7a747362fa4288c8 ]

The TLP Prefix Log Register consists of multiple DWORDs (PCIe r6.1 sec
7.9.14.13) bu

PCI/DPC: Print all TLP Prefixes, not just the first

[ Upstream commit 6568d82512b0a64809acff3d7a747362fa4288c8 ]

The TLP Prefix Log Register consists of multiple DWORDs (PCIe r6.1 sec
7.9.14.13) but the loop in dpc_process_rp_pio_error() keeps reading from
the first DWORD, so we print only the first PIO TLP Prefix (duplicated
several times), and we never print the second, third, etc., Prefixes.

Add the iteration count based offset calculation into the config read.

Fixes: f20c4ea49ec4 ("PCI/DPC: Add eDPC support")
Link: https://lore.kernel.org/r/20240118110815.3867-1-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
[bhelgaas: add user-visible details to commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

show more ...


Revision tags: v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45, v6.1.44, v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35, v6.1.34, v6.1.33, v6.1.32, v6.1.31, v6.1.30, v6.1.29, v6.1.28, v6.1.27, v6.1.26, v6.3, v6.1.25, v6.1.24, v6.1.23
# e74b2b58 04-Apr-2023 Mika Westerberg <mika.westerberg@linux.intel.com>

PCI/PM: Drop pci_bridge_wait_for_secondary_bus() timeout parameter

All callers of pci_bridge_wait_for_secondary_bus() supply a timeout of
PCIE_RESET_READY_POLL_MS, so drop the parameter. Move the d

PCI/PM: Drop pci_bridge_wait_for_secondary_bus() timeout parameter

All callers of pci_bridge_wait_for_secondary_bus() supply a timeout of
PCIE_RESET_READY_POLL_MS, so drop the parameter. Move the definition of
PCIE_RESET_READY_POLL_MS into pci.c, the only user.

[bhelgaas: extracted from
https://lore.kernel.org/r/20230404052714.51315-3-mika.westerberg@linux.intel.com]
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

show more ...


Revision tags: v6.1.22, v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9, v6.1.8, v6.1.7
# 53b54ad0 15-Jan-2023 Lukas Wunner <lukas@wunner.de>

PCI/DPC: Await readiness of secondary bus after reset

pci_bridge_wait_for_secondary_bus() is called after a Secondary Bus
Reset, but not after a DPC-induced Hot Reset.

As a result, the delays presc

PCI/DPC: Await readiness of secondary bus after reset

pci_bridge_wait_for_secondary_bus() is called after a Secondary Bus
Reset, but not after a DPC-induced Hot Reset.

As a result, the delays prescribed by PCIe r6.0 sec 6.6.1 are not
observed and devices on the secondary bus may be accessed before
they're ready.

One affected device is Intel's Ponte Vecchio HPC GPU. It comprises a
PCIe switch whose upstream port is not immediately ready after reset.
Because its config space is restored too early, it remains in
D0uninitialized, its subordinate devices remain inaccessible and DPC
recovery fails with messages such as:

i915 0000:8c:00.0: can't change power state from D3cold to D0 (config space inaccessible)
intel_vsec 0000:8e:00.1: can't change power state from D3cold to D0 (config space inaccessible)
pcieport 0000:89:02.0: AER: device recovery failed

Fix it.

Link: https://lore.kernel.org/r/9f5ff00e1593d8d9a4b452398b98aa14d23fca11.1673769517.git.lukas@wunner.de
Tested-by: Ravi Kishore Koppuravuri <ravi.kishore.koppuravuri@intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: stable@vger.kernel.org

show more ...


Revision tags: v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12, v6.0.11, v6.0.10, v5.15.80, v6.0.9, v5.15.79, v6.0.8, v5.15.78, v6.0.7, v5.15.77, v5.15.76, v6.0.6, v6.0.5, v5.15.75, v6.0.4, v6.0.3, v6.0.2, v5.15.74, v5.15.73, v6.0.1, v5.15.72, v6.0, v5.15.71, v5.15.70, v5.15.69, v5.15.68, v5.15.67, v5.15.66, v5.15.65, v5.15.64, v5.15.63, v5.15.62, v5.15.61
# 5459c0b7 16-Aug-2022 Mika Westerberg <mika.westerberg@linux.intel.com>

PCI/DPC: Quirk PIO log size for certain Intel Root Ports

Some Root Ports on Intel Tiger Lake and Alder Lake systems support the RP
Extensions for DPC and the RP PIO Log registers but incorrectly adv

PCI/DPC: Quirk PIO log size for certain Intel Root Ports

Some Root Ports on Intel Tiger Lake and Alder Lake systems support the RP
Extensions for DPC and the RP PIO Log registers but incorrectly advertise
an RP PIO Log Size of zero. This means the kernel complains that:

DPC: RP PIO log size 0 is invalid

and if DPC is triggered, the DPC driver will not dump the RP PIO Log
registers when it should.

This is caused by a BIOS bug and should be fixed the BIOS for future CPUs.

Add a quirk to set the correct RP PIO Log size for the affected Root Ports.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=209943
Link: https://lore.kernel.org/r/20220816102042.69125-1-mika.westerberg@linux.intel.com
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

show more ...


Revision tags: v5.15.60, v5.15.59, v5.19, v5.15.58, v5.15.57, v5.15.56, v5.15.55, v5.15.54, v5.15.53, v5.15.52, v5.15.51, v5.15.50, v5.15.49, v5.15.48, v5.15.47, v5.15.46, v5.15.45, v5.15.44, v5.15.43, v5.15.42, v5.18, v5.15.41, v5.15.40, v5.15.39, v5.15.38, v5.15.37, v5.15.36, v5.15.35, v5.15.34, v5.15.33, v5.15.32, v5.15.31, v5.17, v5.15.30, v5.15.29, v5.15.28, v5.15.27, v5.15.26, v5.15.25, v5.15.24, v5.15.23, v5.15.22, v5.15.21, v5.15.20, v5.15.19, v5.15.18, v5.15.17, v5.4.173, v5.15.16, v5.15.15, v5.16, v5.15.10, v5.15.9, v5.15.8, v5.15.7, v5.15.6, v5.15.5, v5.15.4, v5.15.3
# 0242132d 18-Nov-2021 Naveen Naidu <naveennaidu479@gmail.com>

PCI/DPC: Use PCI_POSSIBLE_ERROR() to check config reads

When config pci_ops.read() can detect failed PCI transactions, the data
returned to the CPU is PCI_ERROR_RESPONSE (~0 or 0xffffffff).

Obvious

PCI/DPC: Use PCI_POSSIBLE_ERROR() to check config reads

When config pci_ops.read() can detect failed PCI transactions, the data
returned to the CPU is PCI_ERROR_RESPONSE (~0 or 0xffffffff).

Obviously a successful PCI config read may *also* return that data if a
config register happens to contain ~0, so it doesn't definitively indicate
an error unless we know the register cannot contain ~0.

Use PCI_POSSIBLE_ERROR() to check the response we get when we read data
from hardware. This unifies PCI error response checking and makes error
checks consistent and easier to find.

Compile tested only.

Link: https://lore.kernel.org/r/9b0632f1f183432149f495cf12bdd5a72cc597a4.1637243717.git.naveennaidu479@gmail.com
Signed-off-by: Naveen Naidu <naveennaidu479@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

show more ...


Revision tags: v5.15.2, v5.15.1, v5.15, v5.14.14, v5.14.13, v5.14.12, v5.14.11, v5.14.10, v5.14.9, v5.14.8, v5.14.7, v5.14.6, v5.10.67, v5.10.66, v5.14.5, v5.14.4, v5.10.65, v5.14.3, v5.10.64, v5.14.2, v5.10.63, v5.14.1, v5.10.62, v5.14, v5.10.61, v5.10.60, v5.10.53, v5.10.52, v5.10.51, v5.10.50, v5.10.49, v5.13, v5.10.46, v5.10.43, v5.10.42, v5.10.41, v5.10.40, v5.10.39, v5.4.119, v5.10.36, v5.10.35, v5.10.34, v5.4.116
# a97396c6 01-May-2021 Lukas Wunner <lukas@wunner.de>

PCI: pciehp: Ignore Link Down/Up caused by DPC

Downstream Port Containment (PCIe r5.0, sec. 6.2.10) disables the link upon
an error and attempts to re-enable it when instructed by the DPC driver.

A

PCI: pciehp: Ignore Link Down/Up caused by DPC

Downstream Port Containment (PCIe r5.0, sec. 6.2.10) disables the link upon
an error and attempts to re-enable it when instructed by the DPC driver.

A slot which is both DPC- and hotplug-capable is currently powered off by
pciehp once DPC is triggered (due to the link change) and powered back up
on successful recovery. That's undesirable, the slot should remain powered
so the hotplugged device remains bound to its driver. DPC notifies the
driver of the error and of successful recovery in pcie_do_recovery() and
the driver may then restore the device to working state.

Moreover, Sinan points out that turning off slot power by pciehp may foil
recovery by DPC: Power off/on is a cold reset concurrently to DPC's warm
reset. Sathyanarayanan reports extended delays or failure in link
retraining by DPC if pciehp brings down the slot.

Fix by detecting whether a Link Down event is caused by DPC and awaiting
recovery if so. On successful recovery, ignore both the Link Down and the
subsequent Link Up event.

Afterwards, check whether the link is down to detect surprise-removal or
another DPC event immediately after DPC recovery. Ensure that the
corresponding DLLSC event is not ignored by synthesizing it and invoking
irq_wake_thread() to trigger a re-run of pciehp_ist().

The IRQ threads of the hotplug and DPC drivers, pciehp_ist() and
dpc_handler(), race against each other. If pciehp is faster than DPC, it
will wait until DPC recovery completes.

Recovery consists of two steps: The first step (waiting for link
disablement) is recognizable by pciehp through a set DPC Trigger Status
bit. The second step (waiting for link retraining) is recognizable through
a newly introduced PCI_DPC_RECOVERING flag.

If DPC is faster than pciehp, neither of the two flags will be set and
pciehp may glean the recovery status from the new PCI_DPC_RECOVERED flag.
The flag is zero if DPC didn't occur at all, hence DLLSC events are not
ignored by default.

pciehp waits up to 4 seconds before assuming that DPC recovery failed and
bringing down the slot. This timeout is not taken from the spec (it
doesn't mandate one) but based on a report from Yicong Yang that DPC may
take a bit more than 3 seconds on HiSilicon's Kunpeng platform.

The timeout is necessary because the DPC Trigger Status bit may never
clear: On Root Ports which support RP Extensions for DPC, the DPC driver
polls the DPC RP Busy bit for up to 1 second before giving up on DPC
recovery. Without the timeout, pciehp would then wait indefinitely for DPC
to complete.

This commit draws inspiration from previous attempts to synchronize DPC
with pciehp:

By Sinan Kaya, August 2018:
https://lore.kernel.org/linux-pci/20180818065126.77912-1-okaya@kernel.org/

By Ethan Zhao, October 2020:
https://lore.kernel.org/linux-pci/20201007113158.48933-1-haifeng.zhao@intel.com/

By Kuppuswamy Sathyanarayanan, March 2021:
https://lore.kernel.org/linux-pci/59cb30f5e5ac6d65427ceaadf1012b2ba8dbf66c.1615606143.git.sathyanarayanan.kuppuswamy@linux.intel.com/

Link: https://lore.kernel.org/r/0be565d97438fe2a6d57354b3aa4e8626952a00b.1619857124.git.lukas@wunner.de
Reported-by: Sinan Kaya <okaya@kernel.org>
Reported-by: Ethan Zhao <haifeng.zhao@intel.com>
Reported-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Tested-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Tested-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Keith Busch <kbusch@kernel.org>

show more ...


# 96c0bf09 01-May-2021 Lukas Wunner <lukas@wunner.de>

PCI: pciehp: Ignore Link Down/Up caused by DPC

[ Upstream commit a97396c6eb13f65bea894dbe7739b2e883d40a3e ]

Downstream Port Containment (PCIe r5.0, sec. 6.2.10) disables the link upon
an error and

PCI: pciehp: Ignore Link Down/Up caused by DPC

[ Upstream commit a97396c6eb13f65bea894dbe7739b2e883d40a3e ]

Downstream Port Containment (PCIe r5.0, sec. 6.2.10) disables the link upon
an error and attempts to re-enable it when instructed by the DPC driver.

A slot which is both DPC- and hotplug-capable is currently powered off by
pciehp once DPC is triggered (due to the link change) and powered back up
on successful recovery. That's undesirable, the slot should remain powered
so the hotplugged device remains bound to its driver. DPC notifies the
driver of the error and of successful recovery in pcie_do_recovery() and
the driver may then restore the device to working state.

Moreover, Sinan points out that turning off slot power by pciehp may foil
recovery by DPC: Power off/on is a cold reset concurrently to DPC's warm
reset. Sathyanarayanan reports extended delays or failure in link
retraining by DPC if pciehp brings down the slot.

Fix by detecting whether a Link Down event is caused by DPC and awaiting
recovery if so. On successful recovery, ignore both the Link Down and the
subsequent Link Up event.

Afterwards, check whether the link is down to detect surprise-removal or
another DPC event immediately after DPC recovery. Ensure that the
corresponding DLLSC event is not ignored by synthesizing it and invoking
irq_wake_thread() to trigger a re-run of pciehp_ist().

The IRQ threads of the hotplug and DPC drivers, pciehp_ist() and
dpc_handler(), race against each other. If pciehp is faster than DPC, it
will wait until DPC recovery completes.

Recovery consists of two steps: The first step (waiting for link
disablement) is recognizable by pciehp through a set DPC Trigger Status
bit. The second step (waiting for link retraining) is recognizable through
a newly introduced PCI_DPC_RECOVERING flag.

If DPC is faster than pciehp, neither of the two flags will be set and
pciehp may glean the recovery status from the new PCI_DPC_RECOVERED flag.
The flag is zero if DPC didn't occur at all, hence DLLSC events are not
ignored by default.

pciehp waits up to 4 seconds before assuming that DPC recovery failed and
bringing down the slot. This timeout is not taken from the spec (it
doesn't mandate one) but based on a report from Yicong Yang that DPC may
take a bit more than 3 seconds on HiSilicon's Kunpeng platform.

The timeout is necessary because the DPC Trigger Status bit may never
clear: On Root Ports which support RP Extensions for DPC, the DPC driver
polls the DPC RP Busy bit for up to 1 second before giving up on DPC
recovery. Without the timeout, pciehp would then wait indefinitely for DPC
to complete.

This commit draws inspiration from previous attempts to synchronize DPC
with pciehp:

By Sinan Kaya, August 2018:
https://lore.kernel.org/linux-pci/20180818065126.77912-1-okaya@kernel.org/

By Ethan Zhao, October 2020:
https://lore.kernel.org/linux-pci/20201007113158.48933-1-haifeng.zhao@intel.com/

By Kuppuswamy Sathyanarayanan, March 2021:
https://lore.kernel.org/linux-pci/59cb30f5e5ac6d65427ceaadf1012b2ba8dbf66c.1615606143.git.sathyanarayanan.kuppuswamy@linux.intel.com/

Link: https://lore.kernel.org/r/0be565d97438fe2a6d57354b3aa4e8626952a00b.1619857124.git.lukas@wunner.de
Reported-by: Sinan Kaya <okaya@kernel.org>
Reported-by: Ethan Zhao <haifeng.zhao@intel.com>
Reported-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Tested-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Tested-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

show more ...


Revision tags: v5.10.33, v5.12, v5.10.32, v5.10.31, v5.10.30, v5.10.27, v5.10.26, v5.10.25, v5.10.24, v5.10.23, v5.10.22, v5.10.21, v5.10.20, v5.10.19, v5.4.101, v5.10.18, v5.10.17, v5.11, v5.10.16, v5.10.15, v5.10.14, v5.10, v5.8.17, v5.8.16, v5.8.15, v5.9, v5.8.14, v5.8.13, v5.8.12, v5.8.11
# 8a614499 17-Sep-2020 Lukas Wunner <lukas@wunner.de>

PCI: pciehp: Reduce noisiness on hot removal

When a PCIe card is hot-removed, the Presence Detect State and Data Link
Layer Link Active bits often do not clear simultaneously. I've seen delays
of u

PCI: pciehp: Reduce noisiness on hot removal

When a PCIe card is hot-removed, the Presence Detect State and Data Link
Layer Link Active bits often do not clear simultaneously. I've seen delays
of up to 244 msec between the two events with Thunderbolt.

After pciehp has brought down the slot in response to the first event, the
other bit may still be set. It's not discernible whether it's set because
a new card is already in the slot or if it will soon clear. So pciehp
tries to bring up the slot and in the latter case fails with a bunch of
messages, some of them at KERN_ERR severity. If the slot is no longer
occupied, the messages are false positives and annoy users.

Stuart Hayes reports the following splat on hot removal:

KERN_INFO pcieport 0000:3c:06.0: pciehp: Slot(180): Link Up
KERN_INFO pcieport 0000:3c:06.0: pciehp: Timeout waiting for Presence Detect
KERN_ERR pcieport 0000:3c:06.0: pciehp: link training error: status 0x0001
KERN_ERR pcieport 0000:3c:06.0: pciehp: Failed to check link status

Dongdong Liu complains about a similar splat:

KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): Link Down
KERN_INFO iommu: Removing device 0000:87:00.0 from group 12
KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): Card present
KERN_INFO pcieport 0000:80:10.0: Data Link Layer Link Active not set in 1000 msec
KERN_ERR pciehp 0000:80:10.0:pcie004: Failed to check link status

Users are particularly irritated to see a bringup attempt even though the
slot was explicitly brought down via sysfs. In a perfect world, we could
avoid this by setting Link Disable on slot bringdown and re-enabling it
upon a Presence Detect State change. In reality however, there are broken
hotplug ports which hardwire Presence Detect to zero, see 80696f991424
("PCI: pciehp: Tolerate Presence Detect hardwired to zero"). Conversely,
PCIe r1.0 hotplug ports hardwire Link Active to zero because Link Active
Reporting wasn't specified before PCIe r1.1. On unplug, some ports first
clear Presence then Link (see Stuart Hayes' splat) whereas others use the
inverse order (see Dongdong Liu's splat). To top it off, there are hotplug
ports which flap the Presence and Link bits on slot bringup, see
6c35a1ac3da6 ("PCI: pciehp: Tolerate initially unstable link").

pciehp is designed to work with all of these variants. Surplus attempts at
slot bringup are a lesser evil than not being able to bring up slots at
all. Although we could try to perfect the behavior for specific hotplug
controllers, we'd risk breaking others or increasing code complexity.

But we can certainly minimize annoyance by emitting only a single message
with KERN_INFO severity if bringup is unsuccessful:

* Drop the "Timeout waiting for Presence Detect" message in
pcie_wait_for_presence(). The sole caller of that function,
pciehp_check_link_status(), ignores the timeout and carries on. It emits
error messages of its own and I don't think this particular message adds
much value.

* There's a single error condition in pciehp_check_link_status() which
does not emit a message. Adding one allows dropping the "Failed to check
link status" message emitted by board_added() if
pciehp_check_link_status() returns a non-zero integer.

* Tone down all messages in pciehp_check_link_status() to KERN_INFO
severity and rephrase them to look as innocuous as possible. To this
end, move the message emitted by pcie_wait_for_link_delay() to its
callers.

As a result, Stuart Hayes' splat becomes:

KERN_INFO pcieport 0000:3c:06.0: pciehp: Slot(180): Link Up
KERN_INFO pcieport 0000:3c:06.0: pciehp: Slot(180): Cannot train link: status 0x0001

Dongdong Liu's splat becomes:

KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): Card present
KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): No link

The messages now merely serve as information that presence or link bits
were set a little longer than expected. Bringup failures which are not
false positives are still reported, albeit no longer at KERN_ERR severity.

Link: https://lore.kernel.org/linux-pci/20200310182100.102987-1-stuart.w.hayes@gmail.com/
Link: https://lore.kernel.org/linux-pci/1547649064-19019-1-git-send-email-liudongdong3@huawei.com/
Link: https://lore.kernel.org/r/b45e46fd8a6aa6930aaac9d7718c2e4b787a4e5e.1595935071.git.lukas@wunner.de
Reported-by: Stuart Hayes <stuart.w.hayes@gmail.com>
Reported-by: Dongdong Liu <liudongdong3@huawei.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>

show more ...


Revision tags: v5.8.10, v5.8.9, v5.8.8, v5.8.7, v5.8.6, v5.4.62, v5.8.5, v5.8.4, v5.4.61, v5.8.3, v5.4.60, v5.8.2, v5.4.59, v5.8.1, v5.4.58, v5.4.57, v5.4.56, v5.8, v5.7.12, v5.4.55, v5.7.11, v5.4.54, v5.7.10, v5.4.53, v5.4.52, v5.7.9, v5.7.8, v5.4.51, v5.4.50, v5.7.7, v5.4.49, v5.7.6, v5.7.5, v5.4.48, v5.7.4, v5.7.3, v5.4.47, v5.4.46, v5.7.2, v5.4.45, v5.7.1, v5.4.44, v5.7, v5.4.43, v5.4.42, v5.4.41, v5.4.40
# 9103aaf9 09-May-2020 Yicong Yang <yangyicong@hisilicon.com>

PCI/DPC: Print IRQ number used by port

Print IRQ number used by DPC port, like AER/PME does. It provides
convenience to track DPC interrupts counts of certain port from
/proc/interrupts.

Link: htt

PCI/DPC: Print IRQ number used by port

Print IRQ number used by DPC port, like AER/PME does. It provides
convenience to track DPC interrupts counts of certain port from
/proc/interrupts.

Link: https://lore.kernel.org/r/1589018214-52752-1-git-send-email-yangyicong@hisilicon.com
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

show more ...


# 708b2000 26-May-2020 Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

PCI/AER: Remove HEST/FIRMWARE_FIRST parsing for AER ownership

Commit c100beb9ccfb ("PCI/AER: Use only _OSC to determine AER ownership")
removed the use of HEST in determining AER ownership, but the

PCI/AER: Remove HEST/FIRMWARE_FIRST parsing for AER ownership

Commit c100beb9ccfb ("PCI/AER: Use only _OSC to determine AER ownership")
removed the use of HEST in determining AER ownership, but the AER driver
still used HEST to verify AER ownership in some of its APIs.

Per the ACPI spec v6.3, sec 18.3.2.4, some HEST table entries contain a
FIRMWARE_FIRST bit, but that bit does not tell us anything about ownership
of the AER capability.

Remove parsing of HEST to look for FIRMWARE_FIRST.

Add pcie_aer_is_native() for the places that need to know whether the OS
owns the AER capability.

[bhelgaas: commit log, reorder patch, remove unused __aer_firmware_first]
Link: https://lore.kernel.org/r/9a37f53a4e6ff4942ff8e18dbb20b00e16c47341.1590534843.git.sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

show more ...


Revision tags: v5.4.39, v5.4.38, v5.4.37, v5.4.36, v5.4.35, v5.4.34, v5.4.33, v5.4.32, v5.4.31, v5.4.30, v5.4.29, v5.6, v5.4.28
# 894020fd 23-Mar-2020 Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

PCI/AER: Rationalize error status register clearing

The AER interfaces to clear error status registers were a confusing mess:

- pci_cleanup_aer_uncorrect_error_status() cleared non-fatal errors

PCI/AER: Rationalize error status register clearing

The AER interfaces to clear error status registers were a confusing mess:

- pci_cleanup_aer_uncorrect_error_status() cleared non-fatal errors
from the Uncorrectable Error Status register.

- pci_aer_clear_fatal_status() cleared fatal errors from the
Uncorrectable Error Status register.

- pci_cleanup_aer_error_status_regs() cleared the Root Error Status
register (for Root Ports), the Uncorrectable Error Status register,
and the Correctable Error Status register.

Rename them to make them consistent:

From To
---------------------------------------- -------------------------------
pci_cleanup_aer_uncorrect_error_status() pci_aer_clear_nonfatal_status()
pci_aer_clear_fatal_status() pci_aer_clear_fatal_status()
pci_cleanup_aer_error_status_regs() pci_aer_clear_status()

Since pci_cleanup_aer_error_status_regs() (renamed to
pci_aer_clear_status()) is only used within drivers/pci/, move the
declaration from <linux/aer.h> to drivers/pci/pci.h.

[bhelgaas: commit log, add renames]
Link: https://lore.kernel.org/r/d1310a75dc3d28f7e8da4e99c45fbd3e60fe238e.1585000084.git.sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

show more ...


# aea47413 23-Mar-2020 Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

PCI/DPC: Expose dpc_process_error(), dpc_reset_link() for use by EDR

If firmware controls DPC, it is generally responsible for managing the DPC
capability and events, and the OS should not access th

PCI/DPC: Expose dpc_process_error(), dpc_reset_link() for use by EDR

If firmware controls DPC, it is generally responsible for managing the DPC
capability and events, and the OS should not access the DPC capability.

However, if firmware controls DPC and both the OS and the platform support
Error Disconnect Recover (EDR) notifications, the OS EDR notify handler is
responsible for recovery, and the notify handler may read/write the DPC
capability until it clears the DPC Trigger Status bit. See [1], sec 4.5.1,
table 4-6.

Expose some DPC error handling functions so they can be used by the EDR
notify handler.

[1] Downstream Port Containment Related Enhancements ECN, Jan 28, 2019,
affecting PCI Firmware Specification, Rev. 3.2
https://members.pcisig.com/wg/PCI-SIG/document/12888

Link: https://lore.kernel.org/r/e9000bb15b3a4293e81d98bb29ead7c84a6393c9.1585000084.git.sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

show more ...


# 27005618 23-Mar-2020 Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

PCI/DPC: Cache DPC capabilities in pci_init_capabilities()

Since Error Disconnect Recover needs to use DPC error handling routines
even if the OS doesn't have control of DPC, move the initalization

PCI/DPC: Cache DPC capabilities in pci_init_capabilities()

Since Error Disconnect Recover needs to use DPC error handling routines
even if the OS doesn't have control of DPC, move the initalization and
caching of DPC capabilities from the DPC driver to pci_init_capabilities().

Link: https://lore.kernel.org/r/5888380657c8b9551675b5dbd48e370e4fd2703d.1585000084.git.sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

show more ...


# b6cf1a42 23-Mar-2020 Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

PCI/ERR: Remove service dependency in pcie_do_recovery()

Previously we passed the PCIe service type parameter to pcie_do_recovery(),
where reset_link() looked up the underlying pci_port_service_driv

PCI/ERR: Remove service dependency in pcie_do_recovery()

Previously we passed the PCIe service type parameter to pcie_do_recovery(),
where reset_link() looked up the underlying pci_port_service_driver and its
.reset_link() function pointer. Instead of using this roundabout way, we
can just pass the driver-specific .reset_link() callback function when
calling pcie_do_recovery() function.

This allows us to call pcie_do_recovery() from code that is not a PCIe port
service driver, e.g., Error Disconnect Recover (EDR) support.

Remove pcie_port_find_service() and pcie_port_service_driver.reset_link
since they are now unused.

Link: https://lore.kernel.org/r/60e02b87b526cdf2930400059d98704bf0a147d1.1585000084.git.sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

show more ...


# be06c1b4 23-Mar-2020 Bjorn Helgaas <bhelgaas@google.com>

PCI/DPC: Move DPC data into struct pci_dev

We only need 25 bits of data for DPC, so I don't think it's worth the
complexity of allocating and keeping track of the struct dpc_dev separately
from the

PCI/DPC: Move DPC data into struct pci_dev

We only need 25 bits of data for DPC, so I don't think it's worth the
complexity of allocating and keeping track of the struct dpc_dev separately
from the pci_dev. Move that data into the struct pci_dev.

Link: https://lore.kernel.org/r/98323eaa18080adbe5bb30846862f09f8722d4b3.1585000084.git.sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

show more ...


Revision tags: v5.4.27, v5.4.26, v5.4.25, v5.4.24, v5.4.23, v5.4.22, v5.4.21, v5.4.20, v5.4.19, v5.4.18, v5.4.17, v5.4.16, v5.5, v5.4.15, v5.4.14, v5.4.13, v5.4.12, v5.4.11, v5.4.10, v5.4.9, v5.4.8, v5.4.7, v5.4.6, v5.4.5, v5.4.4, v5.4.3, v5.3.15, v5.4.2, v5.4.1, v5.3.14, v5.4, v5.3.13, v5.3.12, v5.3.11, v5.3.10, v5.3.9, v5.3.8
# 35a0b237 23-Oct-2019 Olof Johansson <olof@lixom.net>

PCI/DPC: Add "pcie_ports=dpc-native" to allow DPC without AER control

Prior to eed85ff4c0da7 ("PCI/DPC: Enable DPC only if AER is available"),
Linux handled DPC events regardless of whether firmware

PCI/DPC: Add "pcie_ports=dpc-native" to allow DPC without AER control

Prior to eed85ff4c0da7 ("PCI/DPC: Enable DPC only if AER is available"),
Linux handled DPC events regardless of whether firmware had granted it
ownership of AER or DPC, e.g., via _OSC.

PCIe r5.0, sec 6.2.10, recommends that the OS link control of DPC to
control of AER, so after eed85ff4c0da7, Linux handles DPC events only if it
has control of AER.

On platforms that do not grant OS control of AER via _OSC, Linux DPC
handling worked before eed85ff4c0da7 but not after.

To make Linux DPC handling work on those platforms the same way they did
before, add a "pcie_ports=dpc-native" kernel parameter that makes Linux
handle DPC events regardless of whether it has control of AER.

[bhelgaas: commit log, move pcie_ports_dpc_native to drivers/pci/]
Link: https://lore.kernel.org/r/20191023192205.97024-1-olof@lixom.net
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

show more ...


Revision tags: v5.3.7, v5.3.6, v5.3.5, v5.3.4, v5.3.3, v5.3.2, v5.3.1, v5.3, v5.2.14, v5.3-rc8, v5.2.13, v5.2.12, v5.2.11, v5.2.10, v5.2.9, v5.2.8, v5.2.7, v5.2.6, v5.2.5, v5.2.4, v5.2.3, v5.2.2, v5.2.1, v5.2, v5.1.16, v5.1.15, v5.1.14, v5.1.13, v5.1.12, v5.1.11, v5.1.10, v5.1.9, v5.1.8, v5.1.7, v5.1.6, v5.1.5, v5.1.4, v5.1.3, v5.1.2, v5.1.1, v5.0.14
# 10a9990c 07-May-2019 Frederick Lawler <fred@fredlawl.com>

PCI/DPC: Log messages with pci_dev, not pcie_device

Log messages with pci_dev, not pcie_device. Factor out common message
prefixes with dev_fmt().

Example output change:

- dpc 0000:00:01.1:pcie

PCI/DPC: Log messages with pci_dev, not pcie_device

Log messages with pci_dev, not pcie_device. Factor out common message
prefixes with dev_fmt().

Example output change:

- dpc 0000:00:01.1:pcie008: DPC error containment capabilities...
+ pcieport 0000:00:01.1: DPC: error containment capabilities...

Link: https://lore.kernel.org/lkml/20190509141456.223614-4-helgaas@kernel.org
Signed-off-by: Frederick Lawler <fred@fredlawl.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>

show more ...


Revision tags: v5.1, v5.0.13, v5.0.12, v5.0.11, v5.0.10, v5.0.9, v5.0.8, v5.0.7, v5.0.6, v5.0.5, v5.0.4, v5.0.3, v4.19.29, v5.0.2, v4.19.28, v5.0.1, v4.19.27, v5.0, v4.19.26, v4.19.25, v4.19.24, v4.19.23, v4.19.22, v4.19.21
# 9f08a5d8 11-Feb-2019 Dongdong Liu <liudongdong3@huawei.com>

PCI/DPC: Fix print AER status in DPC event handling

Previously dpc_handler() called aer_get_device_error_info() without
initializing info->severity, so aer_get_device_error_info() relied on
uninitia

PCI/DPC: Fix print AER status in DPC event handling

Previously dpc_handler() called aer_get_device_error_info() without
initializing info->severity, so aer_get_device_error_info() relied on
uninitialized data.

Add dpc_get_aer_uncorrect_severity() to read the port's AER status, mask,
and severity registers and set info->severity.

Also, clear the port's AER fatal error status bits.

Fixes: 8aefa9b0d910 ("PCI/DPC: Print AER status in DPC event handling")
Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Cc: stable@vger.kernel.org # v4.19+

show more ...


Revision tags: v4.19.20, v4.19.19, v4.19.18, v4.19.17, v4.19.16, v4.19.15, v4.19.14, v4.19.13, v4.19.12, v4.19.11, v4.19.10, v4.19.9, v4.19.8, v4.19.7, v4.19.6, v4.19.5, v4.19.4, v4.18.20, v4.19.3, v4.18.19, v4.19.2, v4.18.18, v4.18.17, v4.19.1, v4.19, v4.18.16, v4.18.15, v4.18.14, v4.18.13, v4.18.12, v4.18.11, v4.18.10
# f0157160 20-Sep-2018 Keith Busch <keith.busch@intel.com>

PCI: Make link active reporting detection generic

The spec has timing requirements when waiting for a link to become active
after a conventional reset. Implement those hard delays when waiting for

PCI: Make link active reporting detection generic

The spec has timing requirements when waiting for a link to become active
after a conventional reset. Implement those hard delays when waiting for
an active link so pciehp and dpc drivers don't need to duplicate this.

For devices that don't support data link layer active reporting, wait the
fixed time recommended by the PCIe spec.

Signed-off-by: Keith Busch <keith.busch@intel.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>

show more ...


# bdb5ac85 20-Sep-2018 Keith Busch <keith.busch@intel.com>

PCI/ERR: Handle fatal error recovery

We don't need to be paranoid about the topology changing while handling an
error. If the device has changed in a hotplug capable slot, we can rely on
the presen

PCI/ERR: Handle fatal error recovery

We don't need to be paranoid about the topology changing while handling an
error. If the device has changed in a hotplug capable slot, we can rely on
the presence detection handling to react to a changing topology.

Restore the fatal error handling behavior that existed before merging DPC
with AER with 7e9084b36740 ("PCI/AER: Handle ERR_FATAL with removal and
re-enumeration of devices").

Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>

show more ...


123