#
78c6e39f |
| 30-May-2024 |
Dan Williams <dan.j.williams@intel.com> |
PCI: Add missing bridge lock to pci_bus_lock()
[ Upstream commit a4e772898f8bf2e7e1cf661a12c60a5612c4afab ]
One of the true positives that the cfg_access_lock lockdep effort identified is this sequ
PCI: Add missing bridge lock to pci_bus_lock()
[ Upstream commit a4e772898f8bf2e7e1cf661a12c60a5612c4afab ]
One of the true positives that the cfg_access_lock lockdep effort identified is this sequence:
WARNING: CPU: 14 PID: 1 at drivers/pci/pci.c:4886 pci_bridge_secondary_bus_reset+0x5d/0x70 RIP: 0010:pci_bridge_secondary_bus_reset+0x5d/0x70 Call Trace: <TASK> ? __warn+0x8c/0x190 ? pci_bridge_secondary_bus_reset+0x5d/0x70 ? report_bug+0x1f8/0x200 ? handle_bug+0x3c/0x70 ? exc_invalid_op+0x18/0x70 ? asm_exc_invalid_op+0x1a/0x20 ? pci_bridge_secondary_bus_reset+0x5d/0x70 pci_reset_bus+0x1d8/0x270 vmd_probe+0x778/0xa10 pci_device_probe+0x95/0x120
Where pci_reset_bus() users are triggering unlocked secondary bus resets. Ironically pci_bus_reset(), several calls down from pci_reset_bus(), uses pci_bus_lock() before issuing the reset which locks everything *but* the bridge itself.
For the same motivation as adding:
bridge = pci_upstream_bridge(dev); if (bridge) pci_dev_lock(bridge);
to pci_reset_function() for the "bus" and "cxl_bus" reset cases, add pci_dev_lock() for @bus->self to pci_bus_lock().
Link: https://lore.kernel.org/r/171711747501.1628941.15217746952476635316.stgit@dwillia2-xfh.jf.intel.com Reported-by: Imre Deak <imre.deak@intel.com> Closes: http://lore.kernel.org/r/6657833b3b5ae_14984b29437@dwillia2-xfh.jf.intel.com.notmuch Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Keith Busch <kbusch@kernel.org> [bhelgaas: squash in recursive locking deadlock fix from Keith Busch: https://lore.kernel.org/r/20240711193650.701834-1-kbusch@meta.com] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Hans de Goede <hdegoede@redhat.com> Tested-by: Kalle Valo <kvalo@kernel.org> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
2cc8973b |
| 18-Jun-2024 |
Lukas Wunner <lukas@wunner.de> |
PCI/DPC: Fix use-after-free on concurrent DPC and hot-removal
commit 11a1f4bc47362700fcbde717292158873fb847ed upstream.
Keith reports a use-after-free when a DPC event occurs concurrently to hot-re
PCI/DPC: Fix use-after-free on concurrent DPC and hot-removal
commit 11a1f4bc47362700fcbde717292158873fb847ed upstream.
Keith reports a use-after-free when a DPC event occurs concurrently to hot-removal of the same portion of the hierarchy:
The dpc_handler() awaits readiness of the secondary bus below the Downstream Port where the DPC event occurred. To do so, it polls the config space of the first child device on the secondary bus. If that child device is concurrently removed, accesses to its struct pci_dev cause the kernel to oops.
That's because pci_bridge_wait_for_secondary_bus() neglects to hold a reference on the child device. Before v6.3, the function was only called on resume from system sleep or on runtime resume. Holding a reference wasn't necessary back then because the pciehp IRQ thread could never run concurrently. (On resume from system sleep, IRQs are not enabled until after the resume_noirq phase. And runtime resume is always awaited before a PCI device is removed.)
However starting with v6.3, pci_bridge_wait_for_secondary_bus() is also called on a DPC event. Commit 53b54ad074de ("PCI/DPC: Await readiness of secondary bus after reset"), which introduced that, failed to appreciate that pci_bridge_wait_for_secondary_bus() now needs to hold a reference on the child device because dpc_handler() and pciehp may indeed run concurrently. The commit was backported to v5.10+ stable kernels, so that's the oldest one affected.
Add the missing reference acquisition.
Abridged stack trace:
BUG: unable to handle page fault for address: 00000000091400c0 CPU: 15 PID: 2464 Comm: irq/53-pcie-dpc 6.9.0 RIP: pci_bus_read_config_dword+0x17/0x50 pci_dev_wait() pci_bridge_wait_for_secondary_bus() dpc_reset_link() pcie_do_recovery() dpc_handler()
Fixes: 53b54ad074de ("PCI/DPC: Await readiness of secondary bus after reset") Closes: https://lore.kernel.org/r/20240612181625.3604512-3-kbusch@meta.com/ Link: https://lore.kernel.org/linux-pci/8e4bcd4116fd94f592f2bf2749f168099c480ddf.1718707743.git.lukas@wunner.de Reported-by: Keith Busch <kbusch@kernel.org> Tested-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: stable@vger.kernel.org # v5.10+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
fae0e055 |
| 08-Feb-2024 |
Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> |
PCI: Do not wait for disconnected devices when resuming
[ Upstream commit 6613443ffc49d03e27f0404978f685c4eac43fba ]
On runtime resume, pci_dev_wait() is called:
pci_pm_runtime_resume() pci_
PCI: Do not wait for disconnected devices when resuming
[ Upstream commit 6613443ffc49d03e27f0404978f685c4eac43fba ]
On runtime resume, pci_dev_wait() is called:
pci_pm_runtime_resume() pci_pm_bridge_power_up_actions() pci_bridge_wait_for_secondary_bus() pci_dev_wait()
While a device is runtime suspended along with its PCI hierarchy, the device could get disconnected. In such case, the link will not come up no matter how long pci_dev_wait() waits for it.
Besides the above mentioned case, there could be other ways to get the device disconnected while pci_dev_wait() is waiting for the link to come up.
Make pci_dev_wait() exit if the device is already disconnected to avoid unnecessary delay.
The use cases of pci_dev_wait() boil down to two:
1. Waiting for the device after reset 2. pci_bridge_wait_for_secondary_bus()
The callers in both cases seem to benefit from propagating the disconnection as error even if device disconnection would be more analoguous to the case where there is no device in the first place which return 0 from pci_dev_wait(). In the case 2, it results in unnecessary marking of the devices disconnected again but that is just harmless extra work.
Also make sure compiler does not become too clever with dev->error_state and use READ_ONCE() to force a fetch for the up-to-date value.
Link: https://lore.kernel.org/r/20240208132322.4811-1-ilpo.jarvinen@linux.intel.com Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com> Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
71962891 |
| 07-Mar-2024 |
Mario Limonciello <mario.limonciello@amd.com> |
PCI/PM: Avoid D3cold for HP Pavilion 17 PC/1972 PCIe Ports
[ Upstream commit 256df20c590bf0e4d63ac69330cf23faddac3e08 ]
Hewlett-Packard HP Pavilion 17 Notebook PC/1972 is an Intel Ivy Bridge system
PCI/PM: Avoid D3cold for HP Pavilion 17 PC/1972 PCIe Ports
[ Upstream commit 256df20c590bf0e4d63ac69330cf23faddac3e08 ]
Hewlett-Packard HP Pavilion 17 Notebook PC/1972 is an Intel Ivy Bridge system with a muxless AMD Radeon dGPU. Attempting to use the dGPU fails with the following sequence:
ACPI Error: Aborting method \AMD3._ON due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/psparse-529) radeon 0000:01:00.0: not ready 1023ms after resume; waiting radeon 0000:01:00.0: not ready 2047ms after resume; waiting radeon 0000:01:00.0: not ready 4095ms after resume; waiting radeon 0000:01:00.0: not ready 8191ms after resume; waiting radeon 0000:01:00.0: not ready 16383ms after resume; waiting radeon 0000:01:00.0: not ready 32767ms after resume; waiting radeon 0000:01:00.0: not ready 65535ms after resume; giving up radeon 0000:01:00.0: Unable to change power state from D3cold to D0, device inaccessible
The issue is that the Root Port the dGPU is connected to can't handle the transition from D3cold to D0 so the dGPU can't properly exit runtime PM.
The existing logic in pci_bridge_d3_possible() checks for systems that are newer than 2015 to decide that D3 is safe. This would nominally work for an Ivy Bridge system (which was discontinued in 2015), but this system appears to have continued to receive BIOS updates until 2017 and so this existing logic doesn't appropriately capture it.
Add the system to bridge_d3_blacklist to prevent D3cold from being used.
Link: https://lore.kernel.org/r/20240307163709.323-1-mario.limonciello@amd.com Reported-by: Eric Heintzmann <heintzmann.eric@free.fr> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3229 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Eric Heintzmann <heintzmann.eric@free.fr> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
0053891e |
| 23-Apr-2024 |
Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> |
PCI: Wait for Link Training==0 before starting Link retrain
[ Upstream commit 73cb3a35f94db723c0211ad099bce55b2155e3f0 ]
Two changes were made in link retraining logic independent of each other.
T
PCI: Wait for Link Training==0 before starting Link retrain
[ Upstream commit 73cb3a35f94db723c0211ad099bce55b2155e3f0 ]
Two changes were made in link retraining logic independent of each other.
The commit e7e39756363a ("PCI/ASPM: Avoid link retraining race") added a check to pcie_retrain_link() to ensure no Link Training is currently active to address the Implementation Note in PCIe r6.1 sec 7.5.3.7. At that time pcie_wait_for_retrain() only checked for the Link Training (LT) bit being cleared.
The commit 680e9c47a229 ("PCI: Add support for polling DLLLA to pcie_retrain_link()") generalized pcie_wait_for_retrain() into pcie_wait_for_link_status() which can wait either for LT or the Data Link Layer Link Active (DLLLA) bit with 'use_lt' argument and supporting waiting for either cleared or set using 'active' argument.
In the merge commit 1abb47390350 ("Merge branch 'pci/enumeration'"), those two divergent branches converged. The merge changed LT bit checking added in the commit e7e39756363a ("PCI/ASPM: Avoid link retraining race") to now wait for completion of any ongoing Link Training using DLLLA bit being set if 'use_lt' is false.
When 'use_lt' is false, the pseudo-code steps of what occurs in pcie_retrain_link():
1. Wait for DLLLA==1 2. Trigger link to retrain 3. Wait for DLLLA==1
Step 3 waits for the link to come up from the retraining triggered by Step 2. As Step 1 is supposed to wait for any ongoing retraining to end, using DLLLA also for it does not make sense because link training being active is still indicated using LT bit, not with DLLLA.
Correct the pcie_wait_for_link_status() parameters in Step 1 to only wait for LT==0 to ensure there is no ongoing Link Training.
This only impacts the Target Speed quirk, which is the only case where waiting for DLLLA bit is used. It currently works in the problematic case by means of link training getting initiated by hardware repeatedly and respecting the new link parameters set by the caller, which then make training succeed and bring the link up, setting DLLLA and causing pcie_wait_for_link_status() to return success. We are not supposed to rely on luck and need to make sure that LT transitioned through the inactive state though before we initiate link training by hand via RL (Retrain Link) bit.
Fixes: 1abb47390350 ("Merge branch 'pci/enumeration'") Link: https://lore.kernel.org/r/20240423130820.43824-1-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
b0f44788 |
| 30-Jan-2024 |
Johan Hovold <johan+linaro@kernel.org> |
PCI/ASPM: Fix deadlock when enabling ASPM
commit 1e560864159d002b453da42bd2c13a1805515a20 upstream.
A last minute revert in 6.7-final introduced a potential deadlock when enabling ASPM during probe
PCI/ASPM: Fix deadlock when enabling ASPM
commit 1e560864159d002b453da42bd2c13a1805515a20 upstream.
A last minute revert in 6.7-final introduced a potential deadlock when enabling ASPM during probe of Qualcomm PCIe controllers as reported by lockdep:
============================================ WARNING: possible recursive locking detected 6.7.0 #40 Not tainted -------------------------------------------- kworker/u16:5/90 is trying to acquire lock: ffffacfa78ced000 (pci_bus_sem){++++}-{3:3}, at: pcie_aspm_pm_state_change+0x58/0xdc
but task is already holding lock: ffffacfa78ced000 (pci_bus_sem){++++}-{3:3}, at: pci_walk_bus+0x34/0xbc
other info that might help us debug this: Possible unsafe locking scenario:
CPU0 ---- lock(pci_bus_sem); lock(pci_bus_sem);
*** DEADLOCK ***
Call trace: print_deadlock_bug+0x25c/0x348 __lock_acquire+0x10a4/0x2064 lock_acquire+0x1e8/0x318 down_read+0x60/0x184 pcie_aspm_pm_state_change+0x58/0xdc pci_set_full_power_state+0xa8/0x114 pci_set_power_state+0xc4/0x120 qcom_pcie_enable_aspm+0x1c/0x3c [pcie_qcom] pci_walk_bus+0x64/0xbc qcom_pcie_host_post_init_2_7_0+0x28/0x34 [pcie_qcom]
The deadlock can easily be reproduced on machines like the Lenovo ThinkPad X13s by adding a delay to increase the race window during asynchronous probe where another thread can take a write lock.
Add a new pci_set_power_state_locked() and associated helper functions that can be called with the PCI bus semaphore held to avoid taking the read lock twice.
Link: https://lore.kernel.org/r/ZZu0qx2cmn7IwTyQ@hovoldconsulting.com Link: https://lore.kernel.org/r/20240130100243.11011-1-johan+linaro@kernel.org Fixes: f93e71aea6c6 ("Revert "PCI/ASPM: Remove pcie_aspm_pm_state_change()"") Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: <stable@vger.kernel.org> # 6.7 [bhelgaas: backported to v6.6.y, which contains 8cc22ba3f77c ("Revert "PCI/ASPM: Remove pcie_aspm_pm_state_change()""), a backport of f93e71aea6c6. This omits the drivers/pci/controller/dwc/pcie-qcom.c hunk that updates qcom_pcie_enable_aspm(), which was added by 9f4f3dfad8cf ("PCI: qcom: Enable ASPM for platforms supporting 1.9.0 ops"), which is not present in v6.6.28.] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
63b1a3d9 |
| 23-Jan-2024 |
Alex Williamson <alex.williamson@redhat.com> |
PCI: Fix active state requirement in PME polling
[ Upstream commit 41044d5360685e78a869d40a168491a70cdb7e73 ]
The commit noted in fixes added a bogus requirement that runtime PM managed devices nee
PCI: Fix active state requirement in PME polling
[ Upstream commit 41044d5360685e78a869d40a168491a70cdb7e73 ]
The commit noted in fixes added a bogus requirement that runtime PM managed devices need to be in the RPM_ACTIVE state for PME polling. In fact, only devices in low power states should be polled.
However there's still a requirement that the device config space must be accessible, which has implications for both the current state of the polled device and the parent bridge, when present. It's not sufficient to assume the bridge remains in D0 and cases have been observed where the bridge passes the D0 test, but the PM state indicates RPM_SUSPENDING and config space of the polled device becomes inaccessible during pci_pme_wakeup().
Therefore, since the bridge is already effectively required to be in the RPM_ACTIVE state, formalize this in the code and elevate the PM usage count to maintain the state while polling the subordinate device.
This resolves a regression reported in the bugzilla below where a Thunderbolt/USB4 hierarchy fails to scan for an attached NVMe endpoint downstream of a bridge in a D3hot power state.
Link: https://lore.kernel.org/r/20240123185548.1040096-1-alex.williamson@redhat.com Fixes: d3fcd7360338 ("PCI: Fix runtime PM race with PME polling") Reported-by: Sanath S <sanath.s@amd.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218360 Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Sanath S <sanath.s@amd.com> Reviewed-by: Rafael J. Wysocki <rafael@kernel.org> Cc: Lukas Wunner <lukas@wunner.de> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
8cc22ba3 |
| 01-Jan-2024 |
Bjorn Helgaas <bhelgaas@google.com> |
Revert "PCI/ASPM: Remove pcie_aspm_pm_state_change()"
commit f93e71aea6c60ebff8adbd8941e678302d377869 upstream.
This reverts commit 08d0cc5f34265d1a1e3031f319f594bd1970976c.
Michael reported that
Revert "PCI/ASPM: Remove pcie_aspm_pm_state_change()"
commit f93e71aea6c60ebff8adbd8941e678302d377869 upstream.
This reverts commit 08d0cc5f34265d1a1e3031f319f594bd1970976c.
Michael reported that when attempting to resume from suspend to RAM on ASUS mini PC PN51-BB757MDE1 (DMI model: MINIPC PN51-E1), 08d0cc5f3426 ("PCI/ASPM: Remove pcie_aspm_pm_state_change()") caused a 12-second delay with no output, followed by a reboot.
Workarounds include:
- Reverting 08d0cc5f3426 ("PCI/ASPM: Remove pcie_aspm_pm_state_change()") - Booting with "pcie_aspm=off" - Booting with "pcie_aspm.policy=performance" - "echo 0 | sudo tee /sys/bus/pci/devices/0000:03:00.0/link/l1_aspm" before suspending - Connecting a USB flash drive
Link: https://lore.kernel.org/r/20240102232550.1751655-1-helgaas@kernel.org Fixes: 08d0cc5f3426 ("PCI/ASPM: Remove pcie_aspm_pm_state_change()") Reported-by: Michael Schaller <michael@5challer.de> Link: https://lore.kernel.org/r/76c61361-b8b4-435f-a9f1-32b716763d62@5challer.de Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
58c91b69 |
| 10-Oct-2023 |
Bjorn Helgaas <bhelgaas@google.com> |
PCI: Use FIELD_GET() in Sapphire RX 5600 XT Pulse quirk
[ Upstream commit 04e82fa5951ca66495d7b05665eff673aa3852b4 ]
Use FIELD_GET() to remove dependences on the field position, i.e., the shift val
PCI: Use FIELD_GET() in Sapphire RX 5600 XT Pulse quirk
[ Upstream commit 04e82fa5951ca66495d7b05665eff673aa3852b4 ]
Use FIELD_GET() to remove dependences on the field position, i.e., the shift value. No functional change intended.
Separate because this isn't as trivial as the other FIELD_GET() changes.
See 907830b0fc9e ("PCI: Add a REBAR size quirk for Sapphire RX 5600 XT Pulse")
Link: https://lore.kernel.org/r/20231010204436.1000644-3-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Cc: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
d4c5d6fc |
| 19-Sep-2023 |
Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> |
PCI: Use FIELD_GET() to extract Link Width
[ Upstream commit d1f9b39da4a5347150246871325190018cda8cb3 ]
Use FIELD_GET() to extract PCIe Negotiated and Maximum Link Width fields instead of custom ma
PCI: Use FIELD_GET() to extract Link Width
[ Upstream commit d1f9b39da4a5347150246871325190018cda8cb3 ]
Use FIELD_GET() to extract PCIe Negotiated and Maximum Link Width fields instead of custom masking and shifting.
Link: https://lore.kernel.org/r/20230919125648.1920-7-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> [bhelgaas: drop duplicate include of <linux/bitfield.h>] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
eaea9f7b |
| 11-Sep-2023 |
Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> |
PCI: Do error check on own line to split long "if" conditions
[ Upstream commit d15f18053e5cc5576af9e7eef0b2a91169b6326d ]
Placing PCI error code check inside "if" condition usually results in need
PCI: Do error check on own line to split long "if" conditions
[ Upstream commit d15f18053e5cc5576af9e7eef0b2a91169b6326d ]
Placing PCI error code check inside "if" condition usually results in need to split lines. Combined with additional conditions the "if" condition becomes messy.
Convert to the usual error handling pattern with an additional variable to improve code readability. In addition, reverse the logic in pci_find_vsec_capability() to get rid of &&.
No functional changes intended.
Link: https://lore.kernel.org/r/20230911125354.25501-5-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> [bhelgaas: PCI_POSSIBLE_ERROR()] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
5694ba13 |
| 23-Aug-2023 |
Feiyang Chen <chenfeiyang@loongson.cn> |
PCI/PM: Only read PCI_PM_CTRL register when available
For a device with no Power Management Capability, pci_power_up() previously returned 0 (success) if the platform was able to put the device in D
PCI/PM: Only read PCI_PM_CTRL register when available
For a device with no Power Management Capability, pci_power_up() previously returned 0 (success) if the platform was able to put the device in D0, which led to pci_set_full_power_state() trying to read PCI_PM_CTRL, even though it doesn't exist.
Since dev->pm_cap == 0 in this case, pci_set_full_power_state() actually read the wrong register, interpreted it as PCI_PM_CTRL, and corrupted dev->current_state. This led to messages like this in some cases:
pci 0000:01:00.0: Refused to change power state from D3hot to D0
To prevent this, make pci_power_up() always return a negative failure code if the device lacks a Power Management Capability, even if non-PCI platform power management has been able to put the device in D0. The failure will prevent pci_set_full_power_state() from trying to access PCI_PM_CTRL.
Fixes: e200904b275c ("PCI/PM: Split pci_power_up()") Link: https://lore.kernel.org/r/20230824013738.1894965-1-chenfeiyang@loongson.cn Signed-off-by: Feiyang Chen <chenfeiyang@loongson.cn> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: "Rafael J. Wysocki" <rafael@kernel.org> Cc: stable@vger.kernel.org # v5.19+
show more ...
|
#
1856d1a0 |
| 14-Dec-2022 |
Bjorn Helgaas <bhelgaas@google.com> |
PCI: Tidy config space save/restore messages
Update config space save/restore debug messages so they line up better. Previously:
nvme 0000:05:00.0: saving config space at offset 0x4 (reading 0x20
PCI: Tidy config space save/restore messages
Update config space save/restore debug messages so they line up better. Previously:
nvme 0000:05:00.0: saving config space at offset 0x4 (reading 0x20100006) nvme 0000:05:00.0: saving config space at offset 0x8 (reading 0x1080200) nvme 0000:05:00.0: saving config space at offset 0xc (reading 0x0)
nvme 0000:05:00.0: restoring config space at offset 0x4 (was 0x0, writing 0x20100006)
Now:
nvme 0000:05:00.0: save config 0x04: 0x20100006 nvme 0000:05:00.0: save config 0x08: 0x01080200 nvme 0000:05:00.0: save config 0x0c: 0x00000000
nvme 0000:05:00.0: restore config 0x04: 0x00000000 -> 0x20100006
No functional change intended. Enable these messages by setting CONFIG_DYNAMIC_DEBUG=y and adding 'dyndbg="file drivers/pci/* +p"' to kernel parameters.
Link: https://lore.kernel.org/r/20230823191831.476579-1-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org>
show more ...
|
#
86b4ad7d |
| 24-Aug-2023 |
Bjorn Helgaas <bhelgaas@google.com> |
PCI: Fix typos in docs and comments
Fix typos in docs and comments.
Link: https://lore.kernel.org/r/20230824193712.542167-11-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Re
PCI: Fix typos in docs and comments
Fix typos in docs and comments.
Link: https://lore.kernel.org/r/20230824193712.542167-11-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
show more ...
|
#
2b4af4b3 |
| 24-Aug-2023 |
Bjorn Helgaas <bhelgaas@google.com> |
PCI: Fix pci_bus_resetable(), pci_slot_resetable() name typos
Fix typos in the pci_bus_resetable() and pci_slot_resetable() function names. No functional change intended.
Link: https://lore.kernel
PCI: Fix pci_bus_resetable(), pci_slot_resetable() name typos
Fix typos in the pci_bus_resetable() and pci_slot_resetable() function names. No functional change intended.
Link: https://lore.kernel.org/r/20230824193712.542167-10-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
show more ...
|
#
3b59ca94 |
| 07-Dec-2020 |
Bjorn Helgaas <bhelgaas@google.com> |
PCI: Simplify pci_pio_to_address()
Simplify pci_pio_to_address() by removing an unnecessary local variable. No functional change intended.
Link: https://lore.kernel.org/r/20230824193712.542167-8-he
PCI: Simplify pci_pio_to_address()
Simplify pci_pio_to_address() by removing an unnecessary local variable. No functional change intended.
Link: https://lore.kernel.org/r/20230824193712.542167-8-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
show more ...
|
#
d3fcd736 |
| 03-Aug-2023 |
Alex Williamson <alex.williamson@redhat.com> |
PCI: Fix runtime PM race with PME polling
Testing that a device is not currently in a low power state provides no guarantees that the device is not imminently transitioning to such a state. Incremen
PCI: Fix runtime PM race with PME polling
Testing that a device is not currently in a low power state provides no guarantees that the device is not imminently transitioning to such a state. Increment the PM usage counter before accessing the device. Since we don't wish to wake the device for PME polling, do so only if the device is already active by using pm_runtime_get_if_active().
Link: https://lore.kernel.org/r/20230803171233.3810944-3-alex.williamson@redhat.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
show more ...
|
#
fb0171a4 |
| 17-Jul-2023 |
Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> |
PCI: Make link retraining use RMW accessors for changing LNKCTL
Don't assume that the device is fully under the control of PCI core. Use RMW capability accessors in link retraining which do proper
PCI: Make link retraining use RMW accessors for changing LNKCTL
Don't assume that the device is fully under the control of PCI core. Use RMW capability accessors in link retraining which do proper locking to avoid losing concurrent updates to the register values.
Suggested-by: Lukas Wunner <lukas@wunner.de> Fixes: 4ec73791a64b ("PCI: Work around Pericom PCIe-to-PCI bridge Retrain Link erratum") Fixes: 7d715a6c1ae5 ("PCI: add PCI Express ASPM support") Link: https://lore.kernel.org/r/20230717120503.15276-3-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: "Rafael J. Wysocki" <rafael@kernel.org>
show more ...
|
#
9e30fd26 |
| 14-Jun-2023 |
Ondrej Zary <linux@zary.sk> |
PCI/PM: Avoid putting EloPOS E2/S2/H2 PCIe Ports in D3cold
The quirk for Elo i2 introduced in commit 92597f97a40b ("PCI/PM: Avoid putting Elo i2 PCIe Ports in D3cold") is also needed by EloPOS E2/S2
PCI/PM: Avoid putting EloPOS E2/S2/H2 PCIe Ports in D3cold
The quirk for Elo i2 introduced in commit 92597f97a40b ("PCI/PM: Avoid putting Elo i2 PCIe Ports in D3cold") is also needed by EloPOS E2/S2/H2 which uses the same Continental Z2 board.
Change the quirk to match the board instead of system.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215715 Link: https://lore.kernel.org/r/20230614074253.22318-1-linux@zary.sk Signed-off-by: Ondrej Zary <linux@zary.sk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org
show more ...
|
#
08e3ed12 |
| 11-Jun-2023 |
Maciej W. Rozycki <macro@orcam.me.uk> |
PCI: Add failed link recovery for device reset events
Request failed link recovery with any upstream PCIe bridge where a device has not come back after reset within PCI_RESET_WAIT time. Reset the p
PCI: Add failed link recovery for device reset events
Request failed link recovery with any upstream PCIe bridge where a device has not come back after reset within PCI_RESET_WAIT time. Reset the polling interval if recovery succeeded, otherwise continue as usual.
[bhelgaas: inline pcie_parent_link_retrain()] Link: https://lore.kernel.org/r/alpine.DEB.2.21.2306111631050.64925@angie.orcam.me.uk Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
show more ...
|
#
a89c8224 |
| 11-Jun-2023 |
Maciej W. Rozycki <macro@orcam.me.uk> |
PCI: Work around PCIe link training failures
Attempt to handle cases such as with a downstream port of the ASMedia ASM2824 PCIe switch where link training never completes and the link continues swit
PCI: Work around PCIe link training failures
Attempt to handle cases such as with a downstream port of the ASMedia ASM2824 PCIe switch where link training never completes and the link continues switching between speeds indefinitely with the data link layer never reaching the active state.
It has been observed with a downstream port of the ASMedia ASM2824 Gen 3 switch wired to the upstream port of the Pericom PI7C9X2G304 Gen 2 switch, using a Delock Riser Card PCI Express x1 > 2 x PCIe x1 device, P/N 41433, wired to a SiFive HiFive Unmatched board. In this setup the switches should negotiate a link speed of 5.0GT/s, falling back to 2.5GT/s if necessary.
Instead the link continues oscillating between the two speeds, at the rate of 34-35 times per second, with link training reported repeatedly active ~84% of the time. Limiting the target link speed to 2.5GT/s with the upstream ASM2824 device makes the two switches communicate correctly. Removing the speed restriction afterwards makes the two devices switch to 5.0GT/s then.
Make use of these observations and detect the inability to train the link by checking for the Data Link Layer Link Active status bit being off while the Link Bandwidth Management Status indicating that hardware has changed the link speed or width in an attempt to correct unreliable link operation.
Restrict the speed to 2.5GT/s then with the Target Link Speed field, request a retrain and wait 200ms for the data link to go up. If this is successful, lift the restriction, letting the devices negotiate a higher speed.
Also check for a 2.5GT/s speed restriction the firmware may have already arranged and lift it too with ports of devices known to continue working afterwards (currently only ASM2824), that already report their data link being up.
[bhelgaas: reorder and squash stubs from https://lore.kernel.org/r/alpine.DEB.2.21.2306111619570.64925@angie.orcam.me.uk to avoid adding stubs that do nothing] Link: https://lore.kernel.org/r/alpine.DEB.2.21.2203022037020.56670@angie.orcam.me.uk/ Link: https://source.denx.de/u-boot/u-boot/-/commit/a398a51ccc68 Link: https://lore.kernel.org/r/alpine.DEB.2.21.2305310038540.59226@angie.orcam.me.uk Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
show more ...
|
#
7604bc29 |
| 11-Jun-2023 |
Maciej W. Rozycki <macro@orcam.me.uk> |
PCI: Use pcie_wait_for_link_status() in pcie_wait_for_link_delay()
Remove a DLLLA status bit polling loop from pcie_wait_for_link_delay() and call almost identical code in pcie_wait_for_link_status(
PCI: Use pcie_wait_for_link_status() in pcie_wait_for_link_delay()
Remove a DLLLA status bit polling loop from pcie_wait_for_link_delay() and call almost identical code in pcie_wait_for_link_status() instead. This reduces the lower bound on the polling interval from 10ms to 1ms, possibly increasing the CPU load on the system in favour to reducing the wait time.
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2306111611170.64925@angie.orcam.me.uk Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
show more ...
|
#
680e9c47 |
| 11-Jun-2023 |
Maciej W. Rozycki <macro@orcam.me.uk> |
PCI: Add support for polling DLLLA to pcie_retrain_link()
Let the caller of pcie_retrain_link() specify whether they want to use the LT bit or the DLLLA bit of the Link Status Register to determine
PCI: Add support for polling DLLLA to pcie_retrain_link()
Let the caller of pcie_retrain_link() specify whether they want to use the LT bit or the DLLLA bit of the Link Status Register to determine if link training has completed. It is up to the caller to verify whether the use of the DLLLA bit, the implementation of which is optional, is valid for the device requested.
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2306110310540.64925@angie.orcam.me.uk Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
show more ...
|
#
37edd87e |
| 11-Jun-2023 |
Maciej W. Rozycki <macro@orcam.me.uk> |
PCI: Export pcie_retrain_link() for use outside ASPM
Export pcie_retrain_link() for link retrain needs outside ASPM. Struct pcie_link_state is local to ASPM and only used by pcie_retrain_link() to
PCI: Export pcie_retrain_link() for use outside ASPM
Export pcie_retrain_link() for link retrain needs outside ASPM. Struct pcie_link_state is local to ASPM and only used by pcie_retrain_link() to get at the associated PCI device, so change the operand and adjust the lone call site accordingly. Document the interface. No functional change at this point.
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2306110229010.64925@angie.orcam.me.uk Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
show more ...
|
#
33a176ab |
| 11-Jun-2023 |
Maciej W. Rozycki <macro@orcam.me.uk> |
PCI: Export PCIe link retrain timeout
Convert LINK_RETRAIN_TIMEOUT from jiffies to milliseconds, accordingly rename to PCIE_LINK_RETRAIN_TIMEOUT_MS, and make available via "pci.h" for the PCI core t
PCI: Export PCIe link retrain timeout
Convert LINK_RETRAIN_TIMEOUT from jiffies to milliseconds, accordingly rename to PCIE_LINK_RETRAIN_TIMEOUT_MS, and make available via "pci.h" for the PCI core to use. Use in pcie_wait_for_link_delay().
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2305310030280.59226@angie.orcam.me.uk Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
show more ...
|