nvidia-gpu: add support for PCIe port metrics
Add the xyz.openbmc_project.Metric.Value interface for each of the following PCIe port metrics of a ConnectX device.
PCIeErrors.CorrectableErrorCount
PCIeErrors.NonFatalErrorCount
PCIeErrors.FatalErrorCount
PCIeErrors.L0ToRecoveryCount
PCIeErrors.ReplayCount
PCIeErrors.ReplayRolloverCount
PCIeErrors.NAKSentCount
PCIeErrors.NAKReceivedCount
PCIeErrors.UnsupportedRequestCount
PDI Patch - https://gerrit.openbmc.org/c/openbmc/phosphor-dbus-interfaces/+/84839
Tested: Built an image for the nvl32-obmc machine with the following patch cherry-picked.
https://gerrit.openbmc.org/c/openbmc/openbmc/+/85490
The patch cherry-picks the following patches that are currently under review.
```
1. device tree
   https://lore.kernel.org/all/aRbLqH8pLWCQryhu@molberding.nvidia.com/
2. mctpd patches
   https://github.com/CodeConstruct/mctp/pull/85
3. u-boot changes
   https://lore.kernel.org/openbmc/20251121-msx4-v1-0-fc0118b666c1@nvidia.com/T/#t
4. kernel changes as specified in the openbmc patch (for espi)
5. entity-manager changes
   https://gerrit.openbmc.org/c/openbmc/entity-manager/+/85455
6. platform-init changes
   https://gerrit.openbmc.org/c/openbmc/platform-init/+/85456
7. spi changes
   https://lore.kernel.org/all/20251121-w25q01jv_fixup-v1-1-3d175050db73@nvidia.com/
```
```
root@nvl32-obmc:~# busctl tree xyz.openbmc_project.GpuSensor
`- /xyz
  `- /xyz/openbmc_project
    |- /xyz/openbmc_project/inventory
    | |- /xyz/openbmc_project/inventory/Nvidia_ConnectX_0_PCIe
    | | |- /xyz/openbmc_project/inventory/Nvidia_ConnectX_0_PCIe/DOWN_0
    | | |- /xyz/openbmc_project/inventory/Nvidia_ConnectX_0_PCIe/DOWN_1
    | | `- /xyz/openbmc_project/inventory/Nvidia_ConnectX_0_PCIe/UP_0
    | |- /xyz/openbmc_project/inventory/Nvidia_ConnectX_2_PCIe
    | | |- /xyz/openbmc_project/inventory/Nvidia_ConnectX_2_PCIe/DOWN_0
    | | |- /xyz/openbmc_project/inventory/Nvidia_ConnectX_2_PCIe/DOWN_1
    | | `- /xyz/openbmc_project/inventory/Nvidia_ConnectX_2_PCIe/UP_0
    | `- /xyz/openbmc_project/inventory/Nvidia_ConnectX_3_PCIe
    |   |- /xyz/openbmc_project/inventory/Nvidia_ConnectX_3_PCIe/DOWN_0
    |   |- /xyz/openbmc_project/inventory/Nvidia_ConnectX_3_PCIe/DOWN_1
    |   `- /xyz/openbmc_project/inventory/Nvidia_ConnectX_3_PCIe/UP_0
    |- /xyz/openbmc_project/metric
    | |- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_PCIe_DOWN_0
    | | `- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_PCIe_DOWN_0/pcie
    | |   |- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_PCIe_DOWN_0/pcie/correctable_error_count
    | |   |- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_PCIe_DOWN_0/pcie/fatal_error_count
    | |   |- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_PCIe_DOWN_0/pcie/l0_to_recovery_count
    | |   |- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_PCIe_DOWN_0/pcie/nak_received_count
    | |   |- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_PCIe_DOWN_0/pcie/nak_sent_count
    | |   |- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_PCIe_DOWN_0/pcie/non_fatal_error_count
    | |   |- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_PCIe_DOWN_0/pcie/replay_count
    | |   |- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_PCIe_DOWN_0/pcie/replay_rollover_count
    | |   `- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_PCIe_DOWN_0/pcie/unsupported_request_count

root@nvl32-obmc:~# busctl introspect xyz.openbmc_project.GpuSensor /xyz/openbmc_project/metric/port_Nvidia_ConnectX_3_PCIe_DOWN_1/pcie/l0_to_recovery_count
NAME                                        TYPE      SIGNATURE RESULT/VALUE                             FLAGS
org.freedesktop.DBus.Introspectable         interface -         -                                        -
.Introspect                                 method    -         s                                        -
org.freedesktop.DBus.Peer                   interface -         -                                        -
.GetMachineId                               method    -         s                                        -
.Ping                                       method    -         -                                        -
org.freedesktop.DBus.Properties             interface -         -                                        -
.Get                                        method    ss        v                                        -
.GetAll                                     method    s         a{sv}                                    -
.Set                                        method    ssv       -                                        -
.PropertiesChanged                          signal    sa{sv}as  -                                        -
xyz.openbmc_project.Association.Definitions interface -         -                                        -
.Associations                               property  a(sss)    1 "measuring" "measured_by" "/xyz/ope... emits-change
xyz.openbmc_project.Metric.Value            interface -         -                                        -
.Unit                                       property  s         "xyz.openbmc_project.Metric.Value.Uni... emits-change
.Value                                      property  d         1                                        emits-change
```
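The object paths in the output above follow a regular layout, so a single metric can also be read directly rather than introspected. The sketch below is illustrative only: `metric_path` and `read_metric` are hypothetical helper names, not part of this change, and the path template is an assumption inferred from the tree output. The service and interface names are the ones shown above.

```shell
#!/bin/sh
# Service and interface names as they appear in the busctl output above.
SERVICE=xyz.openbmc_project.GpuSensor
IFACE=xyz.openbmc_project.Metric.Value

# Hypothetical helper: build a metric object path from a device name,
# a port name, and a snake_case metric name. The template is an
# assumption inferred from the tree output, not taken from the code.
metric_path() {
    printf '/xyz/openbmc_project/metric/port_%s_%s/pcie/%s\n' "$1" "$2" "$3"
}

# Hypothetical helper: read the Value property of one metric.
# Requires busctl and a running xyz.openbmc_project.GpuSensor service.
read_metric() {
    busctl get-property "$SERVICE" "$(metric_path "$1" "$2" "$3")" "$IFACE" Value
}
```

On the BMC, `read_metric Nvidia_ConnectX_3_PCIe DOWN_1 l0_to_recovery_count` would be expected to print the property type followed by its value, since `.Value` has signature `d`.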
Change-Id: I3379c09346653d6a6bf2921bf765f0adf5a22098
Signed-off-by: Harshit Aghera <haghera@nvidia.com>