nvidia-gpu - OpenGrok history log for /openbmc/dbus-sensors/src/nvidia-gpu

Revision	Date	Author	Comments
b139302c	08-Jan-2026	Eric Liu <liuer@nvidia.com>	nvidia-gpu: add BoostClockFrequency property Implement BoostClockFrequency property for NVIDIA GPU inventory to expose the default boost clock frequency of GPU accelerators. The property is added to xyz.openbmc_project.Inventory.Item.Accelerator interface, utilizing the existing MCTP VDM Property ID 21 (DEFAULT_BOOST_CLOCKS) to query the GPU hardware over MCTP and populate the property value. Changes: - src/nvidia-gpu/NvidiaGpuMctpVdm.hpp: Add uint64_t to InventoryValue variant to support numeric clock speed values. - src/nvidia-gpu/NvidiaGpuMctpVdm.cpp: Add DEFAULT_BOOST_CLOCKS case to decodeInventoryData to parse uint64_t clock speed from MCTP response payload. - src/nvidia-gpu/Inventory.cpp: Register BoostClockFrequency property on Accelerator interface, add DEFAULT_BOOST_CLOCKS to properties query map, and handle uint64_t response in handleInventoryPropertyResponse. Tested: Build an image for nvl32-obmc machine with the following patch cherry picked. https://gerrit.openbmc.org/c/openbmc/openbmc/+/85763 https://gerrit.openbmc.org/c/openbmc/dbus-sensors/+/85080 Verified via busctl that BoostClockFrequency property appears under xyz.openbmc_project.Inventory.Item.Accelerator interface for GPU devices and contains the correct boost clock value (e.g., 2430 MHz). Confirmed successful MCTP query and property update through nvidiagpusensor service logs. Change-Id: I3d7410e1b1a455a81263c89f63ac1c6338eeefe1 Signed-off-by: Eric Liu <liuer@nvidia.com> Modified files: Inventory.cpp NvidiaGpuMctpVdm.cpp NvidiaGpuMctpVdm.hpp
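The decoding step described in b139302c can be pictured roughly as below. This is only a sketch and not the code from the commit: the trimmed `InventoryValue` alias, the helper name `decodeBoostClock`, and the assumption that the payload carries the clock as an 8-byte little-endian integer are all illustrative guesses.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <variant>
#include <vector>

// Hypothetical, trimmed-down stand-in for the variant in NvidiaGpuMctpVdm.hpp.
using InventoryValue = std::variant<std::string, uint64_t>;

// Hypothetical helper: read a uint64_t from the start of the payload.
// Assumes (not confirmed by the commit) an 8-byte little-endian encoding.
// Returns std::nullopt when the payload is too short.
std::optional<InventoryValue> decodeBoostClock(
    const std::vector<uint8_t>& payload)
{
    if (payload.size() < sizeof(uint64_t))
    {
        return std::nullopt;
    }
    uint64_t value = 0;
    for (std::size_t i = 0; i < sizeof(uint64_t); ++i)
    {
        // Byte i contributes bits [8*i, 8*i + 7] of the result.
        value |= static_cast<uint64_t>(payload[i]) << (8 * i);
    }
    return InventoryValue{value};
}
```

Under these assumptions, a payload starting with the bytes `7E 09 00 00 00 00 00 00` decodes to 2430, the example value quoted in the commit message. Returning `std::optional` keeps a short payload from being mistaken for a clock of zero.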
4c0a0b45	29-Dec-2025	Ender Hsieh <andhsieh@nvidia.com>	nvidia-gpu: implement PhysicalContext interface This change implements the xyz.openbmc_project.Common.PhysicalContext interface for NVIDIA GPU sensors (Energy, Power, Temperature, and Voltage). This allows sensors to expose their hardware context information to external interfaces like Redfish. Instead of hardcoding the PhysicalContext value, this implementation uses a device type enum (gpu::DeviceIdentification) to determine the appropriate PhysicalContext. A helper function maps device types to their corresponding D-Bus PhysicalContext values, ensuring proper separation of concerns. For GPU devices, the Type property is set to 'GPU'. For other device types (SMA, PCIe), no PhysicalContext interface is created, keeping their D-Bus representation clean. This implementation follows the interface definition introduced in: https://gerrit.openbmc.org/c/openbmc/phosphor-dbus-interfaces/+/86504 Changes: - src/nvidia-gpu/NvidiaSensorUtils.hpp: New helper file containing deviceTypeToPhysicalContext() function that maps DeviceIdentification enum to D-Bus PhysicalContext paths. - src/nvidia-gpu/.hpp: Update sensor constructors to accept gpu::DeviceIdentification deviceType parameter (no default value). - src/nvidia-gpu/Sensor.cpp: Use helper function to determine PhysicalContext. Conditionally create interface only when a valid context is returned. Register 'Type' property with the mapped value. - src/nvidia-gpu/NvidiaGpuDevice.cpp: Pass DEVICE_GPU enum for all GPU sensors. Fix io member initialization in constructor. - src/nvidia-gpu/NvidiaSmaDevice.cpp: Pass DEVICE_SMA enum for SMA sensors (no PhysicalContext interface created).
Design rationale: - GpuDevice class doesn't need to know D-Bus implementation details - Centralized mapping function makes maintenance easier - Type-safe enum prevents typos and provides compile-time checking - Automatic handling for different device types without conditional logic in device classes Tested: Build an image for nvl32-obmc machine with the following patch cherry picked. https://gerrit.openbmc.org/c/openbmc/openbmc/+/85490 Verified via busctl that the Type property appears under xyz.openbmc_project.Common.PhysicalContext interface for GPU sensors only and contains the correct value 'GPU'. SMA device sensors do not have this interface, as expected. Depends-On: I83dcbe4810139fb92fddf6b099f5a1a057e7e05e Change-Id: I1d5abfa5d4416af3565bf315e0f28cb6af56f14c Signed-off-by: Ender Hsieh <andhsieh@nvidia.com> Modified files: /openbmc/dbus-sensors/docs/mctp/Makefile /openbmc/dbus-sensors/docs/mctp/OWNERS /openbmc/dbus-sensors/docs/mctp/openbmc_usb_mctp.cfg /openbmc/dbus-sensors/docs/mctp/openbmc_usb_mctp.tla /openbmc/dbus-sensors/src/mctp/MCTPReactor.cpp /openbmc/dbus-sensors/src/mctp/MCTPReactor.hpp NvidiaGpuDevice.cpp NvidiaGpuEnergySensor.cpp NvidiaGpuEnergySensor.hpp NvidiaGpuPowerSensor.cpp NvidiaGpuPowerSensor.hpp NvidiaGpuSensor.cpp NvidiaGpuSensor.hpp NvidiaGpuVoltageSensor.cpp NvidiaGpuVoltageSensor.hpp NvidiaSensorUtils.hpp NvidiaSmaDevice.cpp /openbmc/dbus-sensors/src/tests/test_MCTPReactor.cpp
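The centralized mapping described in 4c0a0b45 might look something like the following. It is a sketch under assumptions rather than the contents of NvidiaSensorUtils.hpp: the enum members shown are a guessed subset of `gpu::DeviceIdentification`, and the returned D-Bus enum string is an assumed spelling that should be checked against the PDI change linked in the commit.

```cpp
#include <optional>
#include <string>

namespace gpu
{
// Hypothetical subset of the device type enum; the real one may differ.
enum class DeviceIdentification
{
    DEVICE_GPU,
    DEVICE_SMA,
    DEVICE_PCIE,
};
} // namespace gpu

// Returns a PhysicalContext value only for device types that have one.
// The string below is an assumed spelling, not copied from the PDI.
std::optional<std::string> deviceTypeToPhysicalContext(
    gpu::DeviceIdentification type)
{
    switch (type)
    {
        case gpu::DeviceIdentification::DEVICE_GPU:
            return "xyz.openbmc_project.Common.PhysicalContext."
                   "PhysicalContextType.GPU";
        default:
            // SMA and PCIe devices get no PhysicalContext interface.
            return std::nullopt;
    }
}
```

A sensor constructor could then create the interface only when the helper returns a value, which matches the commit's note that SMA sensors expose no PhysicalContext interface.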
7427aeef	17-Oct-2025	Harshit Aghera <haghera@nvidia.com>	nvidia-gpu: add ConnectX Ethernet Port Metrics Add xyz.openbmc_project.Metric.Value interface for each of the following Ethernet port metrics of a ConnectX device. - TXBytes - RXBytes - RXMulticastFrames - TXMulticastFrames - RXUnicastFrames - TXUnicastFrames - RXBroadcastFrames - TXBroadcastFrames - RXFCSErrors - RXFrameAlignmentErrors - RXFalseCarrierErrors - RXUndersizeFrames - RXOversizeFrames - RXPauseXONFrames - RXPauseXOFFFrames - TXPauseXONFrames - TXPauseXOFFFrames - TXSingleCollisions - TXMultipleCollisions - TXLateCollisions - TXExcessiveCollisions PDI Patch - https://gerrit.openbmc.org/c/openbmc/phosphor-dbus-interfaces/+/84847 Tested: Build an image for nvl32-obmc machine with the following patch cherry picked. https://gerrit.openbmc.org/c/openbmc/entity-manager/+/84257 https://gerrit.openbmc.org/c/openbmc/openbmc/+/85490 The openbmc patch cherry-picks the following patches that are currently under review. ``` 1. device tree https://lore.kernel.org/all/aRbLqH8pLWCQryhu@molberding.nvidia.com/ 2. mctpd patches https://github.com/CodeConstruct/mctp/pull/85 3. u-boot changes https://lore.kernel.org/openbmc/20251121-msx4-v1-0-fc0118b666c1@nvidia.com/T/#t 4. kernel changes as specified in the openbmc patch (for espi) 5. entity-manager changes https://gerrit.openbmc.org/c/openbmc/entity-manager/+/85455 6. platform-init changes https://gerrit.openbmc.org/c/openbmc/platform-init/+/85456 7.
spi changes https://lore.kernel.org/all/20251121-w25q01jv_fixup-v1-1-3d175050db73@nvidia.com/ ``` ``` root@nvl32-obmc:~# busctl tree xyz.openbmc_project.GpuSensor `- /xyz `- /xyz/openbmc_project \|- /xyz/openbmc_project/inventory \| \|- /xyz/openbmc_project/inventory/Nvidia_ConnectX_0_NIC \| \| \|- /xyz/openbmc_project/inventory/Nvidia_ConnectX_0_NIC/Port_1 \| \| `- /xyz/openbmc_project/inventory/Nvidia_ConnectX_0_NIC/Port_2 \| \|- /xyz/openbmc_project/inventory/Nvidia_ConnectX_0_PCIe \| \| \|- /xyz/openbmc_project/inventory/Nvidia_ConnectX_0_PCIe/DOWN_0 \| \| \|- /xyz/openbmc_project/inventory/Nvidia_ConnectX_0_PCIe/DOWN_1 \| \| `- /xyz/openbmc_project/inventory/Nvidia_ConnectX_0_PCIe/UP_0 \| \|- /xyz/openbmc_project/inventory/Nvidia_ConnectX_2_NIC \| \| \|- /xyz/openbmc_project/inventory/Nvidia_ConnectX_2_NIC/Port_1 \| \| `- /xyz/openbmc_project/inventory/Nvidia_ConnectX_2_NIC/Port_2 \| \|- /xyz/openbmc_project/inventory/Nvidia_ConnectX_2_PCIe \| \| \|- /xyz/openbmc_project/inventory/Nvidia_ConnectX_2_PCIe/DOWN_0 \| \| \|- /xyz/openbmc_project/inventory/Nvidia_ConnectX_2_PCIe/DOWN_1 \| \| `- /xyz/openbmc_project/inventory/Nvidia_ConnectX_2_PCIe/UP_0 \| \|- /xyz/openbmc_project/inventory/Nvidia_ConnectX_3_NIC \| \| \|- /xyz/openbmc_project/inventory/Nvidia_ConnectX_3_NIC/Port_1 \| \| `- /xyz/openbmc_project/inventory/Nvidia_ConnectX_3_NIC/Port_2 \| `- /xyz/openbmc_project/inventory/Nvidia_ConnectX_3_PCIe \| \|- /xyz/openbmc_project/inventory/Nvidia_ConnectX_3_PCIe/DOWN_0 \| \|- /xyz/openbmc_project/inventory/Nvidia_ConnectX_3_PCIe/DOWN_1 \| `- /xyz/openbmc_project/inventory/Nvidia_ConnectX_3_PCIe/UP_0 \|- /xyz/openbmc_project/metric \| \|- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_NIC_Port_1 \| \| `- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_NIC_Port_1/nic \| \| \|- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_NIC_Port_1/nic/rx_broadcast_frames \| \| \|- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_NIC_Port_1/nic/rx_bytes \| \| \|- 
/xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_NIC_Port_1/nic/rx_false_carrier_errors \| \| \|- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_NIC_Port_1/nic/rx_fcs_errors \| \| \|- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_NIC_Port_1/nic/rx_frame_alignment_errors \| \| \|- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_NIC_Port_1/nic/rx_multicast_frames \| \| \|- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_NIC_Port_1/nic/rx_oversize_frames \| \| \|- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_NIC_Port_1/nic/rx_pause_xoff_frames \| \| \|- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_NIC_Port_1/nic/rx_pause_xon_frames \| \| \|- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_NIC_Port_1/nic/rx_undersize_frames \| \| \|- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_NIC_Port_1/nic/rx_unicast_frames \| \| \|- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_NIC_Port_1/nic/tx_broadcast_frames \| \| \|- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_NIC_Port_1/nic/tx_bytes \| \| \|- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_NIC_Port_1/nic/tx_excessive_collisions \| \| \|- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_NIC_Port_1/nic/tx_late_collisions \| \| \|- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_NIC_Port_1/nic/tx_multicast_frames \| \| \|- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_NIC_Port_1/nic/tx_multiple_collisions \| \| \|- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_NIC_Port_1/nic/tx_pause_xoff_frames \| \| \|- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_NIC_Port_1/nic/tx_pause_xon_frames \| \| \|- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_NIC_Port_1/nic/tx_single_collisions \| \| `- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_NIC_Port_1/nic/tx_unicast_frames root@nvl32-obmc:~# busctl introspect xyz.openbmc_project.GpuSensor /xyz/openbmc_project/metric/port_Nvidia_ConnectX_3_NIC_Port_2/nic/rx_bytes NAME TYPE SIGNATURE RESULT/VALUE FLAGS 
org.freedesktop.DBus.Introspectable interface - - - .Introspect method - s - org.freedesktop.DBus.Peer interface - - - .GetMachineId method - s - .Ping method - - - org.freedesktop.DBus.Properties interface - - - .Get method ss v - .GetAll method s a{sv} - .Set method ssv - - .PropertiesChanged signal sa{sv}as - - xyz.openbmc_project.Association.Definitions interface - - - .Associations property a(sss) 1 "measuring" "measured_by" "/xyz/ope... emits-change xyz.openbmc_project.Metric.Value interface - - - .Unit property s "xyz.openbmc_project.Metric.Value.Uni... emits-change .Value property d 0 emits-change ``` Change-Id: I30123e35b759182039cb6f25526fafe733c0f354 Signed-off-by: Harshit Aghera <haghera@nvidia.com> Modified files: NvidiaDeviceDiscovery.cpp NvidiaDeviceDiscovery.hpp NvidiaEthPort.cpp NvidiaEthPort.hpp NvidiaGpuMctpVdm.cpp NvidiaGpuMctpVdm.hpp NvidiaPcieDevice.cpp NvidiaPcieDevice.hpp OcpMctpVdm.cpp OcpMctpVdm.hpp meson.build /openbmc/dbus-sensors/src/tests/test_NvidiaGpuSensorTest.cpp
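The metric object paths in the busctl tree of 7427aeef follow a visible pattern: a `port_` prefix, the device and port names joined by an underscore, a group segment such as `nic`, then the metric name. A small sketch of that pattern follows; the helper name `makePortMetricPath` and its parameters are hypothetical, and the commit may assemble the paths differently.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper that reproduces the path layout seen in the
// busctl output; it is not taken from the nvidia-gpu sources.
std::string makePortMetricPath(std::string_view device, std::string_view port,
                               std::string_view group, std::string_view metric)
{
    std::string path = "/xyz/openbmc_project/metric/port_";
    path += device; // e.g. "Nvidia_ConnectX_0_NIC"
    path += '_';
    path += port; // e.g. "Port_1"
    path += '/';
    path += group; // e.g. "nic"
    path += '/';
    path += metric; // e.g. "rx_bytes"
    return path;
}
```

Calling it with `"Nvidia_ConnectX_0_NIC"`, `"Port_1"`, `"nic"` and `"rx_bytes"` yields `/xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_NIC_Port_1/nic/rx_bytes`, one of the paths listed in the tree.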
1180ed47	30-Sep-2025	Harshit Aghera <haghera@nvidia.com>	nvidia-gpu: add support for PCIe port metrics Add xyz.openbmc_project.Metric.Value interface for each of the following PCIe port metric of a ConnectX device. PCIeErrors.CorrectableErrorCount PCIeErrors.NonFatalErrorCount PCIeErrors.FatalErrorCount PCIeErrors.L0ToRecoveryCount PCIeErrors.ReplayCount PCIeErrors.ReplayRolloverCount PCIeErrors.NAKSentCount PCIeErrors.NAKReceivedCount PCIeErrors.UnsupportedRequestCount PDI Patch - https://gerrit.openbmc.org/c/openbmc/phosphor-dbus-interfaces/+/84839 Tested: Build an image for nvl32-obmc machine with the following patch cherry picked. https://gerrit.openbmc.org/c/openbmc/openbmc/+/85490 The patch cherry-picks the following patches that are currently under review. ``` 1. device tree https://lore.kernel.org/all/aRbLqH8pLWCQryhu@molberding.nvidia.com/ 2. mctpd patches https://github.com/CodeConstruct/mctp/pull/85 3. u-boot changes https://lore.kernel.org/openbmc/20251121-msx4-v1-0-fc0118b666c1@nvidia.com/T/#t 4. kernel changes as specified in the openbmc patch (for espi) 5. entity-manager changes https://gerrit.openbmc.org/c/openbmc/entity-manager/+/85455 6. platform-init changes https://gerrit.openbmc.org/c/openbmc/platform-init/+/85456 7.
spi changes https://lore.kernel.org/all/20251121-w25q01jv_fixup-v1-1-3d175050db73@nvidia.com/ ``` ``` root@nvl32-obmc:~# busctl tree xyz.openbmc_project.GpuSensor `- /xyz `- /xyz/openbmc_project \|- /xyz/openbmc_project/inventory \| \|- /xyz/openbmc_project/inventory/Nvidia_ConnectX_0_PCIe \| \| \|- /xyz/openbmc_project/inventory/Nvidia_ConnectX_0_PCIe/DOWN_0 \| \| \|- /xyz/openbmc_project/inventory/Nvidia_ConnectX_0_PCIe/DOWN_1 \| \| `- /xyz/openbmc_project/inventory/Nvidia_ConnectX_0_PCIe/UP_0 \| \|- /xyz/openbmc_project/inventory/Nvidia_ConnectX_2_PCIe \| \| \|- /xyz/openbmc_project/inventory/Nvidia_ConnectX_2_PCIe/DOWN_0 \| \| \|- /xyz/openbmc_project/inventory/Nvidia_ConnectX_2_PCIe/DOWN_1 \| \| `- /xyz/openbmc_project/inventory/Nvidia_ConnectX_2_PCIe/UP_0 \| `- /xyz/openbmc_project/inventory/Nvidia_ConnectX_3_PCIe \| \|- /xyz/openbmc_project/inventory/Nvidia_ConnectX_3_PCIe/DOWN_0 \| \|- /xyz/openbmc_project/inventory/Nvidia_ConnectX_3_PCIe/DOWN_1 \| `- /xyz/openbmc_project/inventory/Nvidia_ConnectX_3_PCIe/UP_0 \|- /xyz/openbmc_project/metric \| \|- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_PCIe_DOWN_0 \| \| `- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_PCIe_DOWN_0/pcie \| \| \|- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_PCIe_DOWN_0/pcie/correctable_error_count \| \| \|- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_PCIe_DOWN_0/pcie/fatal_error_count \| \| \|- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_PCIe_DOWN_0/pcie/l0_to_recovery_count \| \| \|- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_PCIe_DOWN_0/pcie/nak_received_count \| \| \|- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_PCIe_DOWN_0/pcie/nak_sent_count \| \| \|- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_PCIe_DOWN_0/pcie/non_fatal_error_count \| \| \|- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_PCIe_DOWN_0/pcie/replay_count \| \| \|- /xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_PCIe_DOWN_0/pcie/replay_rollover_count \| \| `- 
/xyz/openbmc_project/metric/port_Nvidia_ConnectX_0_PCIe_DOWN_0/pcie/unsupported_request_count root@nvl32-obmc:~# busctl introspect xyz.openbmc_project.GpuSensor /xyz/openbmc_project/metric/port_Nvidia_ConnectX_3_PCIe_DOWN_1/pcie/l0_to_recovery_count NAME TYPE SIGNATURE RESULT/VALUE FLAGS org.freedesktop.DBus.Introspectable interface - - - .Introspect method - s - org.freedesktop.DBus.Peer interface - - - .GetMachineId method - s - .Ping method - - - org.freedesktop.DBus.Properties interface - - - .Get method ss v - .GetAll method s a{sv} - .Set method ssv - - .PropertiesChanged signal sa{sv}as - - xyz.openbmc_project.Association.Definitions interface - - - .Associations property a(sss) 1 "measuring" "measured_by" "/xyz/ope... emits-change xyz.openbmc_project.Metric.Value interface - - - .Unit property s "xyz.openbmc_project.Metric.Value.Uni... emits-change .Value property d 1 emits-change ``` Change-Id: I3379c09346653d6a6bf2921bf765f0adf5a22098 Signed-off-by: Harshit Aghera <haghera@nvidia.com> Modified files: /openbmc/dbus-sensors/src/mctp/MCTPDeviceRepository.hpp /openbmc/dbus-sensors/src/mctp/MCTPEndpoint.cpp /openbmc/dbus-sensors/src/mctp/MCTPEndpoint.hpp /openbmc/dbus-sensors/src/mctp/MCTPReactor.cpp /openbmc/dbus-sensors/src/mctp/MCTPReactor.hpp NvidiaGpuSensorMain.cpp NvidiaPcieDevice.cpp NvidiaPcieDevice.hpp NvidiaPciePortMetrics.cpp NvidiaPciePortMetrics.hpp NvidiaUtils.hpp meson.build /openbmc/dbus-sensors/src/tests/test_MCTPReactor.cpp
b341fa2b	02-Dec-2025	Harshit Aghera <haghera@nvidia.com>	nvidia-gpu: enable gpu software inventory The patch uses the MCTP VDM command to retrieve the GPU driver version and updates the DBus interface xyz.openbmc_project.Software.Version with this information at DBus object path /xyz/openbmc_project/software/. The patch also associates software inventory to the chassis inventory item. The GPU driver version is made available in Redfish at the URI /redfish/v1/UpdateService/FirmwareInventory/. Tested: Build an image for nvl32-obmc machine with the following patches cherry picked. https://gerrit.openbmc.org/c/openbmc/openbmc/+/85490 The patch cherry-picks the following patches that are currently under review. ``` 1. device tree https://lore.kernel.org/all/aRbLqH8pLWCQryhu@molberding.nvidia.com/ 2. mctpd patches https://github.com/CodeConstruct/mctp/pull/85 3. u-boot changes https://lore.kernel.org/openbmc/20251121-msx4-v1-0-fc0118b666c1@nvidia.com/T/#t 4. kernel changes as specified in the openbmc patch (for espi) 5. entity-manager changes https://gerrit.openbmc.org/c/openbmc/entity-manager/+/85455 6. platform-init changes https://gerrit.openbmc.org/c/openbmc/platform-init/+/85456 7. spi changes https://lore.kernel.org/all/20251121-w25q01jv_fixup-v1-1-3d175050db73@nvidia.com/ ``` The GPU driver version shows up on the DBus. Change-Id: I712fe0952a02f36e386d3f37a5d4a8192ba641de Signed-off-by: Harshit Aghera <haghera@nvidia.com> Modified files:
/openbmc/dbus-sensors/src/PwmSensor.cpp NvidiaDriverInformation.cpp NvidiaDriverInformation.hpp NvidiaGpuDevice.cpp NvidiaGpuDevice.hpp NvidiaGpuMctpVdm.cpp NvidiaGpuMctpVdm.hpp NvidiaGpuSensorMain.cpp meson.build /openbmc/dbus-sensors/src/nvme/NVMeBasicContext.cpp /openbmc/dbus-sensors/src/nvme/NVMeSensor.cpp /openbmc/dbus-sensors/src/nvme/NVMeSensor.hpp /openbmc/dbus-sensors/src/nvme/NVMeSensorMain.cpp /openbmc/dbus-sensors/src/tests/test_NvidiaGpuSensorTest.cpp
68a8e2dd	29-Sep-2025	Harshit Aghera <haghera@nvidia.com>	nvidia-gpu: add support for PCIe port telemetry Add xyz.openbmc_project.Inventory.Connector.Port Interface for each PCIe port of a ConnectX device. PDI patches to extend the xyz.openbmc_project.Inventory.Connector.Port Interface - https://gerrit.openbmc.org/c/openbmc/phosphor-dbus-interfaces/+/84653 https://gerrit.openbmc.org/c/openbmc/phosphor-dbus-interfaces/+/84652 Tested: Build an image for nvl32-obmc machine with the following patch cherry picked. https://gerrit.openbmc.org/c/openbmc/openbmc/+/85490 The patch cherry-picks the following patches that are currently under review. ``` 1. device tree https://lore.kernel.org/all/aRbLqH8pLWCQryhu@molberding.nvidia.com/ 2. mctpd patches https://github.com/CodeConstruct/mctp/pull/85 3. u-boot changes https://lore.kernel.org/openbmc/20251121-msx4-v1-0-fc0118b666c1@nvidia.com/T/#t 4. kernel changes as specified in the openbmc patch (for espi) 5. entity-manager changes https://gerrit.openbmc.org/c/openbmc/entity-manager/+/85455 6. platform-init changes https://gerrit.openbmc.org/c/openbmc/platform-init/+/85456 7.
spi changes https://lore.kernel.org/all/20251121-w25q01jv_fixup-v1-1-3d175050db73@nvidia.com/ ``` ``` root@nvl32-obmc:~# busctl tree xyz.openbmc_project.GpuSensor `- /xyz `- /xyz/openbmc_project \|- /xyz/openbmc_project/inventory \| `- /xyz/openbmc_project/inventory/pcie_devices \| \|- /xyz/openbmc_project/inventory/pcie_devices/Nvidia_ConnectX_0 \| \| \|- /xyz/openbmc_project/inventory/pcie_devices/Nvidia_ConnectX_0/DOWN_0 \| \| \|- /xyz/openbmc_project/inventory/pcie_devices/Nvidia_ConnectX_0/DOWN_1 \| \| `- /xyz/openbmc_project/inventory/pcie_devices/Nvidia_ConnectX_0/UP_0 \| \|- /xyz/openbmc_project/inventory/pcie_devices/Nvidia_ConnectX_1 \| \| \|- /xyz/openbmc_project/inventory/pcie_devices/Nvidia_ConnectX_1/DOWN_0 \| \| \|- /xyz/openbmc_project/inventory/pcie_devices/Nvidia_ConnectX_1/DOWN_1 \| \| `- /xyz/openbmc_project/inventory/pcie_devices/Nvidia_ConnectX_1/UP_0 \| \|- /xyz/openbmc_project/inventory/pcie_devices/Nvidia_ConnectX_2 \| \| \|- /xyz/openbmc_project/inventory/pcie_devices/Nvidia_ConnectX_2/DOWN_0 \| \| \|- /xyz/openbmc_project/inventory/pcie_devices/Nvidia_ConnectX_2/DOWN_1 \| \| `- /xyz/openbmc_project/inventory/pcie_devices/Nvidia_ConnectX_2/UP_0 \| `- /xyz/openbmc_project/inventory/pcie_devices/Nvidia_ConnectX_3 \| \|- /xyz/openbmc_project/inventory/pcie_devices/Nvidia_ConnectX_3/DOWN_0 \| \|- /xyz/openbmc_project/inventory/pcie_devices/Nvidia_ConnectX_3/DOWN_1 \| `- /xyz/openbmc_project/inventory/pcie_devices/Nvidia_ConnectX_3/UP_0 `- /xyz/openbmc_project/sensors root@nvl32-obmc:~# busctl -l introspect xyz.openbmc_project.GpuSensor /xyz/openbmc_project/inventory/pcie_devices/Nvidia_ConnectX_1/DOWN_0 NAME TYPE SIGNATURE RESULT/VALUE FLAGS org.freedesktop.DBus.Introspectable interface - - - .Introspect method - s - org.freedesktop.DBus.Peer interface - - - .GetMachineId method - s - .Ping method - - - org.freedesktop.DBus.Properties interface - - - .Get method ss v - .GetAll method s a{sv} - .Set method ssv - - .PropertiesChanged signal 
sa{sv}as - - xyz.openbmc_project.Association.Definitions interface - - - .Associations property a(sss) 1 "connected_to" "connecting" "/xyz/openbmc_project/inventory/pcie_devices/Nvidia_ConnectX_1" emits-change xyz.openbmc_project.Inventory.Connector.Port interface - - - .PortProtocol property s "xyz.openbmc_project.Inventory.Connector.Port.PortProtocol.PCIe" emits-change .PortType property s "xyz.openbmc_project.Inventory.Connector.Port.PortType.DownstreamPort" emits-change .Speed property t 34359738368 emits-change .Width property u 16 emits-change ``` Change-Id: I2845f090ac92c8ff6a742ec83c23073e6ea4e1b6 Signed-off-by: Harshit Aghera <haghera@nvidia.com> Modified files: /openbmc/dbus-sensors/src/SensorPaths.cpp /openbmc/dbus-sensors/src/SensorPaths.hpp /openbmc/dbus-sensors/src/external/ExternalSensor.cpp /openbmc/dbus-sensors/src/external/ExternalSensorMain.cpp /openbmc/dbus-sensors/src/fan/FanMain.cpp /openbmc/dbus-sensors/src/leakdetector/LeakDetectionManager.cpp /openbmc/dbus-sensors/src/leakdetector/LeakGPIODetector.cpp /openbmc/dbus-sensors/src/leakdetector/LeakGPIODetector.hpp /openbmc/dbus-sensors/src/mctp/MCTPReactor.cpp NvidiaGpuMctpVdm.cpp NvidiaGpuMctpVdm.hpp NvidiaPcieDevice.cpp NvidiaPcieDevice.hpp NvidiaPciePort.cpp NvidiaPciePort.hpp meson.build /openbmc/dbus-sensors/src/psu/PSUSensorMain.cpp /openbmc/dbus-sensors/src/tests/OWNERS /openbmc/dbus-sensors/src/tests/test_MCTPReactor.cpp /openbmc/dbus-sensors/src/tests/test_NvidiaGpuSensorTest.cpp
6ef89739	21-Oct-2025	Ed Tanous <etanous@nvidia.com>	nvidia-gpu: Use common class for mctp endpoints The common endpoint class should be used for send and receive. Tested: Used by MctpRequester, tested on nvl32-obmc Change-Id: I0060a66a5bcb4decfbe663d46ba88529e01e2209 Signed-off-by: Ed Tanous <etanous@nvidia.com> Modified files: /openbmc/dbus-sensors/src/MctpAsioEndpoint.cpp /openbmc/dbus-sensors/src/MctpAsioEndpoint.hpp /openbmc/dbus-sensors/src/adc/meson.build /openbmc/dbus-sensors/src/cable-monitor/meson.build /openbmc/dbus-sensors/src/exit-air/meson.build /openbmc/dbus-sensors/src/external/meson.build /openbmc/dbus-sensors/src/fan/meson.build /openbmc/dbus-sensors/src/hwmon-temp/meson.build /openbmc/dbus-sensors/src/intel-cpu/meson.build /openbmc/dbus-sensors/src/intrusion/meson.build /openbmc/dbus-sensors/src/ipmb/meson.build /openbmc/dbus-sensors/src/leakdetector/meson.build /openbmc/dbus-sensors/src/mctp/meson.build /openbmc/dbus-sensors/src/mcu/meson.build /openbmc/dbus-sensors/src/meson.build MctpRequester.cpp meson.build /openbmc/dbus-sensors/src/nvme/meson.build /openbmc/dbus-sensors/src/psu/meson.build /openbmc/dbus-sensors/src/smbpbi/meson.build /openbmc/dbus-sensors/src/tests/meson.build
964057d1	17-Nov-2025	George Liu <liuxiwei@ieisystem.com>	Remove redundant is_method_error() checks The handlers registered through sdbusplus::bus::match_t only receive D-Bus signals. Signal messages are never sent as method-error replies, and therefore message.is_method_error() can never be true in these callbacks. This change removes all unnecessary is_method_error() checks from signal handlers to simplify the code and avoid confusion. Change-Id: I43e4a564c1bf401a5da9819dd201464e4a59c871 Signed-off-by: George Liu <liuxiwei@ieisystem.com> Modified files: /openbmc/dbus-sensors/src/adc/ADCSensorMain.cpp /openbmc/dbus-sensors/src/external/ExternalSensorMain.cpp /openbmc/dbus-sensors/src/fan/FanMain.cpp /openbmc/dbus-sensors/src/hwmon-temp/HwmonTempMain.cpp /openbmc/dbus-sensors/src/intel-cpu/IntelCPUSensorMain.cpp /openbmc/dbus-sensors/src/intrusion/IntrusionSensorMain.cpp /openbmc/dbus-sensors/src/ipmb/IpmbSensor.cpp NvidiaDeviceDiscovery.cpp /openbmc/dbus-sensors/src/nvme/NVMeSensorMain.cpp /openbmc/dbus-sensors/src/psu/PSUSensorMain.cpp
33ba62c7	07-Nov-2025	Harshit Aghera <haghera@nvidia.com>	request maintainer role for nvidia-gpu I have actively contributed to and reviewed patches for nvidia-gpu application since its inception in May 2025. Additionally, I have contributed and reviewed phosphor-dbus-interfaces and bmcweb patches related to nvidia-gpu application. Change-Id: I8eca227699b09c5cdb49495d5237a545c8609e86 Signed-off-by: Harshit Aghera <haghera@nvidia.com> Modified files: /openbmc/dbus-sensors/src/hwmon-temp/HwmonTempMain.cpp OWNERS
77239da5	24-Nov-2025	Ed Tanous <etanous@nvidia.com>	Fix test build if nvidia-gpu is disabled When nvidia-gpu is disabled, unit tests don't build because of the shared gpusensor_sources variable. Make a quick fix to fix the build. Going forward we may need the option checking to be put into the sub meson file rather than the top level, so that unit test deps can build separately. Change-Id: Ib87487fe15e80df44afbd9c3421163c6fbc16f74 Signed-off-by: Ed Tanous <etanous@nvidia.com> Modified files: /openbmc/dbus-sensors/src/cable-monitor/OWNERS /openbmc/dbus-sensors/src/hwmon-temp/HwmonTempMain.cpp /openbmc/dbus-sensors/src/leakdetector/OWNERS /openbmc/dbus-sensors/src/meson.build meson.build
064e6ff7	27-Oct-2025	Deepak Kodihalli <deepak.kodihalli.83@gmail.com>	nvidia-gpu: fix GPU power PeakReading PDI usage The GPU power peak reading, which uses the Telemetry.Report PDI, was relying on a string ("PeakReading") to expose the reading. This string is Redfish specific. Instead, use the OperationType.Maximum enum defined in the PDI. Bmcweb code can map this to PeakReading. Tested: Build an image for nvl32-obmc machine with the following patches cherry picked. https://gerrit.openbmc.org/c/openbmc/openbmc/+/85490 https://gerrit.openbmc.org/c/openbmc/bmcweb/+/82449. The patch cherry-picks the following patches that are currently under review. ``` 1. device tree https://lore.kernel.org/all/aRbLqH8pLWCQryhu@molberding.nvidia.com/ 2. mctpd patches https://github.com/CodeConstruct/mctp/pull/85 3. u-boot changes https://lore.kernel.org/openbmc/20251121-msx4-v1-0-fc0118b666c1@nvidia.com/T/#t 4. kernel changes as specified in the openbmc patch (for espi) 5. entity-manager changes https://gerrit.openbmc.org/c/openbmc/entity-manager/+/85455 6. platform-init changes https://gerrit.openbmc.org/c/openbmc/platform-init/+/85456 7. spi changes https://lore.kernel.org/all/20251121-w25q01jv_fixup-v1-1-3d175050db73@nvidia.com/ ``` The GPU Power PeakReading is correctly reported on DBus and on redfish. Change-Id: I39b2b4987d845f878ffdedcfdb02cdfdc02a4499 Signed-off-by: Deepak Kodihalli <deepak.kodihalli.83@gmail.com> Signed-off-by: Harshit Aghera <haghera@nvidia.com> Modified files: NvidiaGpuPowerPeakReading.cpp
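The split that 064e6ff7 describes, with the sensor daemon publishing a protocol-neutral enum and the Redfish layer choosing the name, could be sketched as below. Everything here is illustrative: the local `OperationType` enum only mirrors what the commit calls `OperationType.Maximum` (the other members are assumed), and `toRedfishReadingName` is a hypothetical function, not bmcweb code.

```cpp
#include <optional>
#include <string_view>

// Hypothetical local mirror of the PDI enum. Only Maximum is named in
// the commit message; the remaining members are assumptions.
enum class OperationType
{
    Maximum,
    Minimum,
    Average,
    Summation,
};

// Hypothetical Redfish-side mapping: the daemon never emits the string
// "PeakReading" itself, the consumer derives it from the enum.
constexpr std::optional<std::string_view> toRedfishReadingName(
    OperationType op)
{
    if (op == OperationType::Maximum)
    {
        return "PeakReading";
    }
    // No Redfish name is assumed for the other operation types here.
    return std::nullopt;
}
```

Keeping the Redfish-specific string out of the D-Bus layer means another consumer of the same property can apply its own naming without the daemon changing.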
e0b80e1e	28-Aug-2025	Harshit Aghera <haghera@nvidia.com>	nvidia-gpu: add support for ConnectX device Add support to discover ConnectX devices and to populate PCIe interface properties using Phosphor DBus Interface xyz.openbmc_project.Inventory.Item.PCIeDevice. ConnectX device has an integrated PCIe Switch. The patch uses xyz.openbmc_project.Inventory.Item.PCIeSwitch PDI to define the PCIe Switch resource. Tested: Build an image for nvl32-obmc machine with the following patch cherry picked. https://gerrit.openbmc.org/c/openbmc/openbmc/+/85490 The patch cherry-picks the following patches that are currently under review. ``` 1. device tree https://lore.kernel.org/all/aRbLqH8pLWCQryhu@molberding.nvidia.com/ 2. mctpd patches https://github.com/CodeConstruct/mctp/pull/85 3. u-boot changes https://lore.kernel.org/openbmc/20251121-msx4-v1-0-fc0118b666c1@nvidia.com/T/#t 4. kernel changes as specified in the openbmc patch (for espi) 5. entity-manager changes https://gerrit.openbmc.org/c/openbmc/entity-manager/+/85455 6. platform-init changes https://gerrit.openbmc.org/c/openbmc/platform-init/+/85456 7.
spi changes https://lore.kernel.org/all/20251121-w25q01jv_fixup-v1-1-3d175050db73@nvidia.com/ ``` ``` root@nvl32-bmc:~# busctl tree xyz.openbmc_project.GpuSensor `- /xyz `- /xyz/openbmc_project \|- /xyz/openbmc_project/inventory \| `- /xyz/openbmc_project/inventory/pcie_devices \| \|- /xyz/openbmc_project/inventory/pcie_devices/Nvidia_ConnectX_0 \| \|- /xyz/openbmc_project/inventory/pcie_devices/Nvidia_ConnectX_1 \| \|- /xyz/openbmc_project/inventory/pcie_devices/Nvidia_ConnectX_2 \| `- /xyz/openbmc_project/inventory/pcie_devices/Nvidia_ConnectX_3 root@nvl32-obmc:~# busctl introspect xyz.openbmc_project.GpuSensor /xyz/openbmc_project/inventory/pcie_devices/Nvidia_ConnectX_0 NAME TYPE SIGNATURE RESULT/VALUE FLAGS org.freedesktop.DBus.Introspectable interface - - - .Introspect method - s - org.freedesktop.DBus.Peer interface - - - .GetMachineId method - s - .Ping method - - - org.freedesktop.DBus.Properties interface - - - .Get method ss v - .GetAll method s a{sv} - .Set method ssv - - .PropertiesChanged signal sa{sv}as - - xyz.openbmc_project.Inventory.Item.PCIeDevice interface - - - .GenerationInUse property s "xyz.openbmc_project.Inventory.Item.P... emits-change .GenerationSupported property s "xyz.openbmc_project.Inventory.Item.P... 
emits-change .LanesInUse property u 8 emits-change .MaxLanes property u 16 emits-change xyz.openbmc_project.Inventory.Item.PCIeSwitch interface - - - $ curl -s -k -u 'root:0penBmc' https://${bmc_ip}/redfish/v1/Systems/system/PCIeDevices/Nvidia_ConnectX_0 { "@odata.id": "/redfish/v1/Systems/system/PCIeDevices/Nvidia_ConnectX_0", "@odata.type": "#PCIeDevice.v1_19_0.PCIeDevice", "Id": "Nvidia_ConnectX_0", "Name": "PCIe Device", "PCIeFunctions": { "@odata.id": "/redfish/v1/Systems/system/PCIeDevices/Nvidia_ConnectX_0/PCIeFunctions" }, "PCIeInterface": { "LanesInUse": 8, "MaxLanes": 16, "MaxPCIeType": "Gen5", "PCIeType": "Gen5" }, "Status": { "Health": "OK", "State": "Enabled" } }% ``` Change-Id: Id89ce8a298ebb16934e94efcb9ca4679f91a7b26 Signed-off-by: Harshit Aghera <haghera@nvidia.com> /openbmc/dbus-sensors/src/adc/meson.build /openbmc/dbus-sensors/src/cable-monitor/meson.build /openbmc/dbus-sensors/src/exit-air/meson.build /openbmc/dbus-sensors/src/external/meson.build /openbmc/dbus-sensors/src/fan/meson.build /openbmc/dbus-sensors/src/hwmon-temp/meson.build /openbmc/dbus-sensors/src/intel-cpu/meson.build /openbmc/dbus-sensors/src/intrusion/meson.build /openbmc/dbus-sensors/src/ipmb/meson.build /openbmc/dbus-sensors/src/leakdetector/meson.build /openbmc/dbus-sensors/src/mctp/MCTPDeviceRepository.hpp /openbmc/dbus-sensors/src/mctp/MCTPReactor.cpp /openbmc/dbus-sensors/src/mcu/meson.build /openbmc/dbus-sensors/src/meson.build NvidiaDeviceDiscovery.cpp NvidiaDeviceDiscovery.hpp NvidiaGpuMctpVdm.cpp NvidiaGpuMctpVdm.hpp NvidiaGpuSensorMain.cpp NvidiaPcieDevice.cpp NvidiaPcieDevice.hpp NvidiaPcieInterface.cpp NvidiaPcieInterface.hpp meson.build /openbmc/dbus-sensors/src/nvme/meson.build /openbmc/dbus-sensors/src/psu/meson.build /openbmc/dbus-sensors/src/smbpbi/meson.build /openbmc/dbus-sensors/src/tests/meson.build
db74edb9	29-Sep-2025	Ed Tanous <etanous@nvidia.com>	nvidia-gpu: move unused member Move MaxMessageSize to where it's used Change-Id: I6c45157e6e3e52672cab86c82af1ea45a3628d19 Signed-off-by: Ed Tanous <etanous@nvidia.com> MctpRequester.hpp
779d84f0	29-Sep-2025	Ed Tanous <etanous@nvidia.com>	nvidia-gpu: Declare send endpoint on stack There's no reason to store this small class in between transactions. Just construct on stack as part of the send. Tested: On last patchset in series Change-Id: I00090942665f022bfa2552b9c31c7c3da000646b Signed-off-by: Ed Tanous <etanous@nvidia.com> /openbmc/dbus-sensors/src/cable-monitor/CableMonitor.cpp MctpRequester.cpp MctpRequester.hpp /openbmc/dbus-sensors/src/psu/PSUSensorMain.cpp /openbmc/dbus-sensors/src/smbpbi/SmbpbiSensor.cpp /openbmc/dbus-sensors/src/smbpbi/SmbpbiSensor.hpp
b5e823f7	09-Oct-2025	Ed Tanous <ed@tanous.net>	Change copyright to match linux foundation We should use SPDX identifiers wherever possible for simplification. Change-Id: If3a7bfe506d7fded64a3ac929cc643834b16303e Signed-off-by: Ed Tanous <etanous@nvidia.com> MctpRequester.cpp MctpRequester.hpp NvidiaDeviceDiscovery.cpp NvidiaDeviceDiscovery.hpp NvidiaGpuDevice.cpp NvidiaGpuDevice.hpp NvidiaGpuEnergySensor.cpp NvidiaGpuEnergySensor.hpp NvidiaGpuMctpVdm.cpp NvidiaGpuMctpVdm.hpp NvidiaGpuPowerPeakReading.cpp NvidiaGpuPowerPeakReading.hpp NvidiaGpuPowerSensor.cpp NvidiaGpuPowerSensor.hpp NvidiaGpuSensor.cpp NvidiaGpuSensor.hpp NvidiaGpuSensorMain.cpp NvidiaGpuVoltageSensor.cpp NvidiaGpuVoltageSensor.hpp NvidiaSmaDevice.cpp NvidiaSmaDevice.hpp OcpMctpVdm.cpp OcpMctpVdm.hpp /openbmc/dbus-sensors/src/smbpbi/SmbpbiSensor.cpp /openbmc/dbus-sensors/src/smbpbi/SmbpbiSensor.hpp /openbmc/dbus-sensors/src/tests/test_NvidiaGpuSensorTest.cpp
3f6bc731	23-Jul-2025	Harshit Aghera <haghera@nvidia.com>	nvidia-gpu: add TLimit sensor properties Add support for DMTF Redfish properties ReadingBasis and Implementation for GPU TLimit sensor [1]. Property Implementation for TLimit is set to Synthesized because the GPU incorporates intelligent logic that determines the temperature delta from the first thermal management software slowdown event. TLimit is derived from other reported GPU sensors, such as HBM, Tavg, and others. DBus Interface definition - https://gerrit.openbmc.org/c/openbmc/phosphor-dbus-interfaces/+/81658 Tested: Build an image for gb200nvl-obmc machine with the following patches cherry picked. These patches are needed to enable the mctp stack. https://gerrit.openbmc.org/c/openbmc/openbmc/+/79422 ``` > curl -s -k -u 'root:0penBmc' https://10.137.203.137/redfish/v1/Chassis/NVIDIA_GB200_1/Sensors/temperature_NVIDIA_GB200_GPU_0_TEMP_1 { "@odata.id": "/redfish/v1/Chassis/NVIDIA_GB200_1/Sensors/temperature_NVIDIA_GB200_GPU_0_TEMP_1", "@odata.type": "#Sensor.v1_2_0.Sensor", "Description": "Thermal Limit(TLIMIT) Temperature is the distance in deg C from the GPU temperature to the first throttle limit.", "Id": "temperature_NVIDIA_GB200_GPU_0_TEMP_1", "Implementation": "Synthesized", "Name": "NVIDIA GB200 GPU 0 TEMP 1", "Reading": 56.59375, "ReadingBasis": "Headroom", "ReadingRangeMax": 127.0, "ReadingRangeMin": -128.0, "ReadingType": "Temperature", "ReadingUnits": "Cel", "Status": { "Health": "OK", "State": "Enabled" } }% root@gb200nvl-obmc:~# busctl introspect xyz.openbmc_project.GpuSensor /xyz/openbmc_project/sensors/temperature/NVIDIA_GB200_GPU_0_TEMP_1 NAME TYPE SIGNATURE RESULT/VALUE FLAGS org.freedesktop.DBus.Introspectable interface - - - .Introspect method - s - org.freedesktop.DBus.Peer interface - - - 
.GetMachineId method - s - .Ping method - - - org.freedesktop.DBus.Properties interface - - - .Get method ss v - .GetAll method s a{sv} - .Set method ssv - - .PropertiesChanged signal sa{sv}as - - xyz.openbmc_project.Association.Definitions interface - - - .Associations property a(sss) 1 "chassis" "all_sensors" "/xyz/openb... emits-change xyz.openbmc_project.Inventory.Item interface - - - .PrettyName property s "Thermal Limit(TLIMIT) Temperature is... emits-change xyz.openbmc_project.Sensor.Type interface - - - .Implementation property s "xyz.openbmc_project.Sensor.Type.Impl... emits-change .ReadingBasis property s "xyz.openbmc_project.Sensor.Type.Read... emits-change xyz.openbmc_project.Sensor.Value interface - - - .MaxValue property d 127 emits-change .MinValue property d -128 emits-change .Unit property s "xyz.openbmc_project.Sensor.Value.Uni... emits-change .Value property d 56.6836 emits-change writable xyz.openbmc_project.Sensor.ValueMutability interface - - - .Mutable property b true emits-change xyz.openbmc_project.State.Decorator.Availability interface - - - .Available property b true emits-change writable xyz.openbmc_project.State.Decorator.OperationalStatus interface - - - .Functional property b true emits-change ``` [1] : https://redfish.dmtf.org/schemas/v1/Sensor.v1_11_0.yaml Change-Id: I1a16ced44c563794d561d26232a5e5fba041b875 Signed-off-by: Harshit Aghera <haghera@nvidia.com> /openbmc/dbus-sensors/src/exit-air/ExitAirTempSensor.cpp NvidiaGpuSensor.cpp NvidiaGpuSensor.hpp /openbmc/dbus-sensors/src/psu/PSUSensorMain.cpp
1851f645	29-Sep-2025	Marc Olberding <molberding@nvidia.com>	nvidia-gpu: Fix thresholds for GPU_TEMP_1 Fixes thresholds for GPU_TEMP_1 to be upper critical, warning, and shutdown, rather than lower critical, et al. Change-Id: I580766288f3d27a48c75f00ea1dab13f0284bed6 Signed-off-by: Marc Olberding <molberding@nvidia.com> NvidiaGpuDevice.hpp
fd4a3779	24-Sep-2025	Marc Olberding <molberding@nvidia.com>	nvidia-gpu: Fix a number of object lifetime issues Moves all subsensors and objects treated as shared_ptrs to be using shared_from_this. This way, if there's an object lifetime issue we don't segfault. Also separates construction and asio init for NvidiaSmaDevice so that when we bind to this, it's valid after we leave the ctor. Change-Id: I8e3115bc276d2e0eaac0b1dc9a9d2c46e6751d4b Signed-off-by: Marc Olberding <molberding@nvidia.com> Inventory.cpp NvidiaDeviceDiscovery.cpp NvidiaGpuDevice.cpp NvidiaGpuEnergySensor.cpp NvidiaGpuEnergySensor.hpp NvidiaGpuPowerSensor.cpp NvidiaGpuPowerSensor.hpp NvidiaGpuSensor.cpp NvidiaGpuVoltageSensor.cpp NvidiaGpuVoltageSensor.hpp NvidiaSmaDevice.cpp NvidiaSmaDevice.hpp
6282a452	29-Sep-2025	Marc Olberding <molberding@nvidia.com>	nvidia-gpu: NvidiaGpuDevice fix use after free Fixes use after free for NvidiaGpuThresholds. Moves the storage used for communication to be part of the NvidiaGpuDevice class instead of ephemerally passed around through free functions. Also makes NvidiaGpuDevice inherit from std::enable_shared_from_this. Testing: The issue found previously was core dumps on nvl32-obmc. ASan discovered it was a use after free in the shared pointer in ThermalLimits. Afterwards, no core dumps or issues were reported by ASan. Ran on an nvl32-obmc model with 8 GPUs. Change-Id: I61b606f3a129499089718e7ec804926db5f22c64 Signed-off-by: Marc Olberding <molberding@nvidia.com> NvidiaGpuDevice.cpp NvidiaGpuDevice.hpp meson.build
ac920734	28-Sep-2025	Marc Olberding <molberding@nvidia.com>	nvidia-gpu: deferred init for NvidiaGpuDevice Adds deferred init for NvidiaGpuDevice, so that when we bind to this, the this pointer is valid, i.e. after construction is completed Change-Id: I24a53d2ab9be1a2a4431368414a154b48347d2a2 Signed-off-by: Marc Olberding <molberding@nvidia.com> Inventory.cpp Inventory.hpp NvidiaDeviceDiscovery.cpp NvidiaGpuDevice.cpp NvidiaGpuDevice.hpp
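The three lifetime fixes above (fd4a3779, 6282a452, ac920734) describe a common C++ pattern: inherit from std::enable_shared_from_this, and defer any callback registration to an init step that runs after the object is owned by a shared_ptr. The following is a minimal sketch of that pattern only. `FakeExecutor`, `SketchDevice`, and `makeDevice` are hypothetical names invented for illustration, and the real classes use asio rather than this stand-in queue.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for an event loop: stores callbacks to run later.
struct FakeExecutor
{
    std::vector<std::function<void()>> pending;

    void post(std::function<void()> fn)
    {
        pending.push_back(std::move(fn));
    }

    void run()
    {
        auto work = std::move(pending);
        pending.clear();
        for (auto& fn : work)
        {
            fn();
        }
    }
};

// Hypothetical device; not the actual NvidiaGpuDevice.
class SketchDevice : public std::enable_shared_from_this<SketchDevice>
{
  public:
    explicit SketchDevice(FakeExecutor& ex) : executor(ex) {}

    // Second phase: call only once a shared_ptr owns this object.
    // Inside the constructor, weak_from_this() would be empty.
    void init()
    {
        std::weak_ptr<SketchDevice> weak = weak_from_this();
        executor.post([weak]() {
            // If the device was destroyed first, skip instead of crashing.
            if (auto self = weak.lock())
            {
                self->polls++;
            }
        });
    }

    int polls = 0;

  private:
    FakeExecutor& executor;
};

// Construct first, then initialize, so callbacks never see a half-built object.
inline std::shared_ptr<SketchDevice> makeDevice(FakeExecutor& ex)
{
    auto dev = std::make_shared<SketchDevice>(ex);
    dev->init();
    return dev;
}
```

If the device is released before a queued callback runs, `weak.lock()` returns an empty pointer and the callback does nothing, which is the failure mode the commit messages describe as preferable to a segfault.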
d0125c9c	08-Oct-2025	Marc Olberding <molberding@nvidia.com>	nvidia-gpu: Fix up buffering in MctpRequester This change does a lot, for better or worse 1. Change MctpRequester to hold both buffers for send and receive 2. This requires changing the callback structure, so the reach is far 3. Changes error reporting to be through std::error_code 4. Collapses the QueuingRequester and Requester to be MctpRequester 5. Doing 4 gets rid of a level of indirection and an extra unordered_map 6. Adds proper iid support, which is made significantly easier by 4/5 7. Fixes issues around expiry timers where we would cancel the timer for a given request whenever a new packet would come in to be sent. This could cause lockup if a packet truly did time out and an interleaved packet finished sending. This moves each queue to have its own timer. This fixes an issue where we were receiving buffers in from clients and then binding them to receive_calls without ensuring that they are the correct message, thus when receive was called, it was called with the last bound buffer to async_receive_from. This would cause a number of issues, ranging from incorrect device discovery results to core dumps as well as incorrect sensor readings. This change moves the receive and send buffers to be owned by the MctpRequester, and a non-owning view is provided via callback to the client. All existing clients just decode in place given that buffer. Tested: loaded onto nvl32-obmc. Correct number of sensors showed up and the readings were nominal Change-Id: I67c843691ca79e9fcccfa16df6d611918f25f6ca Signed-off-by: Marc Olberding <molberding@nvidia.com> 
/openbmc/dbus-sensors/src/MctpAsioEndpoint.hpp Inventory.cpp Inventory.hpp MctpRequester.cpp MctpRequester.hpp NvidiaDeviceDiscovery.cpp NvidiaGpuDevice.cpp NvidiaGpuEnergySensor.cpp NvidiaGpuEnergySensor.hpp NvidiaGpuPowerPeakReading.cpp NvidiaGpuPowerPeakReading.hpp NvidiaGpuPowerSensor.cpp NvidiaGpuPowerSensor.hpp NvidiaGpuSensor.cpp NvidiaGpuSensor.hpp NvidiaGpuSensorMain.cpp NvidiaGpuThresholds.cpp NvidiaGpuThresholds.hpp NvidiaGpuVoltageSensor.cpp NvidiaGpuVoltageSensor.hpp
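The buffer ownership change described in d0125c9c (requester-owned buffers, a non-owning view handed to the client callback, errors reported through std::error_code) can be illustrated roughly as follows. This is a sketch under stated assumptions, not the actual MctpRequester code: `BufferView`, `SketchRequester`, `completeReceive`, and `decodeLe16` are hypothetical names, the 64-byte buffer size is arbitrary, and the socket receive is simulated by a plain copy.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <system_error>

// Hypothetical non-owning view (C++20 code could use std::span instead).
struct BufferView
{
    const std::uint8_t* data = nullptr;
    std::size_t size = 0;
};

// Hypothetical requester that owns its receive buffer.
class SketchRequester
{
  public:
    using Callback = std::function<void(const std::error_code&, BufferView)>;

    // Simulates a finished receive: copies the wire bytes into the
    // requester-owned buffer, then lends a view of it to the client.
    void completeReceive(const std::uint8_t* wire, std::size_t len,
                         const Callback& cb)
    {
        if (len > rxBuffer.size())
        {
            cb(std::make_error_code(std::errc::message_size), BufferView{});
            return;
        }
        std::copy(wire, wire + len, rxBuffer.begin());
        cb(std::error_code{}, BufferView{rxBuffer.data(), len});
    }

  private:
    std::array<std::uint8_t, 64> rxBuffer{};
};

// Client-side decode performed in place on the borrowed view.
inline bool decodeLe16(BufferView view, std::uint16_t& out)
{
    if (view.size < 2)
    {
        return false;
    }
    out = static_cast<std::uint16_t>(view.data[0] | (view.data[1] << 8));
    return true;
}
```

Because the client never supplies its own buffer, a response cannot be written into storage bound for a different request, which is the class of bug the commit message describes. The view is only valid for the duration of the callback in this sketch.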
6b712322	31-Jul-2025	Harshit Aghera <haghera@nvidia.com>	nvidia-gpu: add Power Sensor PeakReading Property Add support for Sensor Properties PeakReading and PeakReadingTime. Current Limitation - The ResetMetrics action is currently not supported for Redfish URIs in bmcweb. As a result, the ability to clear PeakReading values for GPU Power Sensors has not been implemented. Future Consideration - If ResetMetrics action support is added to bmcweb in the future, the corresponding functionality will also need to be implemented in the dbus-sensor application to ensure full compatibility. Tested: Build an image for gb200nvl-obmc machine with the following patches cherry picked. These patches are needed to enable the mctp stack. https://gerrit.openbmc.org/c/openbmc/openbmc/+/79422 ``` root@gb200nvl-obmc:~# busctl introspect xyz.openbmc_project.GpuSensor /xyz/openbmc_project/sensors/power/NVIDIA_GB200_GPU_0_Power_0 NAME TYPE SIGNATURE RESULT/VALUE FLAGS org.freedesktop.DBus.Introspectable interface - - - .Introspect method - s - org.freedesktop.DBus.Peer interface - - - .GetMachineId method - s - .Ping method - - - org.freedesktop.DBus.Properties interface - - - .Get method ss v - .GetAll method s a{sv} - .Set method ssv - - .PropertiesChanged signal sa{sv}as - - xyz.openbmc_project.Association.Definitions interface - - - .Associations property a(sss) 1 "chassis" "all_sensors" "/xyz/openb... emits-change xyz.openbmc_project.Sensor.Value interface - - - .MaxValue property d 5000 emits-change .MinValue property d 0 emits-change .Unit property s "xyz.openbmc_project.Sensor.Value.Uni... 
emits-change .Value property d 29.194 emits-change writable xyz.openbmc_project.Sensor.ValueMutability interface - - - .Mutable property b true emits-change xyz.openbmc_project.State.Decorator.Availability interface - - - .Available property b true emits-change writable xyz.openbmc_project.State.Decorator.OperationalStatus interface - - - .Functional property b true emits-change xyz.openbmc_project.Telemetry.Report interface - - - .Readings property (ta(ssdt)) 0 1 "PeakReading" "" 80.933 0 emits-change ``` Change-Id: I0a4f7eb0a5db688f32bf80954839140da9bb7e2a Signed-off-by: Harshit Aghera <haghera@nvidia.com> /openbmc/dbus-sensors/src/Thresholds.cpp /openbmc/dbus-sensors/src/Utils.hpp /openbmc/dbus-sensors/src/adc/ADCSensorMain.cpp /openbmc/dbus-sensors/src/exit-air/ExitAirTempSensor.cpp /openbmc/dbus-sensors/src/external/ExternalSensor.cpp /openbmc/dbus-sensors/src/external/ExternalSensorMain.cpp /openbmc/dbus-sensors/src/hwmon-temp/HwmonTempMain.cpp /openbmc/dbus-sensors/src/intel-cpu/IntelCPUSensor.cpp /openbmc/dbus-sensors/src/intel-cpu/IntelCPUSensorMain.cpp /openbmc/dbus-sensors/src/intrusion/ChassisIntrusionSensor.cpp /openbmc/dbus-sensors/src/intrusion/IntrusionSensorMain.cpp /openbmc/dbus-sensors/src/ipmb/IpmbSensor.cpp /openbmc/dbus-sensors/src/mcu/MCUTempSensor.cpp NvidiaGpuDevice.cpp NvidiaGpuDevice.hpp NvidiaGpuMctpVdm.cpp NvidiaGpuMctpVdm.hpp NvidiaGpuPowerPeakReading.cpp NvidiaGpuPowerPeakReading.hpp NvidiaGpuPowerSensor.cpp NvidiaGpuPowerSensor.hpp meson.build /openbmc/dbus-sensors/src/psu/PSUSensor.cpp /openbmc/dbus-sensors/src/psu/PSUSensorMain.cpp /openbmc/dbus-sensors/src/smbpbi/SmbpbiSensor.cpp /openbmc/dbus-sensors/src/tests/test_NvidiaGpuSensorTest.cpp
aba6fcac	29-Sep-2025	Ed Tanous <etanous@nvidia.com>	Fix tidy build This appears to be something tidy is wrong about. The suggestion of adding math to the struct initializers appears to not compile. Move the calculation of hysteresisTrigger and hysteresisPublish into the constructor body itself to avoid the warning. Change-Id: I833fd12966c69c0e081692d6d40ba0cf1805ead1 Signed-off-by: Ed Tanous <etanous@nvidia.com> /openbmc/dbus-sensors/.gitignore /openbmc/dbus-sensors/meson.build /openbmc/dbus-sensors/src/intel-cpu/IntelCPUSensor.cpp /openbmc/dbus-sensors/src/intel-cpu/IntelCPUSensor.hpp NvidiaGpuPowerSensor.cpp NvidiaGpuPowerSensor.hpp /openbmc/dbus-sensors/src/psu/PSUSensorMain.cpp /openbmc/dbus-sensors/src/sensor.hpp /openbmc/dbus-sensors/subprojects/boost-meson.wrap /openbmc/dbus-sensors/subprojects/packagefiles/boost/meson.build /openbmc/dbus-sensors/subprojects/packagefiles/boost/subprojects/boost.wrap
87a0745b	03-Sep-2025	Ed Tanous <etanous@nvidia.com>	Move Nvidia gpu tests These tests got caught in the refactor. Move these tests to the correct location. Change-Id: Ie8ec10e154d60cb4f24e1f45be36240863438f87 Signed-off-by: Ed Tanous <etanous@nvidia.com> meson.build /openbmc/dbus-sensors/src/tests/meson.build /openbmc/dbus-sensors/src/tests/test_NvidiaDeviceInventoryMctpVdm.cpp /openbmc/dbus-sensors/src/tests/test_NvidiaGpuSensorTest.cpp
6061bbcf	03-Sep-2025	Ed Tanous <etanous@nvidia.com>	Remove main Unit tests don't build if main is enabled. Change-Id: I4c7210b2a72032d6e15729b5ab5e4201739dd602 Signed-off-by: Ed Tanous <etanous@nvidia.com> tests/NvidiaGpuSensorTest.cpp