| 6ef89739 | 21-Oct-2025 |
Ed Tanous <etanous@nvidia.com> |
nvidia-gpu: Use common class for mctp endpoints
The common endpoint class should be used for send and receive.
Tested: Used by MctpRequester, tested on nvl32-obmc
Change-Id: I0060a66a5bcb4decfbe663d46ba88529e01e2209 Signed-off-by: Ed Tanous <etanous@nvidia.com>
|
| 964057d1 | 17-Nov-2025 |
George Liu <liuxiwei@ieisystem.com> |
Remove redundant is_method_error() checks
The handlers registered through sdbusplus::bus::match_t only receive D-Bus signals. Signal messages are never sent as method-error replies, and therefore message.is_method_error() can never be true in these callbacks.
This change removes all unnecessary is_method_error() checks from signal handlers to simplify the code and avoid confusion.
Change-Id: I43e4a564c1bf401a5da9819dd201464e4a59c871 Signed-off-by: George Liu <liuxiwei@ieisystem.com>
|
| 33ba62c7 | 07-Nov-2025 |
Harshit Aghera <haghera@nvidia.com> |
request maintainer role for nvidia-gpu
I have actively contributed to and reviewed patches for nvidia-gpu application since its inception in May 2025. Additionally, I have contributed and reviewed phosphor-dbus-interfaces and bmcweb patches related to nvidia-gpu application.
Change-Id: I8eca227699b09c5cdb49495d5237a545c8609e86 Signed-off-by: Harshit Aghera <haghera@nvidia.com>
|
| 77239da5 | 24-Nov-2025 |
Ed Tanous <etanous@nvidia.com> |
Fix test build if nvidia-gpu is disabled
When nvidia-gpu is disabled, unit tests don't build because of the shared gpusensor_sources variable. Apply a quick fix so the build succeeds. Going forward, we may need to move the option checking into the sub meson file rather than the top level, so that unit test deps can build separately.
Change-Id: Ib87487fe15e80df44afbd9c3421163c6fbc16f74 Signed-off-by: Ed Tanous <etanous@nvidia.com>
|
| 064e6ff7 | 27-Oct-2025 |
Deepak Kodihalli <deepak.kodihalli.83@gmail.com> |
nvidia-gpu: fix GPU power PeakReading PDI usage
The GPU power peak reading, which uses the Telemetry.Report PDI, was relying on a string ("PeakReading") to expose the reading. This string is Redfish specific. Instead, use the OperationType.Maximum enum defined in the PDI. Bmcweb code can map this to PeakReading.
Tested: Build an image for nvl32-obmc machine with the following patches cherry picked.
https://gerrit.openbmc.org/c/openbmc/openbmc/+/85490 https://gerrit.openbmc.org/c/openbmc/bmcweb/+/82449.
The patch cherry-picks the following patches that are currently under review.
```
1. device tree https://lore.kernel.org/all/aRbLqH8pLWCQryhu@molberding.nvidia.com/
2. mctpd patches https://github.com/CodeConstruct/mctp/pull/85
3. u-boot changes https://lore.kernel.org/openbmc/20251121-msx4-v1-0-fc0118b666c1@nvidia.com/T/#t
4. kernel changes as specified in the openbmc patch (for espi)
5. entity-manager changes https://gerrit.openbmc.org/c/openbmc/entity-manager/+/85455
6. platform-init changes https://gerrit.openbmc.org/c/openbmc/platform-init/+/85456
7. spi changes https://lore.kernel.org/all/20251121-w25q01jv_fixup-v1-1-3d175050db73@nvidia.com/
```
The GPU Power PeakReading is correctly reported on DBus and on redfish.
Change-Id: I39b2b4987d845f878ffdedcfdb02cdfdc02a4499 Signed-off-by: Deepak Kodihalli <deepak.kodihalli.83@gmail.com> Signed-off-by: Harshit Aghera <haghera@nvidia.com>
|
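The idea in 064e6ff7 (publish a protocol-neutral enum on D-Bus and let the Redfish layer choose the Redfish name) can be sketched as below. This is only an illustration under stated assumptions, not the nvidia-gpu or bmcweb code: `OperationType` here is a local stand-in for the PDI enum (only `Maximum` is named in the commit message; the other enumerators are assumed), and `toRedfishProperty` is a hypothetical helper name.

```cpp
#include <cassert>
#include <optional>
#include <string_view>

// Local stand-in for the OperationType enum of the Telemetry.Report PDI.
// Only Maximum is confirmed by the commit message; the rest are assumptions.
enum class OperationType
{
    Maximum,
    Minimum,
    Average,
};

// Hypothetical Redfish-side helper: the D-Bus service publishes the enum,
// and only the Redfish layer knows the Redfish-specific property name.
inline std::optional<std::string_view> toRedfishProperty(OperationType op)
{
    switch (op)
    {
        case OperationType::Maximum:
            return std::string_view{"PeakReading"};
        default:
            // No Redfish mapping is assumed for the other operations.
            return std::nullopt;
    }
}

inline void usageExample()
{
    assert(toRedfishProperty(OperationType::Maximum) ==
           std::string_view{"PeakReading"});
}
```

Keeping the string out of the D-Bus service means a non-Redfish consumer of the same interface is not tied to Redfish naming.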
| e0b80e1e | 28-Aug-2025 |
Harshit Aghera <haghera@nvidia.com> |
nvidia-gpu: add support for ConnectX device
Add support to discover ConnectX devices and to populate PCIe interface properties using Phosphor DBus Interface xyz.openbmc_project.Inventory.Item.PCIeDevice.
The ConnectX device has an integrated PCIe switch. The patch uses the xyz.openbmc_project.Inventory.Item.PCIeSwitch PDI to define the PCIe switch resource.
Tested: Build an image for nvl32-obmc machine with the following patch cherry picked.
https://gerrit.openbmc.org/c/openbmc/openbmc/+/85490
The patch cherry-picks the following patches that are currently under review.
```
1. device tree https://lore.kernel.org/all/aRbLqH8pLWCQryhu@molberding.nvidia.com/
2. mctpd patches https://github.com/CodeConstruct/mctp/pull/85
3. u-boot changes https://lore.kernel.org/openbmc/20251121-msx4-v1-0-fc0118b666c1@nvidia.com/T/#t
4. kernel changes as specified in the openbmc patch (for espi)
5. entity-manager changes https://gerrit.openbmc.org/c/openbmc/entity-manager/+/85455
6. platform-init changes https://gerrit.openbmc.org/c/openbmc/platform-init/+/85456
7. spi changes https://lore.kernel.org/all/20251121-w25q01jv_fixup-v1-1-3d175050db73@nvidia.com/
```
```
root@nvl32-bmc:~# busctl tree xyz.openbmc_project.GpuSensor
`- /xyz
  `- /xyz/openbmc_project
    |- /xyz/openbmc_project/inventory
    | `- /xyz/openbmc_project/inventory/pcie_devices
    |   |- /xyz/openbmc_project/inventory/pcie_devices/Nvidia_ConnectX_0
    |   |- /xyz/openbmc_project/inventory/pcie_devices/Nvidia_ConnectX_1
    |   |- /xyz/openbmc_project/inventory/pcie_devices/Nvidia_ConnectX_2
    |   `- /xyz/openbmc_project/inventory/pcie_devices/Nvidia_ConnectX_3

root@nvl32-obmc:~# busctl introspect xyz.openbmc_project.GpuSensor /xyz/openbmc_project/inventory/pcie_devices/Nvidia_ConnectX_0
NAME TYPE SIGNATURE RESULT/VALUE FLAGS
org.freedesktop.DBus.Introspectable interface - - -
.Introspect method - s -
org.freedesktop.DBus.Peer interface - - -
.GetMachineId method - s -
.Ping method - - -
org.freedesktop.DBus.Properties interface - - -
.Get method ss v -
.GetAll method s a{sv} -
.Set method ssv - -
.PropertiesChanged signal sa{sv}as - -
xyz.openbmc_project.Inventory.Item.PCIeDevice interface - - -
.GenerationInUse property s "xyz.openbmc_project.Inventory.Item.P... emits-change
.GenerationSupported property s "xyz.openbmc_project.Inventory.Item.P... emits-change
.LanesInUse property u 8 emits-change
.MaxLanes property u 16 emits-change
xyz.openbmc_project.Inventory.Item.PCIeSwitch interface - - -

$ curl -s -k -u 'root:0penBmc' https://${bmc_ip}/redfish/v1/Systems/system/PCIeDevices/Nvidia_ConnectX_0
{
  "@odata.id": "/redfish/v1/Systems/system/PCIeDevices/Nvidia_ConnectX_0",
  "@odata.type": "#PCIeDevice.v1_19_0.PCIeDevice",
  "Id": "Nvidia_ConnectX_0",
  "Name": "PCIe Device",
  "PCIeFunctions": {
    "@odata.id": "/redfish/v1/Systems/system/PCIeDevices/Nvidia_ConnectX_0/PCIeFunctions"
  },
  "PCIeInterface": {
    "LanesInUse": 8,
    "MaxLanes": 16,
    "MaxPCIeType": "Gen5",
    "PCIeType": "Gen5"
  },
  "Status": {
    "Health": "OK",
    "State": "Enabled"
  }
}%
```
Change-Id: Id89ce8a298ebb16934e94efcb9ca4679f91a7b26 Signed-off-by: Harshit Aghera <haghera@nvidia.com>
|
| db74edb9 | 29-Sep-2025 |
Ed Tanous <etanous@nvidia.com> |
nvidia-gpu: move unused member
Move MaxMessageSize to where it's used
Change-Id: I6c45157e6e3e52672cab86c82af1ea45a3628d19 Signed-off-by: Ed Tanous <etanous@nvidia.com> |
| 779d84f0 | 29-Sep-2025 |
Ed Tanous <etanous@nvidia.com> |
nvidia-gpu: Declare send endpoint on stack
There's no reason to store this small class in between transactions. Just construct on stack as part of the send.
Tested: On last patchset in series
Change-Id: I00090942665f022bfa2552b9c31c7c3da000646b Signed-off-by: Ed Tanous <etanous@nvidia.com>
|
| b5e823f7 | 09-Oct-2025 |
Ed Tanous <ed@tanous.net> |
Change copyright to match linux foundation
We should use SPDX identifiers wherever possible for simplification.
Change-Id: If3a7bfe506d7fded64a3ac929cc643834b16303e Signed-off-by: Ed Tanous <etanous@nvidia.com>
|
| 3f6bc731 | 23-Jul-2025 |
Harshit Aghera <haghera@nvidia.com> |
nvidia-gpu: add TLimit sensor properties
Add support for DMTF Redfish properties ReadingBasis and Implementation for GPU TLimit sensor [1].
Property Implementation for TLimit is set to Synthesized because the GPU incorporates intelligent logic that determines the temperature delta from the first thermal management software slowdown event. TLimit is derived from other reported GPU sensors, such as HBM, Tavg, and others.
DBus Interface definition - https://gerrit.openbmc.org/c/openbmc/phosphor-dbus-interfaces/+/81658
Tested: Build an image for gb200nvl-obmc machine with the following patches cherry picked. These patches are needed to enable the mctp stack.
https://gerrit.openbmc.org/c/openbmc/openbmc/+/79422
```
> curl -s -k -u 'root:0penBmc' https://10.137.203.137/redfish/v1/Chassis/NVIDIA_GB200_1/Sensors/temperature_NVIDIA_GB200_GPU_0_TEMP_1
{
  "@odata.id": "/redfish/v1/Chassis/NVIDIA_GB200_1/Sensors/temperature_NVIDIA_GB200_GPU_0_TEMP_1",
  "@odata.type": "#Sensor.v1_2_0.Sensor",
  "Description": "Thermal Limit(TLIMIT) Temperature is the distance in deg C from the GPU temperature to the first throttle limit.",
  "Id": "temperature_NVIDIA_GB200_GPU_0_TEMP_1",
  "Implementation": "Synthesized",
  "Name": "NVIDIA GB200 GPU 0 TEMP 1",
  "Reading": 56.59375,
  "ReadingBasis": "Headroom",
  "ReadingRangeMax": 127.0,
  "ReadingRangeMin": -128.0,
  "ReadingType": "Temperature",
  "ReadingUnits": "Cel",
  "Status": {
    "Health": "OK",
    "State": "Enabled"
  }
}%

root@gb200nvl-obmc:~# busctl introspect xyz.openbmc_project.GpuSensor /xyz/openbmc_project/sensors/temperature/NVIDIA_GB200_GPU_0_TEMP_1
NAME TYPE SIGNATURE RESULT/VALUE FLAGS
org.freedesktop.DBus.Introspectable interface - - -
.Introspect method - s -
org.freedesktop.DBus.Peer interface - - -
.GetMachineId method - s -
.Ping method - - -
org.freedesktop.DBus.Properties interface - - -
.Get method ss v -
.GetAll method s a{sv} -
.Set method ssv - -
.PropertiesChanged signal sa{sv}as - -
xyz.openbmc_project.Association.Definitions interface - - -
.Associations property a(sss) 1 "chassis" "all_sensors" "/xyz/openb... emits-change
xyz.openbmc_project.Inventory.Item interface - - -
.PrettyName property s "Thermal Limit(TLIMIT) Temperature is... emits-change
xyz.openbmc_project.Sensor.Type interface - - -
.Implementation property s "xyz.openbmc_project.Sensor.Type.Impl... emits-change
.ReadingBasis property s "xyz.openbmc_project.Sensor.Type.Read... emits-change
xyz.openbmc_project.Sensor.Value interface - - -
.MaxValue property d 127 emits-change
.MinValue property d -128 emits-change
.Unit property s "xyz.openbmc_project.Sensor.Value.Uni... emits-change
.Value property d 56.6836 emits-change writable
xyz.openbmc_project.Sensor.ValueMutability interface - - -
.Mutable property b true emits-change
xyz.openbmc_project.State.Decorator.Availability interface - - -
.Available property b true emits-change writable
xyz.openbmc_project.State.Decorator.OperationalStatus interface - - -
.Functional property b true emits-change
```
[1] : https://redfish.dmtf.org/schemas/v1/Sensor.v1_11_0.yaml
Change-Id: I1a16ced44c563794d561d26232a5e5fba041b875 Signed-off-by: Harshit Aghera <haghera@nvidia.com>
|
| 1851f645 | 29-Sep-2025 |
Marc Olberding <molberding@nvidia.com> |
nvidia-gpu: Fix thresholds for GPU_TEMP_1
Fixes the thresholds for GPU_TEMP_1 to be upper critical, warning, and shutdown rather than lower critical, et al.
Change-Id: I580766288f3d27a48c75f00ea1dab13f0284bed6 Signed-off-by: Marc Olberding <molberding@nvidia.com>
|
| fd4a3779 | 24-Sep-2025 |
Marc Olberding <molberding@nvidia.com> |
nvidia-gpu: Fix a number of object lifetime issues
Moves all subsensors and objects treated as shared_ptrs to use shared_from_this. This way, if there's an object lifetime issue, we don't segfault.
Also separates construction and asio init for NvidiaSmaDevice so that when we bind to this, it's valid after we leave the ctor.
Change-Id: I8e3115bc276d2e0eaac0b1dc9a9d2c46e6751d4b Signed-off-by: Marc Olberding <molberding@nvidia.com>
|
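The lifetime pattern behind fd4a3779 can be sketched with the standard library alone. Everything named here is hypothetical: `FakeExecutor` stands in for an asio-style executor, `SubSensor` for a subsensor, and the sketch captures `weak_from_this()` (the weak counterpart of `shared_from_this()`), which is an assumption about how a callback might guard itself rather than a copy of the actual change.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for an asio-style executor: it stores callbacks and
// runs them later, possibly after the caller has dropped its reference.
class FakeExecutor
{
  public:
    void post(std::function<void()> fn)
    {
        pending.push_back(std::move(fn));
    }

    void run()
    {
        // Swap the queue out first so callbacks may post more work.
        std::vector<std::function<void()>> work;
        work.swap(pending);
        for (auto& fn : work)
        {
            fn();
        }
    }

  private:
    std::vector<std::function<void()>> pending;
};

// Hypothetical subsensor; class and member names are illustrative only.
class SubSensor : public std::enable_shared_from_this<SubSensor>
{
  public:
    explicit SubSensor(FakeExecutor& executorIn) : executor(executorIn) {}

    void scheduleUpdate()
    {
        // Capture a weak_ptr rather than a raw `this`. If the sensor is gone
        // when the callback runs, lock() returns nullptr and the callback is
        // a no-op instead of a use after free.
        executor.post([weak = weak_from_this()] {
            if (auto self = weak.lock())
            {
                ++self->updates;
            }
        });
    }

    int updates = 0;

  private:
    FakeExecutor& executor;
};

inline void usageExample()
{
    FakeExecutor executor;
    auto sensor = std::make_shared<SubSensor>(executor);
    sensor->scheduleUpdate();
    executor.run();
    assert(sensor->updates == 1);

    sensor->scheduleUpdate();
    sensor.reset(); // sensor destroyed while a callback is still queued
    executor.run(); // the callback sees an expired weak_ptr and does nothing
}
```

Capturing `shared_from_this()` instead would keep the object alive until the callback has run; which of the two fits depends on whether pending work should extend the object's lifetime.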
| 6282a452 | 29-Sep-2025 |
Marc Olberding <molberding@nvidia.com> |
nvidia-gpu: NvidiaGpuDevice fix use after free
Fixes a use after free for NvidiaGpuThresholds. Moves the storage used for communication to be part of the NvidiaGpuDevice class instead of being ephemerally passed around through free functions.
Also makes NvidiaGpuDevice inherit from std::enable_shared_from_this.
Testing: The issue found previously was core dumps on nvl32-obmc. ASan discovered it was a use after free in the shared pointer in ThermalLimits.
Afterwards, no core dumps or issues were reported by ASan. Ran on an nvl32-obmc model with 8 GPUs.
Change-Id: I61b606f3a129499089718e7ec804926db5f22c64 Signed-off-by: Marc Olberding <molberding@nvidia.com>
|
| ac920734 | 28-Sep-2025 |
Marc Olberding <molberding@nvidia.com> |
nvidia-gpu: deferred init for NvidiaGpuDevice
Adds deferred init for NvidiaGpuDevice, so that when we bind to this, the this pointer is valid, i.e. after construction is completed.
Change-Id: I24a53d2ab9be1a2a4431368414a154b48347d2a2 Signed-off-by: Marc Olberding <molberding@nvidia.com>
|
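Why ac920734 defers init can be shown with a small two-phase construction sketch. The `Device` class, its `create()` factory, and the accessor names are hypothetical and not taken from NvidiaGpuDevice; the only facts relied on are standard-library behaviour of `std::enable_shared_from_this`.

```cpp
#include <cassert>
#include <memory>

// Hypothetical device class; the real NvidiaGpuDevice does far more.
class Device : public std::enable_shared_from_this<Device>
{
  public:
    // Two-phase construction: construct first, then call init() once a
    // shared_ptr owns the object.
    static std::shared_ptr<Device> create()
    {
        std::shared_ptr<Device> device(new Device());
        device->init();
        return device;
    }

    bool ownedDuringCtor() const
    {
        return ownedInCtor;
    }

    bool ownedDuringInit() const
    {
        return ownedInInit;
    }

  private:
    // Inside the constructor no shared_ptr owns the object yet, so
    // weak_from_this() yields an empty weak_ptr (and shared_from_this()
    // would throw std::bad_weak_ptr).
    Device() : ownedInCtor(!weak_from_this().expired()) {}

    // By the time init() runs the owning shared_ptr exists, so callbacks can
    // be bound to shared_from_this() or weak_from_this() safely.
    void init()
    {
        ownedInInit = !weak_from_this().expired();
    }

    bool ownedInCtor;
    bool ownedInInit = false;
};

inline void usageExample()
{
    auto device = Device::create();
    assert(!device->ownedDuringCtor());
    assert(device->ownedDuringInit());
}
```

Making the constructor private and routing creation through the factory means `init()` cannot be skipped or run too early.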
| d0125c9c | 08-Oct-2025 |
Marc Olberding <molberding@nvidia.com> |
nvidia-gpu: Fix up buffering in MctpRequester
This change does a lot, for better or worse:
1. Change MctpRequester to hold both buffers for send and receive.
2. This requires changing the callback structure, so the reach is far.
3. Changes error reporting to be through std::error_code.
4. Collapses the QueuingRequester and Requester into MctpRequester.
5. Doing 4 gets rid of a level of indirection and an extra unordered_map.
6. Adds proper iid support, which is made significantly easier by 4/5.
7. Fixes issues around expiry timers where we would cancel the timer for a given request whenever a new packet would come in to be sent. This could cause a lockup if a packet truly did time out and an interleaved packet finished sending. This moves each queue to have its own timer.
This fixes an issue where we were receiving buffers from clients and then binding them to receive calls without ensuring that they were for the correct message; thus, when receive was called, it was called with the last buffer bound to async_receive_from. This would cause a number of issues, ranging from incorrect device discovery results to core dumps as well as incorrect sensor readings.
This change moves the receive and send buffers to be owned by the MctpRequester, and a non-owning view is provided via callback to the client. All existing clients just decode in place given that buffer.
Tested: loaded onto nvl32-obmc. Correct number of sensors showed up and the readings were nominal
Change-Id: I67c843691ca79e9fcccfa16df6d611918f25f6ca Signed-off-by: Marc Olberding <molberding@nvidia.com>
|
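The ownership model described in d0125c9c (the requester owns the buffers, clients get a non-owning view through the callback, errors travel as `std::error_code`) can be sketched as follows. This is a heavily simplified illustration with no sockets, queues, or timers; `Requester`, `ByteView`, `receiveBuffer`, and `completeReceive` are hypothetical names, and the 64-byte buffer size is an arbitrary assumption.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <system_error>

// Non-owning view of bytes. With C++20 available, std::span<const uint8_t>
// would be the natural choice; a small struct keeps the sketch portable.
struct ByteView
{
    const std::uint8_t* data = nullptr;
    std::size_t size = 0;
};

// Hypothetical, heavily simplified requester.
class Requester
{
  public:
    using Callback = std::function<void(const std::error_code&, ByteView)>;

    // Lets the sketch fill the owned buffer as if a packet had arrived.
    std::array<std::uint8_t, 64>& receiveBuffer()
    {
        return rxBuffer;
    }

    // Simulates completion of a receive of `length` bytes. On success the
    // client gets a view into the requester-owned buffer, which should only
    // be used during the callback; on failure it gets an error code and an
    // empty view.
    void completeReceive(std::size_t length, const Callback& callback)
    {
        if (length > rxBuffer.size())
        {
            callback(std::make_error_code(std::errc::message_size),
                     ByteView{});
            return;
        }
        callback(std::error_code{}, ByteView{rxBuffer.data(), length});
    }

  private:
    std::array<std::uint8_t, 64> rxBuffer{};
};

inline void usageExample()
{
    Requester requester;
    requester.receiveBuffer()[0] = 0x7E;
    requester.completeReceive(
        1, [](const std::error_code& ec, ByteView view) {
            assert(!ec);
            assert(view.size == 1 && view.data[0] == 0x7E);
        });
}
```

Because the client never supplies its own buffer, a response cannot be written into storage that was bound for a different request, which is the failure mode the commit message describes.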
| 6b712322 | 31-Jul-2025 |
Harshit Aghera <haghera@nvidia.com> |
nvidia-gpu: add Power Sensor PeakReading Property
Add support for Sensor Properties PeakReading and PeakReadingTime.
Current Limitation - The ResetMetrics action is currently not supported for Redfish URIs in bmcweb. As a result, the ability to clear PeakReading values for GPU Power Sensors has not been implemented.
Future Consideration - If ResetMetrics action support is added to bmcweb in the future, the corresponding functionality will also need to be implemented in the dbus-sensor application to ensure full compatibility.
Tested: Build an image for gb200nvl-obmc machine with the following patches cherry picked. These patches are needed to enable the mctp stack.
https://gerrit.openbmc.org/c/openbmc/openbmc/+/79422
```
root@gb200nvl-obmc:~# busctl introspect xyz.openbmc_project.GpuSensor /xyz/openbmc_project/sensors/power/NVIDIA_GB200_GPU_0_Power_0
NAME TYPE SIGNATURE RESULT/VALUE FLAGS
org.freedesktop.DBus.Introspectable interface - - -
.Introspect method - s -
org.freedesktop.DBus.Peer interface - - -
.GetMachineId method - s -
.Ping method - - -
org.freedesktop.DBus.Properties interface - - -
.Get method ss v -
.GetAll method s a{sv} -
.Set method ssv - -
.PropertiesChanged signal sa{sv}as - -
xyz.openbmc_project.Association.Definitions interface - - -
.Associations property a(sss) 1 "chassis" "all_sensors" "/xyz/openb... emits-change
xyz.openbmc_project.Sensor.Value interface - - -
.MaxValue property d 5000 emits-change
.MinValue property d 0 emits-change
.Unit property s "xyz.openbmc_project.Sensor.Value.Uni... emits-change
.Value property d 29.194 emits-change writable
xyz.openbmc_project.Sensor.ValueMutability interface - - -
.Mutable property b true emits-change
xyz.openbmc_project.State.Decorator.Availability interface - - -
.Available property b true emits-change writable
xyz.openbmc_project.State.Decorator.OperationalStatus interface - - -
.Functional property b true emits-change
xyz.openbmc_project.Telemetry.Report interface - - -
.Readings property (ta(ssdt)) 0 1 "PeakReading" "" 80.933 0 emits-change
```
Change-Id: I0a4f7eb0a5db688f32bf80954839140da9bb7e2a Signed-off-by: Harshit Aghera <haghera@nvidia.com>
|
| aba6fcac | 29-Sep-2025 |
Ed Tanous <etanous@nvidia.com> |
Fix tidy build
This appears to be something tidy is wrong about. The suggestion of adding math to the struct initializers appears to not compile.
Move the calculation of hysteresisTrigger and hysteresisPublish into the constructor body itself to avoid the warning.
Change-Id: I833fd12966c69c0e081692d6d40ba0cf1805ead1 Signed-off-by: Ed Tanous <etanous@nvidia.com>
|
| 87a0745b | 03-Sep-2025 |
Ed Tanous <etanous@nvidia.com> |
Move Nvidia gpu tests
These tests got caught in the refactor. Move these tests to the correct location.
Change-Id: Ie8ec10e154d60cb4f24e1f45be36240863438f87 Signed-off-by: Ed Tanous <etanous@nvidia.com>
|
| 6061bbcf | 03-Sep-2025 |
Ed Tanous <etanous@nvidia.com> |
Remove main
Unit tests don't build if main is enabled.
Change-Id: I4c7210b2a72032d6e15729b5ab5e4201739dd602 Signed-off-by: Ed Tanous <etanous@nvidia.com> |
| 271e075a | 03-Sep-2025 |
Ed Tanous <etanous@nvidia.com> |
Fix unit test warnings
These result in unsigned to signed comparisons that gtest and gcc in some configurations warn on. Fix the literals to be unsigned.
Change-Id: I63e522ddefb4bf3a97c1e7b2f3c48b159f03ae1f Signed-off-by: Ed Tanous <etanous@nvidia.com>
|
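The kind of warning fixed in 271e075a can be reproduced without gtest. The sketch below assumes a helper, `expectEq`, that compares two arguments of independently deduced types, which is roughly how a test-framework equality check sees a plain `3` as `int`; `expectEq` and `encodeExample` are hypothetical names, and the three-byte payload is made up.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Both argument types are deduced, so a plain `3` arrives as int and
// `a == b` becomes a signed/unsigned comparison that -Wsign-compare can
// report when the other side is a size_t.
template <typename A, typename B>
bool expectEq(const A& a, const B& b)
{
    return a == b;
}

// Hypothetical encoder returning a three-byte message.
inline std::vector<std::uint8_t> encodeExample()
{
    return {0x01, 0x02, 0x03};
}

inline void usageExample()
{
    const std::vector<std::uint8_t> msg = encodeExample();
    // expectEq(msg.size(), 3) would compare size_t with int.
    // Writing the literal as unsigned keeps both sides unsigned.
    assert(expectEq(msg.size(), 3U));
}
```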
| 0ad57100 | 13-Jun-2025 |
Rohit PAI <ropai@nvidia.com> |
Nvidia-Gpu: Support to fetch model,revision inventory properties
Add capability to fetch model and revision from Nvidia GPU devices using GET inventory command
Tested: Able to get model and revision info from the GPU
```
busctl introspect xyz.openbmc_project.GpuSensor /xyz/openbmc_project/inventory/NVIDIA_GB200_GPU_0
NAME TYPE SIGNATURE RESULT/VALUE FLAGS
org.freedesktop.DBus.Introspectable interface - - -
.Introspect method - s -
org.freedesktop.DBus.Peer interface - - -
.GetMachineId method - s -
.Ping method - - -
org.freedesktop.DBus.Properties interface - - -
.Get method ss v -
.GetAll method s a{sv} -
.Set method ssv - -
.PropertiesChanged signal sa{sv}as - -
xyz.openbmc_project.Common.UUID interface - - -
.UUID property s "6c13dc0f-ec0c-1fc9-db63-4d9f1053b5ef" emits-change
xyz.openbmc_project.Inventory.Decorator.Asset interface - - -
.Manufacturer property s "NVIDIA" emits-change
.Model property s "RTXPRO6000BlackwellDC" emits-change
.PartNumber property s "B40GPU" emits-change
.SerialNumber property s "1641425000136" emits-change
xyz.openbmc_project.Inventory.Decorator.Revision interface - - -
.Version property s "2BB5-895-A1" emits-change
xyz.openbmc_project.Inventory.Item.Accelerator interface - - -
.Type property s "GPU" emits-change
```
Change-Id: Ib0870f1680687272e58c49726618aeee332e3d4a Signed-off-by: Rohit PAI <ropai@nvidia.com>
|
| fb64f063 | 13-Jun-2025 |
Rohit PAI <ropai@nvidia.com> |
Nvidia-GPU: Add UUID support for GPU device
Support for fetching UUID information using Get Inventory command
Tested - Able to get UUID info from the GPU device and populate that on Dbus
```
busctl introspect xyz.openbmc_project.GpuSensor /xyz/openbmc_project/inventory/NVIDIA_GB200_GPU_0
NAME TYPE SIGNATURE RESULT/VALUE FLAGS
org.freedesktop.DBus.Introspectable interface - - -
.Introspect method - s -
org.freedesktop.DBus.Peer interface - - -
.GetMachineId method - s -
.Ping method - - -
org.freedesktop.DBus.Properties interface - - -
.Get method ss v -
.GetAll method s a{sv} -
.Set method ssv - -
.PropertiesChanged signal sa{sv}as - -
xyz.openbmc_project.Common.UUID interface - - -
.UUID property s "6c13dc0f-ec0c-1fc9-db63-4d9f1053b5ef" emits-change
xyz.openbmc_project.Inventory.Decorator.Asset interface - - -
.PartNumber property s "B40GPU" emits-change
.SerialNumber property s "1641425000136" emits-change
xyz.openbmc_project.Inventory.Item.Accelerator interface - - -
.Type property s "GPU" emits-change
```
Change-Id: I4600e85b3bf00e68032bb2b960cb803a76f6af96 Signed-off-by: Rohit PAI <ropai@nvidia.com>
|
| ada6baa9 | 01-Jul-2025 |
Rohit PAI <ropai@nvidia.com> |
Nvidia-Gpu: Support for Nvidia GPU Serial Number, Part Number
Support for fetching the serial number and part number is added to the inventory class, which uses the Get Inventory command. Currently we have a retry policy of 3 retries to account for any failures to get a response from the GPU device.
Tested - Able to get Serial Number, Part Number updated from the GPU device
```
busctl introspect xyz.openbmc_project.GpuSensor /xyz/openbmc_project/inventory/NVIDIA_GB200_GPU_0
NAME TYPE SIGNATURE RESULT/VALUE FLAGS
org.freedesktop.DBus.Introspectable interface - - -
.Introspect method - s -
org.freedesktop.DBus.Peer interface - - -
.GetMachineId method - s -
.Ping method - - -
org.freedesktop.DBus.Properties interface - - -
.Get method ss v -
.GetAll method s a{sv} -
.Set method ssv - -
.PropertiesChanged signal sa{sv}as - -
xyz.openbmc_project.Inventory.Decorator.Asset interface - - -
.PartNumber property s "699-2G153-0210-TS1" emits-change
.SerialNumber property s "1330325220002" emits-change
xyz.openbmc_project.Inventory.Item.Accelerator interface - - -
.Type property s "GPU" emits-change
```
Change-Id: Id2b33a66ff6d5480f8e229fa233528afc0bdcfc0 Signed-off-by: Rohit PAI <ropai@nvidia.com>
|
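The retry policy mentioned in ada6baa9 might look roughly like the sketch below. It is synchronous for brevity, whereas the real inventory fetch is presumably asynchronous; `fetchWithRetries` is a hypothetical helper, and reading "3 retries" as one initial attempt plus up to three more is an assumption.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>

// Hypothetical helper: calls `fetch` until it yields a value or the retry
// budget is spent. Assumes one initial attempt plus up to maxRetries retries.
inline std::optional<std::string> fetchWithRetries(
    const std::function<std::optional<std::string>()>& fetch,
    int maxRetries = 3)
{
    for (int attempt = 0; attempt <= maxRetries; ++attempt)
    {
        if (auto value = fetch())
        {
            return value;
        }
    }
    // Every attempt failed; the caller decides how to report that.
    return std::nullopt;
}

inline void usageExample()
{
    int calls = 0;
    // Fails on the first call, succeeds on the second.
    auto flaky = [&calls]() -> std::optional<std::string> {
        ++calls;
        if (calls < 2)
        {
            return std::nullopt;
        }
        return std::string{"SN-1"};
    };
    const auto serial = fetchWithRetries(flaky);
    assert(serial.has_value() && *serial == "SN-1");
    assert(calls == 2);
}
```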
| 0a88826f | 10-Jun-2025 |
Rohit PAI <ropai@nvidia.com> |
Nvidia-gpu: Create GPU Inventory device
The GPU device class implements the Item.Accelerator interface so that it is identified as a GPU device. This will be used in Redfish to populate the GPU processor schema.
Tested -
```
root@gb200nvl-obmc:~# busctl introspect xyz.openbmc_project.GpuSensor /xyz/openbmc_project/inventory/NVIDIA_GB200_GPU_0
NAME TYPE SIGNATURE RESULT/VALUE FLAGS
org.freedesktop.DBus.Introspectable interface - - -
.Introspect method - s -
org.freedesktop.DBus.Peer interface - - -
.GetMachineId method - s -
.Ping method - - -
org.freedesktop.DBus.Properties interface - - -
.Get method ss v -
.GetAll method s a{sv} -
.Set method ssv - -
.PropertiesChanged signal sa{sv}as - -
xyz.openbmc_project.Inventory.Item.Accelerator interface - - -
.Type property s "GPU" emits-change
```
Change-Id: I20434529860cb37889e63651bbcd97cadfa9d54e Signed-off-by: Rohit PAI <ropai@nvidia.com>
|
| 86786b6c | 09-Jun-2025 |
Rohit PAI <ropai@nvidia.com> |
Nvidia-gpu: Encode/decode APIs for GPU Inventory
Added support for encoding and decoding Get Inventory command. The command supports fetching inventory properties including Serial Number, Part Number, Marketing Name, etc for Nvidia GPUs
Tested - added new UT for encode and decode APIs which pass
Change-Id: I9e5afbe356b64fd7ae4f7a2a65043f3eeffa3807 Signed-off-by: Rohit PAI <ropai@nvidia.com>
|