History log of /openbmc/dbus-sensors/src/nvidia-gpu/NvidiaDeviceDiscovery.hpp (Results 1 – 2 of 2)
Revision Date Author Comments
# 8951c87e 25-Jun-2025 Harshit Aghera <haghera@nvidia.com>

nvidia-gpu: add SMA Temperature Sensor

Add support for device type SMA (System Management Agent) and its
temperature sensor. It is typically an MCU device.

Tested: Built an image for the gb200nvl-obmc machine with the following
patches cherry-picked. These patches are needed to enable the MCTP stack.

https://gerrit.openbmc.org/c/openbmc/openbmc/+/79422

```
$ curl -s -k -u 'root:0penBmc' https://10.137.203.193/redfish/v1/Chassis/NVIDIA_GB200_1/Sensors/temperature_NVIDIA_GB200_GPU_SMA_255_TEMP_0
{
    "@odata.id": "/redfish/v1/Chassis/NVIDIA_GB200_1/Sensors/temperature_NVIDIA_GB200_GPU_SMA_255_TEMP_0",
    "@odata.type": "#Sensor.v1_2_0.Sensor",
    "Id": "temperature_NVIDIA_GB200_GPU_SMA_255_TEMP_0",
    "Name": "NVIDIA GB200 GPU SMA 255 TEMP 0",
    "Reading": 34.0,
    "ReadingRangeMax": 127.0,
    "ReadingRangeMin": -128.0,
    "ReadingType": "Temperature",
    "ReadingUnits": "Cel",
    "Status": {
        "Health": "OK",
        "State": "Enabled"
    }
}
```

Change-Id: I560864758036a5b6ea6c1745145736c7bfa0a1c5
Signed-off-by: Harshit Aghera <haghera@nvidia.com>


# 4ecdfaaa 22-May-2025 Harshit Aghera <haghera@nvidia.com>

nvidia-gpu: introduce notion of a device

Perform device discovery tasks only once per device to prepare for
introducing additional gpu sensors.

In the current implementation, sensor updates and device discovery via
MCTP are managed within a single class for simplicity. However, since a
GPU device typically includes multiple sensors, performing device
discovery for each individual sensor is inefficient. Instead, it would
be more effective to execute device discovery once per device.
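
The split described above can be illustrated with a minimal sketch. This is
not the code from the commit: `Device`, `Sensor`, `init`, `getSensors`, and
the injected `discover` callback are hypothetical names chosen for
illustration, and the callback merely stands in for whatever MCTP discovery
the real implementation performs.

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sensor: it only carries what it needs to read a value.
// It does not run discovery itself.
struct Sensor
{
    std::string name;
    std::uint8_t eid; // endpoint ID handed down by the owning device
};

// Hypothetical device: runs discovery once, then creates all of its
// sensors with the result.
class Device
{
  public:
    // discoverFn is an assumed stand-in for MCTP endpoint discovery.
    explicit Device(std::function<std::uint8_t()> discoverFn) :
        discover(std::move(discoverFn))
    {}

    void init(const std::vector<std::string>& sensorNames)
    {
        // One discovery call per device, regardless of sensor count.
        const std::uint8_t eid = discover();
        for (const auto& sensorName : sensorNames)
        {
            sensors.push_back(Sensor{sensorName, eid});
        }
    }

    const std::vector<Sensor>& getSensors() const
    {
        return sensors;
    }

  private:
    std::function<std::uint8_t()> discover;
    std::vector<Sensor> sensors;
};
```

With this shape, adding a further sensor to a device means passing one more
name to `init`; the discovery callback still runs only once.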

Tested: Built an image for the gb200nvl-obmc machine with the following
patches cherry-picked. These patches are needed to enable the MCTP stack.
https://gerrit.openbmc.org/c/openbmc/openbmc/+/79422

```
$ curl -k -u 'root:0penBmc' https://10.137.203.137/redfish/v1/Chassis/NVIDIA_GB200_1/Sensors/temperature_NVIDIA_GB200_GPU_0_TEMP_0
{
    "@odata.id": "/redfish/v1/Chassis/NVIDIA_GB200_1/Sensors/temperature_NVIDIA_GB200_GPU_0_TEMP_0",
    "@odata.type": "#Sensor.v1_2_0.Sensor",
    "Id": "temperature_NVIDIA_GB200_GPU_0_TEMP_0",
    "Name": "NVIDIA GB200 GPU 0 TEMP 0",
    "Reading": 37.6875,
    "ReadingRangeMax": 127.0,
    "ReadingRangeMin": -128.0,
    "ReadingType": "Temperature",
    "ReadingUnits": "Cel",
    "Status": {
        "Health": "OK",
        "State": "Enabled"
    }
}
```

Change-Id: Ie3dcd43caa031b4aaa61d8be3f5d71aefd53bc9a
Signed-off-by: Harshit Aghera <haghera@nvidia.com>
