.. SPDX-License-Identifier: GPL-2.0

===================
ice devlink support
===================

This document describes the devlink features implemented by the ``ice``
device driver.

Info versions
=============

The ``ice`` driver reports the following versions

.. list-table:: devlink info versions implemented
    :widths: 5 5 5 90

    * - Name
      - Type
      - Example
      - Description
    * - ``board.id``
      - fixed
      - K65390-000
      - The Product Board Assembly (PBA) identifier of the board.
    * - ``fw.mgmt``
      - running
      - 2.1.7
      - 3-digit version number of the management firmware running on the
        Embedded Management Processor of the device. It controls the PHY,
        link, access to device resources, etc. Intel documentation refers to
        this as the EMP firmware.
    * - ``fw.mgmt.api``
      - running
      - 1.5.1
      - 3-digit version number (major.minor.patch) of the API exported over
        the AdminQ by the management firmware. Used by the driver to
        identify what commands are supported. Historical versions of the
        kernel only displayed a 2-digit version number (major.minor).
    * - ``fw.mgmt.build``
      - running
      - 0x305d955f
      - Unique identifier of the source for the management firmware.
    * - ``fw.undi``
      - running
      - 1.2581.0
      - Version of the Option ROM containing the UEFI driver. The version is
        reported in ``major.minor.patch`` format. The major version is
        incremented whenever a major breaking change occurs, or when the
        minor version would overflow. The minor version is incremented for
        non-breaking changes and reset to 1 when the major version is
        incremented. The patch version is normally 0 but is incremented when
        a fix is delivered as a patch against an older base Option ROM.
    * - ``fw.psid.api``
      - running
      - 0.80
      - Version defining the format of the flash contents.
    * - ``fw.bundle_id``
      - running
      - 0x80002ec0
      - Unique identifier of the firmware image file that was loaded onto
        the device. Also referred to as the EETRACK identifier of the NVM.
    * - ``fw.app.name``
      - running
      - ICE OS Default Package
      - The name of the DDP package that is active in the device. The DDP
        package is loaded by the driver during initialization. Each
        variation of the DDP package has a unique name.
    * - ``fw.app``
      - running
      - 1.3.1.0
      - The version of the DDP package that is active in the device. Note
        that both the name (as reported by ``fw.app.name``) and version are
        required to uniquely identify the package.
    * - ``fw.app.bundle_id``
      - running
      - 0xc0000001
      - Unique identifier for the DDP package loaded in the device. Also
        referred to as the DDP Track ID. Can be used to uniquely identify
        the specific DDP package.
    * - ``fw.netlist``
      - running
      - 1.1.2000-6.7.0
      - The version of the netlist module. This module defines the device's
        Ethernet capabilities and default settings, and is used by the
        management firmware as part of managing link and device
        connectivity.
    * - ``fw.netlist.build``
      - running
      - 0xee16ced7
      - The first 4 bytes of the hash of the netlist module contents.

Flash Update
============

The ``ice`` driver implements support for flash update using the
``devlink-flash`` interface. It supports updating the device flash using a
combined flash image that contains the ``fw.mgmt``, ``fw.undi``, and
``fw.netlist`` components.

.. list-table:: List of supported overwrite modes
    :widths: 5 95

    * - Bits
      - Behavior
    * - ``DEVLINK_FLASH_OVERWRITE_SETTINGS``
      - Do not preserve settings stored in the flash components being
        updated. This includes overwriting the port configuration that
        determines the number of physical functions the device will
        initialize with.
    * - ``DEVLINK_FLASH_OVERWRITE_SETTINGS`` and ``DEVLINK_FLASH_OVERWRITE_IDENTIFIERS``
      - Do not preserve either settings or identifiers. Overwrite everything
        in the flash with the contents from the provided image, without
        performing any preservation. This includes overwriting device
        identifying fields such as the MAC address, VPD area, and device
        serial number. It is expected that this combination be used with an
        image customized for the specific device.

The ice hardware does not support overwriting only identifiers while
preserving settings, and thus ``DEVLINK_FLASH_OVERWRITE_IDENTIFIERS`` on its
own will be rejected. If no overwrite mask is provided, the firmware will be
instructed to preserve all settings and identifying fields when updating.

Reload
======

The ``ice`` driver supports activating new firmware after a flash update
using ``DEVLINK_CMD_RELOAD`` with the ``DEVLINK_RELOAD_ACTION_FW_ACTIVATE``
action.

.. code:: shell

    $ devlink dev reload pci/0000:01:00.0 reload action fw_activate

The new firmware is activated by issuing a device specific Embedded
Management Processor reset which requests the device to reset and reload the
EMP firmware image.

The driver does not currently support reloading the driver via
``DEVLINK_RELOAD_ACTION_DRIVER_REINIT``.

Port split
==========

The ``ice`` driver supports port splitting only for port 0, as the FW has
a predefined set of available port split options for the whole device.

A system reboot is required for port split to be applied.

The following command will select the port split option with 4 ports:

.. code:: shell

    $ devlink port split pci/0000:16:00.0/0 count 4

The list of all available port options will be printed to dynamic debug after
each ``split`` and ``unsplit`` command. The first option is the default.

.. code:: shell

    ice 0000:16:00.0: Available port split options and max port speeds (Gbps):
    ice 0000:16:00.0: Status  Split      Quad 0          Quad 1
    ice 0000:16:00.0:         count  L0  L1  L2  L3  L4  L5  L6  L7
    ice 0000:16:00.0: Active  2     100   -   -   - 100   -   -   -
    ice 0000:16:00.0:         2      50   -  50   -   -   -   -   -
    ice 0000:16:00.0: Pending 4      25  25  25  25   -   -   -   -
    ice 0000:16:00.0:         4      25  25   -   -  25  25   -   -
    ice 0000:16:00.0:         8      10  10  10  10  10  10  10  10
    ice 0000:16:00.0:         1     100   -   -   -   -   -   -   -

There could be multiple FW port options with the same port split count. When
the same port split count request is issued again, the next FW port option with
the same port split count will be selected.

``devlink port unsplit`` will select the option with a split count of 1. If
there is no FW option available with split count 1, you will receive an error.

Regions
=======

The ``ice`` driver implements the following regions for accessing internal
device data.

.. list-table:: regions implemented
    :widths: 15 85

    * - Name
      - Description
    * - ``nvm-flash``
      - The contents of the entire flash chip, sometimes referred to as
        the device's Non Volatile Memory.
    * - ``shadow-ram``
      - The contents of the Shadow RAM, which is loaded from the beginning
        of the flash. Although the contents are primarily from the flash,
        this area also contains data generated during device boot which is
        not stored in flash.
    * - ``device-caps``
      - The contents of the device firmware's capabilities buffer. Useful to
        determine the current state and configuration of the device.

Users can request an immediate capture of a snapshot via the
``DEVLINK_CMD_REGION_NEW`` command:

.. code:: shell

    $ devlink region show
    pci/0000:01:00.0/nvm-flash: size 10485760 snapshot [] max 1
    pci/0000:01:00.0/device-caps: size 4096 snapshot [] max 10

    $ devlink region new pci/0000:01:00.0/nvm-flash snapshot 1

    $ devlink region dump pci/0000:01:00.0/nvm-flash snapshot 1
    0000000000000000 0014 95dc 0014 9514 0035 1670 0034 db30
    0000000000000010 0000 0000 ffff ff04 0029 8c00 0028 8cc8
    0000000000000020 0016 0bb8 0016 1720 0000 0000 c00f 3ffc
    0000000000000030 bada cce5 bada cce5 bada cce5 bada cce5

    $ devlink region read pci/0000:01:00.0/nvm-flash snapshot 1 address 0 length 16
    0000000000000000 0014 95dc 0014 9514 0035 1670 0034 db30

    $ devlink region delete pci/0000:01:00.0/nvm-flash snapshot 1

    $ devlink region new pci/0000:01:00.0/device-caps snapshot 1
    $ devlink region dump pci/0000:01:00.0/device-caps snapshot 1
    0000000000000000 01 00 01 00 00 00 00 00 01 00 00 00 00 00 00 00
    0000000000000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0000000000000020 02 00 02 01 32 03 00 00 0a 00 00 00 25 00 00 00
    0000000000000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0000000000000040 04 00 01 00 01 00 00 00 00 00 00 00 00 00 00 00
    0000000000000050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0000000000000060 05 00 01 00 03 00 00 00 00 00 00 00 00 00 00 00
    0000000000000070 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0000000000000080 06 00 01 00 01 00 00 00 00 00 00 00 00 00 00 00
    0000000000000090 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    00000000000000a0 08 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00
    00000000000000b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    00000000000000c0 12 00 01 00 01 00 00 00 01 00 01 00 00 00 00 00
    00000000000000d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    00000000000000e0 13 00 01 00 00 01 00 00 00 00 00 00 00 00 00 00
    00000000000000f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0000000000000100 14 00 01 00 01 00 00 00 00 00 00 00 00 00 00 00
    0000000000000110 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0000000000000120 15 00 01 00 01 00 00 00 00 00 00 00 00 00 00 00
    0000000000000130 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0000000000000140 16 00 01 00 01 00 00 00 00 00 00 00 00 00 00 00
    0000000000000150 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0000000000000160 17 00 01 00 06 00 00 00 00 00 00 00 00 00 00 00
    0000000000000170 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0000000000000180 18 00 01 00 01 00 00 00 01 00 00 00 08 00 00 00
    0000000000000190 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    00000000000001a0 22 00 01 00 01 00 00 00 00 00 00 00 00 00 00 00
    00000000000001b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    00000000000001c0 40 00 01 00 00 08 00 00 08 00 00 00 00 00 00 00
    00000000000001d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    00000000000001e0 41 00 01 00 00 08 00 00 00 00 00 00 00 00 00 00
    00000000000001f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0000000000000200 42 00 01 00 00 08 00 00 00 00 00 00 00 00 00 00
    0000000000000210 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

    $ devlink region delete pci/0000:01:00.0/device-caps snapshot 1

Devlink Rate
============

The ``ice`` driver implements the devlink-rate API, which allows
Hierarchical QoS to be offloaded to the hardware. Users can group Virtual
Functions in a tree structure and assign the supported parameters
(tx_share, tx_max, tx_priority and tx_weight) to each node in the tree.
This effectively lets the user control how much bandwidth is allocated to
each VF group. The allocation is then enforced by the HW.

It is assumed that this feature is mutually exclusive with DCB performed
in FW and ADQ, or any driver feature that would trigger changes in QoS,
for example creation of a new traffic class. The driver will prevent DCB
or ADQ configuration once the user has started making changes to the nodes
using the devlink-rate API. To configure those features, a driver reload is
necessary. Correspondingly, if ADQ or DCB is configured, the driver will
not export the hierarchy at all, or will remove the untouched hierarchy if
those features are enabled after the hierarchy is exported but before any
changes are made.

This feature is also dependent on switchdev being enabled in the system.
This is required because devlink-rate requires devlink-port objects to be
present, and those objects are only created in switchdev mode.

If the driver is set to switchdev mode, it will export the internal
hierarchy the moment VFs are created. The root of the tree is always
represented by node_0. This node can't be deleted by the user. Leaf
nodes and nodes with children also can't be deleted.

.. list-table:: Attributes supported
    :widths: 15 85

    * - Name
      - Description
    * - ``tx_max``
      - Maximum bandwidth to be consumed by the tree node. The rate limit is
        an absolute number specifying the maximum amount of bytes a node may
        consume during the course of one second. The rate limit guarantees
        that a link will not oversaturate the receiver on the remote end
        and also enforces an SLA between the subscriber and network
        provider.
    * - ``tx_share``
      - Minimum bandwidth allocated to a tree node when it is not blocked.
        It specifies an absolute BW. While tx_max defines the maximum
        bandwidth the node may consume, tx_share marks the committed BW
        for the node.
    * - ``tx_priority``
      - Allows for usage of a strict priority arbiter among siblings. This
        arbitration scheme attempts to schedule nodes based on their
        priority as long as the nodes remain within their bandwidth limit.
        Range 0-7. Nodes with priority 7 have the highest priority and are
        selected first, while nodes with priority 0 have the lowest
        priority. Nodes that have the same priority are treated equally.
    * - ``tx_weight``
      - Allows for usage of the Weighted Fair Queuing arbitration scheme
        among siblings. This arbitration scheme can be used simultaneously
        with the strict priority. Range 1-200. Only relative values matter
        for arbitration.

``tx_priority`` and ``tx_weight`` can be used simultaneously. In that case
nodes with the same priority form a WFQ subgroup in the sibling group
and arbitration among them is based on assigned weights.

.. code:: shell

    # enable switchdev
    $ devlink dev eswitch set pci/0000:4b:00.0 mode switchdev

    # at this point driver should export internal hierarchy
    $ echo 2 > /sys/class/net/ens785np0/device/sriov_numvfs

    $ devlink port function rate show
    pci/0000:4b:00.0/node_25: type node parent node_24
    pci/0000:4b:00.0/node_24: type node parent node_0
    pci/0000:4b:00.0/node_32: type node parent node_31
    pci/0000:4b:00.0/node_31: type node parent node_30
    pci/0000:4b:00.0/node_30: type node parent node_16
    pci/0000:4b:00.0/node_19: type node parent node_18
    pci/0000:4b:00.0/node_18: type node parent node_17
    pci/0000:4b:00.0/node_17: type node parent node_16
    pci/0000:4b:00.0/node_14: type node parent node_5
    pci/0000:4b:00.0/node_5: type node parent node_3
    pci/0000:4b:00.0/node_13: type node parent node_4
    pci/0000:4b:00.0/node_12: type node parent node_4
    pci/0000:4b:00.0/node_11: type node parent node_4
    pci/0000:4b:00.0/node_10: type node parent node_4
    pci/0000:4b:00.0/node_9: type node parent node_4
    pci/0000:4b:00.0/node_8: type node parent node_4
    pci/0000:4b:00.0/node_7: type node parent node_4
    pci/0000:4b:00.0/node_6: type node parent node_4
    pci/0000:4b:00.0/node_4: type node parent node_3
    pci/0000:4b:00.0/node_3: type node parent node_16
    pci/0000:4b:00.0/node_16: type node parent node_15
    pci/0000:4b:00.0/node_15: type node parent node_0
    pci/0000:4b:00.0/node_2: type node parent node_1
    pci/0000:4b:00.0/node_1: type node parent node_0
    pci/0000:4b:00.0/node_0: type node
    pci/0000:4b:00.0/1: type leaf parent node_25
    pci/0000:4b:00.0/2: type leaf parent node_25

    # let's create some custom node
    $ devlink port function rate add pci/0000:4b:00.0/node_custom parent node_0

    # second custom node
    $ devlink port function rate add pci/0000:4b:00.0/node_custom_1 parent node_custom

    # reassign second VF to newly created branch
    $ devlink port function rate set pci/0000:4b:00.0/2 parent node_custom_1

    # assign tx_weight to the VF
    $ devlink port function rate set pci/0000:4b:00.0/2 tx_weight 5

    # assign tx_share to the VF
    $ devlink port function rate set pci/0000:4b:00.0/2 tx_share 500Mbps
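
The exported hierarchy is printed as a flat list, so it can be hard to see
where a given VF sits in the tree. The following is a minimal sketch, not
part of the driver or of iproute2, that walks the ``parent`` links from a
leaf up to ``node_0``. The helper names ``rate_show_sample`` and
``path_to_root`` are hypothetical, and the sketch assumes the text format of
``devlink port function rate show`` matches the example output above. The
sample data is a subset of that output, embedded so the snippet runs without
hardware.

```shell
# Subset of the rate-show output from the example above.
rate_show_sample() {
    cat <<'EOF'
pci/0000:4b:00.0/node_25: type node parent node_24
pci/0000:4b:00.0/node_24: type node parent node_0
pci/0000:4b:00.0/node_0: type node
pci/0000:4b:00.0/1: type leaf parent node_25
pci/0000:4b:00.0/2: type leaf parent node_25
EOF
}

# path_to_root NAME: read rate-show formatted text on stdin and print NAME
# followed by each of its ancestors, up to the root of the tree.
# Hypothetical helper; assumes one object per line as in the sample.
path_to_root() {
    awk -v start="$1" '
        {
            name = $1
            sub(/:$/, "", name)       # drop the trailing colon
            sub(/^.*\//, "", name)    # drop the bus/address prefix
            parent[name] = ""
            for (i = 2; i < NF; i++)
                if ($i == "parent")
                    parent[name] = $(i + 1)
        }
        END {
            path = start
            node = start
            # Follow parent links until a node without a parent is reached.
            while (parent[node] != "") {
                node = parent[node]
                path = path " -> " node
            }
            print path
        }'
}

rate_show_sample | path_to_root 1
```

With the sample data this prints ``1 -> node_25 -> node_24 -> node_0``. On a
live system the same helper could be fed from
``devlink port function rate show`` instead of the embedded sample.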