.. SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
.. include:: <isonum.txt>

================
Ethtool counters
================

:Copyright: |copy| 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Contents
========

- `Overview`_
- `Groups`_
- `Types`_
- `Descriptions`_

Overview
========

There are several counter groups based on where the counter is being counted. In
addition, each group of counters may have different counter types.

These counter groups are based on which component in the networking setup
illustrated below they describe::

                                                  ----------------------------------------
                                                  |                                      |
    ----------------------------------------    ---------------------------------------- |
    |              Hypervisor              |    |                  VM                  | |
    |                                      |    |                                      | |
    | -------------------  --------------- |    | -------------------  --------------- | |
    | | Ethernet driver |  | RDMA driver | |    | | Ethernet driver |  | RDMA driver | | |
    | -------------------  --------------- |    | -------------------  --------------- | |
    |          |                 |         |    |          |                 |         | |
    |          -------------------         |    |          -------------------         | |
    |                   |                  |    |                   |                  |--
    ----------------------------------------    ----------------------------------------
                        |                                           |
            -------------               -----------------------------
            |                           |
         ------                      ------ ------ ------         ------       ------       ------
    -----| PF |----------------------| VF |-| VF |-| VF |-----  --| PF |---  --| PF |---  --| PF |---
    |    ------                      ------ ------ ------    |  | ------  |  | ------  |  | ------  |
    |                                                        |  |         |  |         |  |         |
    |                                                        |  |         |  |         |  |         |
    |                                                        |  |         |  |         |  |         |
    | eSwitch                                                |  | eSwitch |  | eSwitch |  | eSwitch |
    ----------------------------------------------------------  -----------  -----------  -----------
             -------------------------------------------------------------------------------
             |                                                                             |
             |                                                                             |
             |                            Uplink (no counters)                             |
             -------------------------------------------------------------------------------
                     ---------------------------------------------------------------
                     |                                                             |
                     |                                                             |
                     |                     MPFS (no counters)                      |
                     ---------------------------------------------------------------
                                                    |
                                                    |
                                                    | Port

Groups
======

Ring
  Software counters populated by the driver stack.

Netdev
  An aggregation of software ring counters.

vPort counters
  Traffic counters and drops due to steering or no buffers. May indicate issues
  with NIC. These counters include Ethernet traffic counters (including Raw
  Ethernet) and RDMA/RoCE traffic counters.

Physical port counters
  Counters that collect statistics about the PFs and VFs. May indicate issues
  with NIC, link, or network. This measuring point holds information on
  standardized counters like IEEE 802.3, RFC 2863, RFC 2819, RFC 3635 and
  additional counters like flow control, FEC and more. Physical port counters
  are not exposed to virtual machines.

Priority Port Counters
  A set of the physical port counters, per priority per port.

Types
=====

Counters are divided into three types.

Traffic Informative Counters
  Counters which count traffic. These counters can be used for load estimation
  or for general debug.

Traffic Acceleration Counters
  Counters which count traffic that was accelerated by the Mellanox driver or by
  hardware. The counters are an additional layer to the informative counter set,
  and the same traffic is counted in both informative and acceleration counters.

.. [#accel] Traffic acceleration counter.

Error Counters
  An increment of these counters might indicate a problem. Each of these
  counters has an explanation and a corrective action.

Statistics can be fetched via the `ip link` or `ethtool` commands. `ethtool`
provides more detailed information::

    ip -s link show <if-name>
    ethtool -S <if-name>

Descriptions
============

XSK, PTP, and QoS counters that are similar to counters defined previously will
not be separately listed.
For example, `ptp_tx[i]_packets` will not be 119explicitly documented since `tx[i]_packets` describes the behavior of both 120counters, except `ptp_tx[i]_packets` is only counted when precision time 121protocol is used. 122 123Ring / Netdev Counter 124---------------------------- 125The following counters are available per ring or software port. 126 127These counters provide information on the amount of traffic that was accelerated 128by the NIC. The counters are counting the accelerated traffic in addition to the 129standard counters which counts it (i.e. accelerated traffic is counted twice). 130 131The counter names in the table below refers to both ring and port counters. The 132notation for ring counters includes the [i] index without the braces. The 133notation for port counters doesn't include the [i]. A counter name 134`rx[i]_packets` will be printed as `rx0_packets` for ring 0 and `rx_packets` for 135the software port. 136 137.. flat-table:: Ring / Software Port Counter Table 138 :widths: 2 3 1 139 140 * - Counter 141 - Description 142 - Type 143 144 * - `rx[i]_packets` 145 - The number of packets received on ring i. 146 - Informative 147 148 * - `rx[i]_bytes` 149 - The number of bytes received on ring i. 150 - Informative 151 152 * - `tx[i]_packets` 153 - The number of packets transmitted on ring i. 154 - Informative 155 156 * - `tx[i]_bytes` 157 - The number of bytes transmitted on ring i. 158 - Informative 159 160 * - `tx[i]_recover` 161 - The number of times the SQ was recovered. 162 - Error 163 164 * - `tx[i]_cqes` 165 - Number of CQEs events on SQ issued on ring i. 166 - Informative 167 168 * - `tx[i]_cqe_err` 169 - The number of error CQEs encountered on the SQ for ring i. 170 - Error 171 172 * - `tx[i]_tso_packets` 173 - The number of TSO packets transmitted on ring i [#accel]_. 174 - Acceleration 175 176 * - `tx[i]_tso_bytes` 177 - The number of TSO bytes transmitted on ring i [#accel]_. 
178 - Acceleration 179 180 * - `tx[i]_tso_inner_packets` 181 - The number of TSO packets which are indicated to be carry internal 182 encapsulation transmitted on ring i [#accel]_. 183 - Acceleration 184 185 * - `tx[i]_tso_inner_bytes` 186 - The number of TSO bytes which are indicated to be carry internal 187 encapsulation transmitted on ring i [#accel]_. 188 - Acceleration 189 190 * - `rx[i]_gro_packets` 191 - Number of received packets processed using hardware-accelerated GRO. The 192 number of hardware GRO offloaded packets received on ring i. 193 - Acceleration 194 195 * - `rx[i]_gro_bytes` 196 - Number of received bytes processed using hardware-accelerated GRO. The 197 number of hardware GRO offloaded bytes received on ring i. 198 - Acceleration 199 200 * - `rx[i]_gro_skbs` 201 - The number of receive SKBs constructed while performing 202 hardware-accelerated GRO. 203 - Informative 204 205 * - `rx[i]_gro_match_packets` 206 - Number of received packets processed using hardware-accelerated GRO that 207 met the flow table match criteria. 208 - Informative 209 210 * - `rx[i]_gro_large_hds` 211 - Number of receive packets using hardware-accelerated GRO that have large 212 headers that require additional memory to be allocated. 213 - Informative 214 215 * - `rx[i]_lro_packets` 216 - The number of LRO packets received on ring i [#accel]_. 217 - Acceleration 218 219 * - `rx[i]_lro_bytes` 220 - The number of LRO bytes received on ring i [#accel]_. 221 - Acceleration 222 223 * - `rx[i]_ecn_mark` 224 - The number of received packets where the ECN mark was turned on. 225 - Informative 226 227 * - `rx_oversize_pkts_buffer` 228 - The number of dropped received packets due to length which arrived to RQ 229 and exceed software buffer size allocated by the device for incoming 230 traffic. It might imply that the device MTU is larger than the software 231 buffers size. 
232 - Error 233 234 * - `rx_oversize_pkts_sw_drop` 235 - Number of received packets dropped in software because the CQE data is 236 larger than the MTU size. 237 - Error 238 239 * - `rx[i]_csum_unnecessary` 240 - Packets received with a `CHECKSUM_UNNECESSARY` on ring i [#accel]_. 241 - Acceleration 242 243 * - `rx[i]_csum_unnecessary_inner` 244 - Packets received with inner encapsulation with a `CHECKSUM_UNNECESSARY` 245 on ring i [#accel]_. 246 - Acceleration 247 248 * - `rx[i]_csum_none` 249 - Packets received with a `CHECKSUM_NONE` on ring i [#accel]_. 250 - Acceleration 251 252 * - `rx[i]_csum_complete` 253 - Packets received with a `CHECKSUM_COMPLETE` on ring i [#accel]_. 254 - Acceleration 255 256 * - `rx[i]_csum_complete_tail` 257 - Number of received packets that had checksum calculation computed, 258 potentially needed padding, and were able to do so with 259 `CHECKSUM_PARTIAL`. 260 - Informative 261 262 * - `rx[i]_csum_complete_tail_slow` 263 - Number of received packets that need padding larger than eight bytes for 264 the checksum. 265 - Informative 266 267 * - `tx[i]_csum_partial` 268 - Packets transmitted with a `CHECKSUM_PARTIAL` on ring i [#accel]_. 269 - Acceleration 270 271 * - `tx[i]_csum_partial_inner` 272 - Packets transmitted with inner encapsulation with a `CHECKSUM_PARTIAL` on 273 ring i [#accel]_. 274 - Acceleration 275 276 * - `tx[i]_csum_none` 277 - Packets transmitted with no hardware checksum acceleration on ring i. 278 - Informative 279 280 * - `tx[i]_stopped` / `tx_queue_stopped` [#ring_global]_ 281 - Events where SQ was full on ring i. If this counter is increased, check 282 the amount of buffers allocated for transmission. 283 - Informative 284 285 * - `tx[i]_wake` / `tx_queue_wake` [#ring_global]_ 286 - Events where SQ was full and has become not full on ring i. 287 - Informative 288 289 * - `tx[i]_dropped` / `tx_queue_dropped` [#ring_global]_ 290 - Packets transmitted that were dropped due to DMA mapping failure on 291 ring i. 
If this counter is increased, check the amount of buffers 292 allocated for transmission. 293 - Error 294 295 * - `tx[i]_nop` 296 - The number of nop WQEs (empty WQEs) inserted to the SQ (related to 297 ring i) due to the reach of the end of the cyclic buffer. When reaching 298 near to the end of cyclic buffer the driver may add those empty WQEs to 299 avoid handling a state the a WQE start in the end of the queue and ends 300 in the beginning of the queue. This is a normal condition. 301 - Informative 302 303 * - `tx[i]_added_vlan_packets` 304 - The number of packets sent where vlan tag insertion was offloaded to the 305 hardware. 306 - Acceleration 307 308 * - `rx[i]_removed_vlan_packets` 309 - The number of packets received where vlan tag stripping was offloaded to 310 the hardware. 311 - Acceleration 312 313 * - `rx[i]_wqe_err` 314 - The number of wrong opcodes received on ring i. 315 - Error 316 317 * - `rx[i]_mpwqe_frag` 318 - The number of WQEs that failed to allocate compound page and hence 319 fragmented MPWQE’s (Multi Packet WQEs) were used on ring i. If this 320 counter raise, it may suggest that there is no enough memory for large 321 pages, the driver allocated fragmented pages. This is not abnormal 322 condition. 323 - Informative 324 325 * - `rx[i]_mpwqe_filler_cqes` 326 - The number of filler CQEs events that were issued on ring i. 327 - Informative 328 329 * - `rx[i]_mpwqe_filler_strides` 330 - The number of strides consumed by filler CQEs on ring i. 331 - Informative 332 333 * - `tx[i]_mpwqe_blks` 334 - The number of send blocks processed from Multi-Packet WQEs (mpwqe). 335 - Informative 336 337 * - `tx[i]_mpwqe_pkts` 338 - The number of send packets processed from Multi-Packet WQEs (mpwqe). 339 - Informative 340 341 * - `rx[i]_cqe_compress_blks` 342 - The number of receive blocks with CQE compression on ring i [#accel]_. 
343 - Acceleration 344 345 * - `rx[i]_cqe_compress_pkts` 346 - The number of receive packets with CQE compression on ring i [#accel]_. 347 - Acceleration 348 349 * - `rx[i]_cache_reuse` 350 - The number of events of successful reuse of a page from a driver's 351 internal page cache. 352 - Acceleration 353 354 * - `rx[i]_cache_full` 355 - The number of events of full internal page cache where driver can't put a 356 page back to the cache for recycling (page will be freed). 357 - Acceleration 358 359 * - `rx[i]_cache_empty` 360 - The number of events where cache was empty - no page to give. Driver 361 shall allocate new page. 362 - Acceleration 363 364 * - `rx[i]_cache_busy` 365 - The number of events where cache head was busy and cannot be recycled. 366 Driver allocated new page. 367 - Acceleration 368 369 * - `rx[i]_cache_waive` 370 - The number of cache evacuation. This can occur due to page move to 371 another NUMA node or page was pfmemalloc-ed and should be freed as soon 372 as possible. 373 - Acceleration 374 375 * - `rx[i]_arfs_err` 376 - Number of flow rules that failed to be added to the flow table. 377 - Error 378 379 * - `rx[i]_recover` 380 - The number of times the RQ was recovered. 381 - Error 382 383 * - `tx[i]_xmit_more` 384 - The number of packets sent with `xmit_more` indication set on the skbuff 385 (no doorbell). 386 - Acceleration 387 388 * - `ch[i]_poll` 389 - The number of invocations of NAPI poll of channel i. 390 - Informative 391 392 * - `ch[i]_arm` 393 - The number of times the NAPI poll function completed and armed the 394 completion queues on channel i. 395 - Informative 396 397 * - `ch[i]_aff_change` 398 - The number of times the NAPI poll function explicitly stopped execution 399 on a CPU due to a change in affinity, on channel i. 400 - Informative 401 402 * - `ch[i]_events` 403 - The number of hard interrupt events on the completion queues of channel i. 
404 - Informative 405 406 * - `ch[i]_eq_rearm` 407 - The number of times the EQ was recovered. 408 - Error 409 410 * - `ch[i]_force_irq` 411 - Number of times NAPI is triggered by XSK wakeups by posting a NOP to 412 ICOSQ. 413 - Acceleration 414 415 * - `rx[i]_congst_umr` 416 - The number of times an outstanding UMR request is delayed due to 417 congestion, on ring i. 418 - Informative 419 420 * - `rx_pp_alloc_fast` 421 - Number of successful fast path allocations. 422 - Informative 423 424 * - `rx_pp_alloc_slow` 425 - Number of slow path order-0 allocations. 426 - Informative 427 428 * - `rx_pp_alloc_slow_high_order` 429 - Number of slow path high order allocations. 430 - Informative 431 432 * - `rx_pp_alloc_empty` 433 - Counter is incremented when ptr ring is empty, so a slow path allocation 434 was forced. 435 - Informative 436 437 * - `rx_pp_alloc_refill` 438 - Counter is incremented when an allocation which triggered a refill of the 439 cache. 440 - Informative 441 442 * - `rx_pp_alloc_waive` 443 - Counter is incremented when pages obtained from the ptr ring that cannot 444 be added to the cache due to a NUMA mismatch. 445 - Informative 446 447 * - `rx_pp_recycle_cached` 448 - Counter is incremented when recycling placed page in the page pool cache. 449 - Informative 450 451 * - `rx_pp_recycle_cache_full` 452 - Counter is incremented when page pool cache was full. 453 - Informative 454 455 * - `rx_pp_recycle_ring` 456 - Counter is incremented when page placed into the ptr ring. 457 - Informative 458 459 * - `rx_pp_recycle_ring_full` 460 - Counter is incremented when page released from page pool because the ptr 461 ring was full. 462 - Informative 463 464 * - `rx_pp_recycle_released_ref` 465 - Counter is incremented when page released (and not recycled) because 466 refcnt > 1. 467 - Informative 468 469 * - `rx[i]_xsk_buff_alloc_err` 470 - The number of times allocating an skb or XSK buffer failed in the XSK RQ 471 context. 
472 - Error 473 474 * - `rx[i]_xsk_arfs_err` 475 - aRFS (accelerated Receive Flow Steering) does not occur in the XSK RQ 476 context, so this counter should never increment. 477 - Error 478 479 * - `rx[i]_xdp_tx_xmit` 480 - The number of packets forwarded back to the port due to XDP program 481 `XDP_TX` action (bouncing). these packets are not counted by other 482 software counters. These packets are counted by physical port and vPort 483 counters. 484 - Informative 485 486 * - `rx[i]_xdp_tx_mpwqe` 487 - Number of multi-packet WQEs transmitted by the netdev and `XDP_TX`-ed by 488 the netdev during the RQ context. 489 - Acceleration 490 491 * - `rx[i]_xdp_tx_inlnw` 492 - Number of WQE data segments transmitted where the data could be inlined 493 in the WQE and then `XDP_TX`-ed during the RQ context. 494 - Acceleration 495 496 * - `rx[i]_xdp_tx_nops` 497 - Number of NOP WQEBBs (WQE building blocks) received posted to the XDP SQ. 498 - Acceleration 499 500 * - `rx[i]_xdp_tx_full` 501 - The number of packets that should have been forwarded back to the port 502 due to `XDP_TX` action but were dropped due to full tx queue. These packets 503 are not counted by other software counters. These packets are counted by 504 physical port and vPort counters. You may open more rx queues and spread 505 traffic rx over all queues and/or increase rx ring size. 506 - Error 507 508 * - `rx[i]_xdp_tx_err` 509 - The number of times an `XDP_TX` error such as frame too long and frame 510 too short occurred on `XDP_TX` ring of RX ring. 511 - Error 512 513 * - `rx[i]_xdp_tx_cqes` / `rx_xdp_tx_cqe` [#ring_global]_ 514 - The number of completions received on the CQ of the `XDP_TX` ring. 515 - Informative 516 517 * - `rx[i]_xdp_drop` 518 - The number of packets dropped due to XDP program `XDP_DROP` action. these 519 packets are not counted by other software counters. These packets are 520 counted by physical port and vPort counters. 
521 - Informative 522 523 * - `rx[i]_xdp_redirect` 524 - The number of times an XDP redirect action was triggered on ring i. 525 - Acceleration 526 527 * - `tx[i]_xdp_xmit` 528 - The number of packets redirected to the interface(due to XDP redirect). 529 These packets are not counted by other software counters. These packets 530 are counted by physical port and vPort counters. 531 - Informative 532 533 * - `tx[i]_xdp_full` 534 - The number of packets redirected to the interface(due to XDP redirect), 535 but were dropped due to full tx queue. these packets are not counted by 536 other software counters. you may enlarge tx queues. 537 - Informative 538 539 * - `tx[i]_xdp_mpwqe` 540 - Number of multi-packet WQEs offloaded onto the NIC that were 541 `XDP_REDIRECT`-ed from other netdevs. 542 - Acceleration 543 544 * - `tx[i]_xdp_inlnw` 545 - Number of WQE data segments where the data could be inlined in the WQE 546 where the data segments were `XDP_REDIRECT`-ed from other netdevs. 547 - Acceleration 548 549 * - `tx[i]_xdp_nops` 550 - Number of NOP WQEBBs (WQE building blocks) posted to the SQ that were 551 `XDP_REDIRECT`-ed from other netdevs. 552 - Acceleration 553 554 * - `tx[i]_xdp_err` 555 - The number of packets redirected to the interface(due to XDP redirect) 556 but were dropped due to error such as frame too long and frame too short. 557 - Error 558 559 * - `tx[i]_xdp_cqes` 560 - The number of completions received for packets redirected to the 561 interface(due to XDP redirect) on the CQ. 562 - Informative 563 564 * - `tx[i]_xsk_xmit` 565 - The number of packets transmitted using XSK zerocopy functionality. 566 - Acceleration 567 568 * - `tx[i]_xsk_mpwqe` 569 - Number of multi-packet WQEs offloaded onto the NIC that were 570 `XDP_REDIRECT`-ed from other netdevs. 571 - Acceleration 572 573 * - `tx[i]_xsk_inlnw` 574 - Number of WQE data segments where the data could be inlined in the WQE 575 that are transmitted using XSK zerocopy. 
576 - Acceleration 577 578 * - `tx[i]_xsk_full` 579 - Number of times doorbell is rung in XSK zerocopy mode when SQ is full. 580 - Error 581 582 * - `tx[i]_xsk_err` 583 - Number of errors that occurred in XSK zerocopy mode such as if the data 584 size is larger than the MTU size. 585 - Error 586 587 * - `tx[i]_xsk_cqes` 588 - Number of CQEs processed in XSK zerocopy mode. 589 - Acceleration 590 591 * - `tx_tls_ctx` 592 - Number of TLS TX HW offload contexts added to device for encryption. 593 - Acceleration 594 595 * - `tx_tls_del` 596 - Number of TLS TX HW offload contexts removed from device (connection 597 closed). 598 - Acceleration 599 600 * - `tx_tls_pool_alloc` 601 - Number of times a unit of work is successfully allocated in the TLS HW 602 offload pool. 603 - Acceleration 604 605 * - `tx_tls_pool_free` 606 - Number of times a unit of work is freed in the TLS HW offload pool. 607 - Acceleration 608 609 * - `rx_tls_ctx` 610 - Number of TLS RX HW offload contexts added to device for decryption. 611 - Acceleration 612 613 * - `rx_tls_del` 614 - Number of TLS RX HW offload contexts deleted from device (connection has 615 finished). 616 - Acceleration 617 618 * - `rx[i]_tls_decrypted_packets` 619 - Number of successfully decrypted RX packets which were part of a TLS 620 stream. 621 - Acceleration 622 623 * - `rx[i]_tls_decrypted_bytes` 624 - Number of TLS payload bytes in RX packets which were successfully 625 decrypted. 626 - Acceleration 627 628 * - `rx[i]_tls_resync_req_pkt` 629 - Number of received TLS packets with a resync request. 630 - Acceleration 631 632 * - `rx[i]_tls_resync_req_start` 633 - Number of times the TLS async resync request was started. 634 - Acceleration 635 636 * - `rx[i]_tls_resync_req_end` 637 - Number of times the TLS async resync request properly ended with 638 providing the HW tracked tcp-seq. 
639 - Acceleration 640 641 * - `rx[i]_tls_resync_req_skip` 642 - Number of times the TLS async resync request procedure was started but 643 not properly ended. 644 - Error 645 646 * - `rx[i]_tls_resync_res_ok` 647 - Number of times the TLS resync response call to the driver was 648 successfully handled. 649 - Acceleration 650 651 * - `rx[i]_tls_resync_res_retry` 652 - Number of times the TLS resync response call to the driver was 653 reattempted when ICOSQ is full. 654 - Error 655 656 * - `rx[i]_tls_resync_res_skip` 657 - Number of times the TLS resync response call to the driver was terminated 658 unsuccessfully. 659 - Error 660 661 * - `rx[i]_tls_err` 662 - Number of times when CQE TLS offload was problematic. 663 - Error 664 665 * - `tx[i]_tls_encrypted_packets` 666 - The number of send packets that are TLS encrypted by the kernel. 667 - Acceleration 668 669 * - `tx[i]_tls_encrypted_bytes` 670 - The number of send bytes that are TLS encrypted by the kernel. 671 - Acceleration 672 673 * - `tx[i]_tls_ooo` 674 - Number of times out of order TLS SQE fragments were handled on ring i. 675 - Acceleration 676 677 * - `tx[i]_tls_dump_packets` 678 - Number of TLS decrypted packets copied over from NIC over DMA. 679 - Acceleration 680 681 * - `tx[i]_tls_dump_bytes` 682 - Number of TLS decrypted bytes copied over from NIC over DMA. 683 - Acceleration 684 685 * - `tx[i]_tls_resync_bytes` 686 - Number of TLS bytes requested to be resynchronized in order to be 687 decrypted. 688 - Acceleration 689 690 * - `tx[i]_tls_skip_no_sync_data` 691 - Number of TLS send data that can safely be skipped / do not need to be 692 decrypted. 693 - Acceleration 694 695 * - `tx[i]_tls_drop_no_sync_data` 696 - Number of TLS send data that were dropped due to retransmission of TLS 697 data. 
698 - Acceleration 699 700 * - `ptp_cq[i]_abort` 701 - Number of times a CQE has to be skipped in precision time protocol due to 702 a skew between the port timestamp and CQE timestamp being greater than 703 128 seconds. 704 - Error 705 706 * - `ptp_cq[i]_abort_abs_diff_ns` 707 - Accumulation of time differences between the port timestamp and CQE 708 timestamp when the difference is greater than 128 seconds in precision 709 time protocol. 710 - Error 711 712.. [#ring_global] The corresponding ring and global counters do not share the 713 same name (i.e. do not follow the common naming scheme). 714 715vPort Counters 716-------------- 717Counters on the NIC port that is connected to a eSwitch. 718 719.. flat-table:: vPort Counter Table 720 :widths: 2 3 1 721 722 * - Counter 723 - Description 724 - Type 725 726 * - `rx_vport_unicast_packets` 727 - Unicast packets received, steered to a port including Raw Ethernet 728 QP/DPDK traffic, excluding RDMA traffic. 729 - Informative 730 731 * - `rx_vport_unicast_bytes` 732 - Unicast bytes received, steered to a port including Raw Ethernet QP/DPDK 733 traffic, excluding RDMA traffic. 734 - Informative 735 736 * - `tx_vport_unicast_packets` 737 - Unicast packets transmitted, steered from a port including Raw Ethernet 738 QP/DPDK traffic, excluding RDMA traffic. 739 - Informative 740 741 * - `tx_vport_unicast_bytes` 742 - Unicast bytes transmitted, steered from a port including Raw Ethernet 743 QP/DPDK traffic, excluding RDMA traffic. 744 - Informative 745 746 * - `rx_vport_multicast_packets` 747 - Multicast packets received, steered to a port including Raw Ethernet 748 QP/DPDK traffic, excluding RDMA traffic. 749 - Informative 750 751 * - `rx_vport_multicast_bytes` 752 - Multicast bytes received, steered to a port including Raw Ethernet 753 QP/DPDK traffic, excluding RDMA traffic. 
754 - Informative 755 756 * - `tx_vport_multicast_packets` 757 - Multicast packets transmitted, steered from a port including Raw Ethernet 758 QP/DPDK traffic, excluding RDMA traffic. 759 - Informative 760 761 * - `tx_vport_multicast_bytes` 762 - Multicast bytes transmitted, steered from a port including Raw Ethernet 763 QP/DPDK traffic, excluding RDMA traffic. 764 - Informative 765 766 * - `rx_vport_broadcast_packets` 767 - Broadcast packets received, steered to a port including Raw Ethernet 768 QP/DPDK traffic, excluding RDMA traffic. 769 - Informative 770 771 * - `rx_vport_broadcast_bytes` 772 - Broadcast bytes received, steered to a port including Raw Ethernet 773 QP/DPDK traffic, excluding RDMA traffic. 774 - Informative 775 776 * - `tx_vport_broadcast_packets` 777 - Broadcast packets transmitted, steered from a port including Raw Ethernet 778 QP/DPDK traffic, excluding RDMA traffic. 779 - Informative 780 781 * - `tx_vport_broadcast_bytes` 782 - Broadcast bytes transmitted, steered from a port including Raw Ethernet 783 QP/DPDK traffic, excluding RDMA traffic. 784 - Informative 785 786 * - `rx_vport_rdma_unicast_packets` 787 - RDMA unicast packets received, steered to a port (counters counts 788 RoCE/UD/RC traffic) [#accel]_. 789 - Acceleration 790 791 * - `rx_vport_rdma_unicast_bytes` 792 - RDMA unicast bytes received, steered to a port (counters counts 793 RoCE/UD/RC traffic) [#accel]_. 794 - Acceleration 795 796 * - `tx_vport_rdma_unicast_packets` 797 - RDMA unicast packets transmitted, steered from a port (counters counts 798 RoCE/UD/RC traffic) [#accel]_. 799 - Acceleration 800 801 * - `tx_vport_rdma_unicast_bytes` 802 - RDMA unicast bytes transmitted, steered from a port (counters counts 803 RoCE/UD/RC traffic) [#accel]_. 804 - Acceleration 805 806 * - `rx_vport_rdma_multicast_packets` 807 - RDMA multicast packets received, steered to a port (counters counts 808 RoCE/UD/RC traffic) [#accel]_. 
809 - Acceleration 810 811 * - `rx_vport_rdma_multicast_bytes` 812 - RDMA multicast bytes received, steered to a port (counters counts 813 RoCE/UD/RC traffic) [#accel]_. 814 - Acceleration 815 816 * - `tx_vport_rdma_multicast_packets` 817 - RDMA multicast packets transmitted, steered from a port (counters counts 818 RoCE/UD/RC traffic) [#accel]_. 819 - Acceleration 820 821 * - `tx_vport_rdma_multicast_bytes` 822 - RDMA multicast bytes transmitted, steered from a port (counters counts 823 RoCE/UD/RC traffic) [#accel]_. 824 - Acceleration 825 826 * - `rx_steer_missed_packets` 827 - Number of packets that was received by the NIC, however was discarded 828 because it did not match any flow in the NIC flow table. 829 - Error 830 831 * - `rx_packets` 832 - Representor only: packets received, that were handled by the hypervisor. 833 - Informative 834 835 * - `rx_bytes` 836 - Representor only: bytes received, that were handled by the hypervisor. 837 - Informative 838 839 * - `tx_packets` 840 - Representor only: packets transmitted, that were handled by the 841 hypervisor. 842 - Informative 843 844 * - `tx_bytes` 845 - Representor only: bytes transmitted, that were handled by the hypervisor. 846 - Informative 847 848 * - `dev_internal_queue_oob` 849 - The number of dropped packets due to lack of receive WQEs for an internal 850 device RQ. 851 - Error 852 853Physical Port Counters 854---------------------- 855The physical port counters are the counters on the external port connecting the 856adapter to the network. This measuring point holds information on standardized 857counters like IEEE 802.3, RFC2863, RFC 2819, RFC 3635 and additional counters 858like flow control, FEC and more. 859 860.. flat-table:: Physical Port Counter Table 861 :widths: 2 3 1 862 863 * - Counter 864 - Description 865 - Type 866 867 * - `rx_packets_phy` 868 - The number of packets received on the physical port. 
This counter doesn’t 869 include packets that were discarded due to FCS, frame size and similar 870 errors. 871 - Informative 872 873 * - `tx_packets_phy` 874 - The number of packets transmitted on the physical port. 875 - Informative 876 877 * - `rx_bytes_phy` 878 - The number of bytes received on the physical port, including Ethernet 879 header and FCS. 880 - Informative 881 882 * - `tx_bytes_phy` 883 - The number of bytes transmitted on the physical port. 884 - Informative 885 886 * - `rx_multicast_phy` 887 - The number of multicast packets received on the physical port. 888 - Informative 889 890 * - `tx_multicast_phy` 891 - The number of multicast packets transmitted on the physical port. 892 - Informative 893 894 * - `rx_broadcast_phy` 895 - The number of broadcast packets received on the physical port. 896 - Informative 897 898 * - `tx_broadcast_phy` 899 - The number of broadcast packets transmitted on the physical port. 900 - Informative 901 902 * - `rx_crc_errors_phy` 903 - The number of dropped received packets due to FCS (Frame Check Sequence) 904 error on the physical port. If this counter is increased in high rate, 905 check the link quality using `rx_symbol_error_phy` and 906 `rx_corrected_bits_phy` counters below. 907 - Error 908 909 * - `rx_in_range_len_errors_phy` 910 - The number of received packets dropped due to length/type errors on a 911 physical port. 912 - Error 913 914 * - `rx_out_of_range_len_phy` 915 - The number of received packets dropped due to length greater than allowed 916 on a physical port. If this counter is increasing, it implies that the 917 peer connected to the adapter has a larger MTU configured. Using same MTU 918 configuration shall resolve this issue. 919 - Error 920 921 * - `rx_oversize_pkts_phy` 922 - The number of dropped received packets due to length which exceed MTU 923 size on a physical port. If this counter is increasing, it implies that 924 the peer connected to the adapter has a larger MTU configured. 
Using same 925 MTU configuration shall resolve this issue. 926 - Error 927 928 * - `rx_symbol_err_phy` 929 - The number of received packets dropped due to physical coding errors 930 (symbol errors) on a physical port. 931 - Error 932 933 * - `rx_mac_control_phy` 934 - The number of MAC control packets received on the physical port. 935 - Informative 936 937 * - `tx_mac_control_phy` 938 - The number of MAC control packets transmitted on the physical port. 939 - Informative 940 941 * - `rx_pause_ctrl_phy` 942 - The number of link layer pause packets received on a physical port. If 943 this counter is increasing, it implies that the network is congested and 944 cannot absorb the traffic coming from to the adapter. 945 - Informative 946 947 * - `tx_pause_ctrl_phy` 948 - The number of link layer pause packets transmitted on a physical port. If 949 this counter is increasing, it implies that the NIC is congested and 950 cannot absorb the traffic coming from the network. 951 - Informative 952 953 * - `rx_unsupported_op_phy` 954 - The number of MAC control packets received with unsupported opcode on a 955 physical port. 956 - Error 957 958 * - `rx_discards_phy` 959 - The number of received packets dropped due to lack of buffers on a 960 physical port. If this counter is increasing, it implies that the adapter 961 is congested and cannot absorb the traffic coming from the network. 962 - Error 963 964 * - `tx_discards_phy` 965 - The number of packets which were discarded on transmission, even no 966 errors were detected. the drop might occur due to link in down state, 967 head of line drop, pause from the network, etc. 968 - Error 969 970 * - `tx_errors_phy` 971 - The number of transmitted packets dropped due to a length which exceed 972 MTU size on a physical port. 973 - Error 974 975 * - `rx_undersize_pkts_phy` 976 - The number of received packets dropped due to length which is shorter 977 than 64 bytes on a physical port. 
       If this counter is increasing, it implies that the peer connected to the
       adapter has a non-standard MTU configured or that a malformed packet
       has arrived.
     - Error

   * - `rx_fragments_phy`
     - The number of received packets dropped due to a length shorter than 64
       bytes combined with an FCS error on a physical port. If this counter is
       increasing, it implies that the peer connected to the adapter has a
       non-standard MTU configured.
     - Error

   * - `rx_jabbers_phy`
     - The number of received packets dropped due to a length longer than 64
       bytes combined with an FCS error on a physical port.
     - Error

   * - `rx_64_bytes_phy`
     - The number of packets received on the physical port with size of 64 bytes.
     - Informative

   * - `rx_65_to_127_bytes_phy`
     - The number of packets received on the physical port with size of 65 to
       127 bytes.
     - Informative

   * - `rx_128_to_255_bytes_phy`
     - The number of packets received on the physical port with size of 128 to
       255 bytes.
     - Informative

   * - `rx_256_to_511_bytes_phy`
     - The number of packets received on the physical port with size of 256 to
       511 bytes.
     - Informative

   * - `rx_512_to_1023_bytes_phy`
     - The number of packets received on the physical port with size of 512 to
       1023 bytes.
     - Informative

   * - `rx_1024_to_1518_bytes_phy`
     - The number of packets received on the physical port with size of 1024 to
       1518 bytes.
     - Informative

   * - `rx_1519_to_2047_bytes_phy`
     - The number of packets received on the physical port with size of 1519 to
       2047 bytes.
     - Informative

   * - `rx_2048_to_4095_bytes_phy`
     - The number of packets received on the physical port with size of 2048 to
       4095 bytes.
     - Informative

   * - `rx_4096_to_8191_bytes_phy`
     - The number of packets received on the physical port with size of 4096 to
       8191 bytes.
     - Informative

   * - `rx_8192_to_10239_bytes_phy`
     - The number of packets received on the physical port with size of 8192 to
       10239 bytes.
     - Informative

   * - `link_down_events_phy`
     - The number of times the link operational state changed to down. If this
       counter is increasing, it may indicate port flapping. You may need to
       replace the cable/transceiver.
     - Error

   * - `rx_out_of_buffer`
     - The number of times the receive queue had no software buffers allocated
       for the adapter's incoming traffic.
     - Error

   * - `module_bus_stuck`
     - The number of times that a short-wire on the module's I\ :sup:`2`\C bus
       (data or clock) was detected. You may need to replace the
       cable/transceiver.
     - Error

   * - `module_high_temp`
     - The number of times that the module temperature was too high. If this
       issue persists, you may need to check the ambient temperature or replace
       the cable/transceiver module.
     - Error

   * - `module_bad_shorted`
     - The number of times that the module cables were shorted. You may need to
       replace the cable/transceiver module.
     - Error

   * - `module_unplug`
     - The number of times that the module was ejected.
     - Informative

   * - `rx_buffer_passed_thres_phy`
     - The number of events where the port receive buffer was over 85% full.
     - Informative

   * - `tx_pause_storm_warning_events`
     - The number of times the device was sending pauses for a long period of
       time.
     - Informative

   * - `tx_pause_storm_error_events`
     - The number of times the device was sending pauses for a long period of
       time, reaching the timeout and disabling transmission of pause frames.
       During the period in which pause frames were disabled, packets may have
       been dropped.
     - Error

   * - `rx[i]_buff_alloc_err`
     - The number of failures to allocate a buffer (or SKB) for a received
       packet on ring i.
     - Error

   * - `rx_bits_phy`
     - This counter provides information on the total amount of traffic that
       could have been received and can be used as a guideline to measure the
       ratio of errored traffic in `rx_pcs_symbol_err_phy` and
       `rx_corrected_bits_phy`.
     - Informative

   * - `rx_pcs_symbol_err_phy`
     - The number of symbol errors that were not corrected by the FEC
       algorithm, or that occurred while FEC was not active on this interface.
       If this counter is increasing, it implies that the link between the NIC
       and the network is suffering from high BER, and that traffic is lost.
       You may need to replace the cable/transceiver. The error rate is the
       number of `rx_pcs_symbol_err_phy` divided by the number of `rx_bits_phy`
       over a given time frame.
     - Error

   * - `rx_corrected_bits_phy`
     - The number of corrected bits on this port according to the active FEC
       (RS/FC). If this counter is increasing, it implies that the link between
       the NIC and the network is suffering from high BER. The corrected bit
       rate is the number of `rx_corrected_bits_phy` divided by the number of
       `rx_bits_phy` over a given time frame.
     - Error

   * - `rx_err_lane_[l]_phy`
     - The number of physical raw errors on lane l. The counter counts errors
       before FEC corrections. If this counter is increasing, it implies that
       the link between the NIC and the network is suffering from high BER, and
       that traffic might be lost. You may need to replace the
       cable/transceiver. Check this counter together with
       `rx_corrected_bits_phy`.
     - Error

   * - `rx_global_pause`
     - The number of pause packets received on the physical port. If this
       counter is increasing, it implies that the network is congested and
       cannot absorb the traffic coming from the adapter.
       Note: This counter is only enabled when global pause mode is enabled.
     - Informative

   * - `rx_global_pause_duration`
     - The duration of pause received (in microseconds) on the physical port.
       The counter represents the time the port did not send any traffic. If
       this counter is increasing, it implies that the network is congested and
       cannot absorb the traffic coming from the adapter. Note: This counter is
       only enabled when global pause mode is enabled.
     - Informative

   * - `tx_global_pause`
     - The number of pause packets transmitted on a physical port. If this
       counter is increasing, it implies that the adapter is congested and
       cannot absorb the traffic coming from the network. Note: This counter is
       only enabled when global pause mode is enabled.
     - Informative

   * - `tx_global_pause_duration`
     - The duration of pause transmitted (in microseconds) on the physical
       port. Note: This counter is only enabled when global pause mode is
       enabled.
     - Informative

   * - `rx_global_pause_transition`
     - The number of times a transition from Xoff to Xon on the physical port
       has occurred. Note: This counter is only enabled when global pause mode
       is enabled.
     - Informative

   * - `rx_if_down_packets`
     - The number of received packets that were dropped due to interface down.
     - Informative

Priority Port Counters
----------------------
The following counters are physical port counters that are counted per L2
priority (0-7).

**Note:** `p` in the counter name represents the priority.

.. flat-table:: Priority Port Counter Table
   :widths: 2 3 1

   * - Counter
     - Description
     - Type

   * - `rx_prio[p]_bytes`
     - The number of bytes received with priority p on the physical port.
     - Informative

   * - `rx_prio[p]_packets`
     - The number of packets received with priority p on the physical port.
     - Informative

   * - `tx_prio[p]_bytes`
     - The number of bytes transmitted on priority p on the physical port.
     - Informative

   * - `tx_prio[p]_packets`
     - The number of packets transmitted on priority p on the physical port.
     - Informative

   * - `rx_prio[p]_pause`
     - The number of pause packets received with priority p on a physical port.
       If this counter is increasing, it implies that the network is congested
       and cannot absorb the traffic coming from the adapter. Note: This counter
       is available only if PFC was enabled on priority p.
     - Informative

   * - `rx_prio[p]_pause_duration`
     - The duration of pause received (in microseconds) on priority p on the
       physical port. The counter represents the time the port did not send any
       traffic on this priority. If this counter is increasing, it implies that
       the network is congested and cannot absorb the traffic coming from the
       adapter. Note: This counter is available only if PFC was enabled on
       priority p.
     - Informative

   * - `rx_prio[p]_pause_transition`
     - The number of times a transition from Xoff to Xon on priority p on the
       physical port has occurred. Note: This counter is available only if PFC
       was enabled on priority p.
     - Informative

   * - `tx_prio[p]_pause`
     - The number of pause packets transmitted on priority p on a physical port.
       If this counter is increasing, it implies that the adapter is congested
       and cannot absorb the traffic coming from the network. Note: This counter
       is available only if PFC was enabled on priority p.
     - Informative

   * - `tx_prio[p]_pause_duration`
     - The duration of pause transmitted (in microseconds) on priority p on the
       physical port.
       Note: This counter is available only if PFC was enabled on priority p.
     - Informative

   * - `rx_prio[p]_buf_discard`
     - The number of packets discarded by the device due to lack of per-host
       receive buffers.
     - Informative

   * - `rx_prio[p]_cong_discard`
     - The number of packets discarded by the device due to per-host
       congestion.
     - Informative

   * - `rx_prio[p]_marked`
     - The number of packets ECN-marked by the device due to per-host
       congestion.
     - Informative

   * - `rx_prio[p]_discards`
     - The number of packets discarded by the device due to lack of receive
       buffers.
     - Informative

Device Counters
---------------
.. flat-table:: Device Counter Table
   :widths: 2 3 1

   * - Counter
     - Description
     - Type

   * - `rx_pci_signal_integrity`
     - Counts physical layer PCIe signal integrity errors: the number of
       transitions to recovery due to framing errors and CRC (dlp and tlp). If
       this counter is increasing, try moving the adapter card to a different
       slot to rule out a bad PCI slot. Validate that you are running the
       latest available firmware and the latest server BIOS version.
     - Error

   * - `tx_pci_signal_integrity`
     - Counts physical layer PCIe signal integrity errors: the number of
       transitions to recovery initiated by the other side (moving to recovery
       due to getting TS/EIEOS). If this counter is increasing, try moving the
       adapter card to a different slot to rule out a bad PCI slot. Validate
       that you are running the latest available firmware and the latest server
       BIOS version.
     - Error

   * - `outbound_pci_buffer_overflow`
     - The number of packets dropped due to PCI buffer overflow. If this
       counter is increasing at a high rate, it might indicate that the receive
       traffic rate for a host is higher than the PCIe bus can sustain, causing
       congestion.
     - Informative

   * - `outbound_pci_stalled_rd`
     - The percentage (in the range 0...100) of time within the last second that
       the NIC had outbound non-posted read requests but could not perform the
       operation due to insufficient posted credits.
     - Informative

   * - `outbound_pci_stalled_wr`
     - The percentage (in the range 0...100) of time within the last second that
       the NIC had outbound posted write requests but could not perform the
       operation due to insufficient posted credits.
     - Informative

   * - `outbound_pci_stalled_rd_events`
     - The number of seconds where `outbound_pci_stalled_rd` was above 30%.
     - Informative

   * - `outbound_pci_stalled_wr_events`
     - The number of seconds where `outbound_pci_stalled_wr` was above 30%.
     - Informative

   * - `dev_out_of_buffer`
     - The number of times the device-owned queue did not have enough buffers
       allocated.
     - Error
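
The error rate and corrected bit rate described for `rx_pcs_symbol_err_phy`
and `rx_corrected_bits_phy` in the physical port counter table are ratios of
counter deltas over a time frame. The following Python sketch illustrates one
way to compute them. It is not part of the driver: the helper names
(``parse_ethtool_stats``, ``read_stats``, ``interval_ratio``) are hypothetical,
and it assumes the usual ``name: value`` layout printed by ``ethtool -S``.

.. code-block:: python

   import subprocess


   def parse_ethtool_stats(text):
       """Parse `ethtool -S` style output into a {counter_name: int} dict."""
       stats = {}
       for line in text.splitlines():
           name, sep, value = line.rpartition(":")
           if not sep:
               continue
           try:
               stats[name.strip()] = int(value.strip())
           except ValueError:
               # Skips the "NIC statistics:" header and non-numeric lines.
               continue
       return stats


   def read_stats(ifname):
       """Run `ethtool -S <ifname>` and return the parsed counters."""
       out = subprocess.run(
           ["ethtool", "-S", ifname],
           check=True, capture_output=True, text=True,
       ).stdout
       return parse_ethtool_stats(out)


   def interval_ratio(before, after, numerator, denominator):
       """Ratio of two counter deltas between two samples.

       Returns None when the denominator counter did not advance.
       """
       den = after[denominator] - before[denominator]
       if den <= 0:
           return None
       return (after[numerator] - before[numerator]) / den

For example, taking two ``read_stats("eth0")`` samples a few seconds apart
(``eth0`` is a placeholder interface name) and calling
``interval_ratio(before, after, "rx_corrected_bits_phy", "rx_bits_phy")``
yields the corrected bit rate over that interval. Passing
``"rx_pcs_symbol_err_phy"`` as the numerator yields the error rate instead.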