.. SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
.. include:: <isonum.txt>

================
Ethtool counters
================

:Copyright: |copy| 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Contents
========

- `Overview`_
- `Groups`_
- `Types`_
- `Descriptions`_

Overview
========

There are several counter groups based on where the counter is being counted. In
addition, each group of counters may have different counter types.

The counter groups correspond to the components of the networking setup
illustrated below::

                                                  ----------------------------------------
                                                  |                                      |
    ----------------------------------------    ---------------------------------------- |
    |              Hypervisor              |    |                  VM                  | |
    |                                      |    |                                      | |
    | -------------------  --------------- |    | -------------------  --------------- | |
    | | Ethernet driver |  | RDMA driver | |    | | Ethernet driver |  | RDMA driver | | |
    | -------------------  --------------- |    | -------------------  --------------- | |
    |           |                 |        |    |           |                 |        | |
    |           -------------------        |    |           -------------------        | |
    |                   |                  |    |                   |                  |--
    ----------------------------------------    ----------------------------------------
                        |                                           |
            -------------               -----------------------------
            |                           |
         ------                      ------ ------ ------         ------      ------      ------
    -----| PF |----------------------| VF |-| VF |-| VF |-----  --| PF |--- --| PF |--- --| PF |---
    |    ------                      ------ ------ ------    |  | ------  | | ------  | | ------  |
    |                                                        |  |         | |         | |         |
    |                                                        |  |         | |         | |         |
    |                                                        |  |         | |         | |         |
    | eSwitch                                                |  | eSwitch | | eSwitch | | eSwitch |
    ----------------------------------------------------------  ----------- ----------- -----------
               -------------------------------------------------------------------------------
               |                                                                             |
               |                                                                             |
               | Uplink (no counters)                                                        |
               -------------------------------------------------------------------------------
                       ---------------------------------------------------------------
                       |                                                             |
                       |                                                             |
                       | MPFS (no counters)                                          |
                       ---------------------------------------------------------------
                                                     |
                                                     |
                                                     | Port

Groups
======

Ring
  Software counters populated by the driver stack.

Netdev
  An aggregation of software ring counters.

vPort counters
  Traffic counters and drops due to steering or no buffers. May indicate issues
  with the NIC. These counters include Ethernet traffic counters (including Raw
  Ethernet) and RDMA/RoCE traffic counters.

Physical port counters
  Counters that collect statistics about the PFs and VFs. May indicate issues
  with the NIC, link, or network. This measuring point holds information on
  standardized counters like IEEE 802.3, RFC 2863, RFC 2819, RFC 3635 and
  additional counters like flow control, FEC and more. Physical port counters
  are not exposed to virtual machines.

Priority port counters
  A set of the physical port counters, per priority per port.

Types
=====

Counters are divided into three types.

Traffic Informative Counters
  Counters which count traffic. These counters can be used for load estimation
  or for general debug.

Traffic Acceleration Counters
  Counters which count traffic that was accelerated by the Mellanox driver or by
  hardware. The counters are an additional layer to the informative counter set,
  and the same traffic is counted in both informative and acceleration counters.

.. [#accel] Traffic acceleration counter.

Error Counters
  An increment of these counters might indicate a problem. Each of these
  counters has an explanation and a corrective action.

Statistics can be fetched via the `ip link` or `ethtool` commands. `ethtool`
provides more detailed information::

    ip -s link show <if-name>
    ethtool -S <if-name>
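
For scripted use, the `ethtool -S` output can be turned into a dictionary. The
following is a minimal Python sketch, not part of the driver or of ethtool. It
assumes the output consists of `name: value` lines under a header line, and the
helper names `parse_ethtool_stats` and `read_ethtool_stats` are hypothetical.

```python
import subprocess


def parse_ethtool_stats(text):
    """Parse 'name: value' lines (assumed `ethtool -S` layout) into a dict."""
    stats = {}
    for line in text.splitlines():
        # Split on the last colon so the header line yields an empty value.
        name, sep, value = line.rpartition(":")
        value = value.strip()
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def read_ethtool_stats(ifname):
    """Run `ethtool -S <if-name>` and return the parsed counters."""
    result = subprocess.run(
        ["ethtool", "-S", ifname],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_ethtool_stats(result.stdout)


if __name__ == "__main__":
    # Illustrative text only; real output contains many more counters.
    sample = "NIC statistics:\n     rx_packets: 10\n     tx0_packets: 4\n"
    print(parse_ethtool_stats(sample))
```

Lines whose value is not a plain non-negative integer, such as the header
line, are skipped rather than reported as errors.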

Descriptions
============

XSK, PTP, and QoS counters that are similar to counters defined previously will
not be separately listed. For example, `ptp_tx[i]_packets` will not be
explicitly documented since `tx[i]_packets` describes the behavior of both
counters, except `ptp_tx[i]_packets` is only counted when precision time
protocol is used.

Ring / Netdev Counter
----------------------------
The following counters are available per ring or software port.

These counters provide information on the amount of traffic that was accelerated
by the NIC. The counters count the accelerated traffic in addition to the
standard counters which count it (i.e. accelerated traffic is counted twice).

The counter names in the table below refer to both ring and port counters. The
notation for ring counters includes the [i] index without the brackets. The
notation for port counters doesn't include the [i]. A counter name
`rx[i]_packets` will be printed as `rx0_packets` for ring 0 and `rx_packets` for
the software port.
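
The naming rule above can be sketched in Python as follows. This is an
illustration only: the helper names `ring_name`, `port_name` and `ring_values`
are hypothetical, and `stats` is assumed to be a dictionary mapping printed
counter names to integer values.

```python
import re


def ring_name(template, ring):
    """Printed name of a ring counter: `rx[i]_packets` -> `rx0_packets`."""
    return template.replace("[i]", str(ring))


def port_name(template):
    """Printed name of the software port counter: drop the `[i]` index."""
    return template.replace("[i]", "")


def ring_values(stats, template):
    """Return {ring index: value} for all ring counters matching a template."""
    prefix, _, suffix = template.partition("[i]")
    pattern = re.compile(re.escape(prefix) + r"(\d+)" + re.escape(suffix))
    values = {}
    for name, value in stats.items():
        match = pattern.fullmatch(name)
        if match:
            values[int(match.group(1))] = value
    return values


if __name__ == "__main__":
    # Made-up values for illustration.
    stats = {"rx0_packets": 3, "rx1_packets": 4, "rx_packets": 7}
    per_ring = ring_values(stats, "rx[i]_packets")
    print(per_ring, sum(per_ring.values()), stats[port_name("rx[i]_packets")])
```

Summing the per-ring values gives a figure to compare against the software
port counter. Whether the two agree exactly on a given system is something to
verify, not a guarantee made by this sketch.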

.. flat-table:: Ring / Software Port Counter Table
   :widths: 2 3 1

   * - Counter
     - Description
     - Type

   * - `rx[i]_packets`
     - The number of packets received on ring i.
     - Informative

   * - `rx[i]_bytes`
     - The number of bytes received on ring i.
     - Informative

   * - `tx[i]_packets`
     - The number of packets transmitted on ring i.
     - Informative

   * - `tx[i]_bytes`
     - The number of bytes transmitted on ring i.
     - Informative

   * - `tx[i]_recover`
     - The number of times the SQ was recovered.
     - Error

   * - `tx[i]_cqes`
     - The number of CQE events issued on the SQ of ring i.
     - Informative

   * - `tx[i]_cqe_err`
     - The number of error CQEs encountered on the SQ for ring i.
     - Error

   * - `tx[i]_tso_packets`
     - The number of TSO packets transmitted on ring i [#accel]_.
     - Acceleration

   * - `tx[i]_tso_bytes`
     - The number of TSO bytes transmitted on ring i [#accel]_.
     - Acceleration

   * - `tx[i]_tso_inner_packets`
     - The number of TSO packets which are indicated to carry internal
       encapsulation transmitted on ring i [#accel]_.
     - Acceleration

   * - `tx[i]_tso_inner_bytes`
     - The number of TSO bytes which are indicated to carry internal
       encapsulation transmitted on ring i [#accel]_.
     - Acceleration

   * - `rx[i]_gro_packets`
     - Number of received packets processed using hardware-accelerated GRO. The
       number of hardware GRO offloaded packets received on ring i.
     - Acceleration

   * - `rx[i]_gro_bytes`
     - Number of received bytes processed using hardware-accelerated GRO. The
       number of hardware GRO offloaded bytes received on ring i.
     - Acceleration

   * - `rx[i]_gro_skbs`
     - The number of receive SKBs constructed while performing
       hardware-accelerated GRO.
     - Informative

   * - `rx[i]_gro_match_packets`
     - Number of received packets processed using hardware-accelerated GRO that
       met the flow table match criteria.
     - Informative

   * - `rx[i]_gro_large_hds`
     - Number of received packets using hardware-accelerated GRO that have
       large headers that require additional memory to be allocated.
     - Informative

   * - `rx[i]_lro_packets`
     - The number of LRO packets received on ring i [#accel]_.
     - Acceleration

   * - `rx[i]_lro_bytes`
     - The number of LRO bytes received on ring i [#accel]_.
     - Acceleration

   * - `rx[i]_ecn_mark`
     - The number of received packets where the ECN mark was turned on.
     - Informative

   * - `rx_oversize_pkts_buffer`
     - The number of received packets dropped because their length exceeded
       the software buffer size allocated by the device for incoming traffic
       when they arrived at the RQ. It might imply that the device MTU is
       larger than the software buffer size.
     - Error

   * - `rx_oversize_pkts_sw_drop`
     - Number of received packets dropped in software because the CQE data is
       larger than the MTU size.
     - Error

   * - `rx[i]_csum_unnecessary`
     - Packets received with a `CHECKSUM_UNNECESSARY` on ring i [#accel]_.
     - Acceleration

   * - `rx[i]_csum_unnecessary_inner`
     - Packets received with inner encapsulation with a `CHECKSUM_UNNECESSARY`
       on ring i [#accel]_.
     - Acceleration

   * - `rx[i]_csum_none`
     - Packets received with a `CHECKSUM_NONE` on ring i [#accel]_.
     - Acceleration

   * - `rx[i]_csum_complete`
     - Packets received with a `CHECKSUM_COMPLETE` on ring i [#accel]_.
     - Acceleration

   * - `rx[i]_csum_complete_tail`
     - Number of received packets that had checksum calculation computed,
       potentially needed padding, and were able to do so with
       `CHECKSUM_PARTIAL`.
     - Informative

   * - `rx[i]_csum_complete_tail_slow`
     - Number of received packets that need padding larger than eight bytes for
       the checksum.
     - Informative

   * - `tx[i]_csum_partial`
     - Packets transmitted with a `CHECKSUM_PARTIAL` on ring i [#accel]_.
     - Acceleration

   * - `tx[i]_csum_partial_inner`
     - Packets transmitted with inner encapsulation with a `CHECKSUM_PARTIAL` on
       ring i [#accel]_.
     - Acceleration

   * - `tx[i]_csum_none`
     - Packets transmitted with no hardware checksum acceleration on ring i.
     - Informative

   * - `tx[i]_stopped` / `tx_queue_stopped` [#ring_global]_
     - Events where SQ was full on ring i. If this counter is increased, check
       the amount of buffers allocated for transmission.
     - Informative

   * - `tx[i]_wake` / `tx_queue_wake` [#ring_global]_
     - Events where SQ was full and has become not full on ring i.
     - Informative

   * - `tx[i]_dropped` / `tx_queue_dropped` [#ring_global]_
     - Packets transmitted that were dropped due to DMA mapping failure on
       ring i. If this counter is increased, check the amount of buffers
       allocated for transmission.
     - Error

   * - `tx[i]_nop`
     - The number of nop WQEs (empty WQEs) inserted into the SQ (related to
       ring i) because the end of the cyclic buffer was reached. When nearing
       the end of the cyclic buffer, the driver may add those empty WQEs to
       avoid handling a state where a WQE starts at the end of the queue and
       ends at the beginning of the queue. This is a normal condition.
     - Informative

   * - `tx[i]_added_vlan_packets`
     - The number of packets sent where VLAN tag insertion was offloaded to the
       hardware.
     - Acceleration

   * - `rx[i]_removed_vlan_packets`
     - The number of packets received where VLAN tag stripping was offloaded to
       the hardware.
     - Acceleration

   * - `rx[i]_wqe_err`
     - The number of wrong opcodes received on ring i.
     - Error

   * - `rx[i]_mpwqe_frag`
     - The number of WQEs that failed to allocate a compound page and hence
       fragmented MPWQEs (Multi Packet WQEs) were used on ring i. If this
       counter rises, it may suggest that there is not enough memory for large
       pages, so the driver allocated fragmented pages. This is not an abnormal
       condition.
     - Informative

   * - `rx[i]_mpwqe_filler_cqes`
     - The number of filler CQE events that were issued on ring i.
     - Informative

   * - `rx[i]_mpwqe_filler_strides`
     - The number of strides consumed by filler CQEs on ring i.
     - Informative

   * - `tx[i]_mpwqe_blks`
     - The number of send blocks processed from Multi-Packet WQEs (mpwqe).
     - Informative

   * - `tx[i]_mpwqe_pkts`
     - The number of send packets processed from Multi-Packet WQEs (mpwqe).
     - Informative

   * - `rx[i]_cqe_compress_blks`
     - The number of receive blocks with CQE compression on ring i [#accel]_.
     - Acceleration

   * - `rx[i]_cqe_compress_pkts`
     - The number of receive packets with CQE compression on ring i [#accel]_.
     - Acceleration

   * - `rx[i]_cache_reuse`
     - The number of events of successful reuse of a page from a driver's
       internal page cache.
     - Acceleration

   * - `rx[i]_cache_full`
     - The number of events of full internal page cache where the driver can't
       put a page back to the cache for recycling (page will be freed).
     - Acceleration

   * - `rx[i]_cache_empty`
     - The number of events where the cache was empty, with no page to give.
       The driver allocates a new page.
     - Acceleration

   * - `rx[i]_cache_busy`
     - The number of events where the cache head was busy and could not be
       recycled. The driver allocated a new page.
     - Acceleration

   * - `rx[i]_cache_waive`
     - The number of cache evacuations. This can occur when a page moves to
       another NUMA node or when a page was pfmemalloc-ed and should be freed
       as soon as possible.
     - Acceleration

   * - `rx[i]_arfs_err`
     - Number of flow rules that failed to be added to the flow table.
     - Error

   * - `rx[i]_recover`
     - The number of times the RQ was recovered.
     - Error

   * - `tx[i]_xmit_more`
     - The number of packets sent with `xmit_more` indication set on the skbuff
       (no doorbell).
     - Acceleration

   * - `ch[i]_poll`
     - The number of invocations of NAPI poll of channel i.
     - Informative

   * - `ch[i]_arm`
     - The number of times the NAPI poll function completed and armed the
       completion queues on channel i.
     - Informative

   * - `ch[i]_aff_change`
     - The number of times the NAPI poll function explicitly stopped execution
       on a CPU due to a change in affinity, on channel i.
     - Informative

   * - `ch[i]_events`
     - The number of hard interrupt events on the completion queues of channel i.
     - Informative

   * - `ch[i]_eq_rearm`
     - The number of times the EQ was recovered.
     - Error

   * - `ch[i]_force_irq`
     - Number of times NAPI is triggered by XSK wakeups by posting a NOP to
       ICOSQ.
     - Acceleration

   * - `rx[i]_congst_umr`
     - The number of times an outstanding UMR request is delayed due to
       congestion, on ring i.
     - Informative

   * - `rx_pp_alloc_fast`
     - Number of successful fast path allocations.
     - Informative

   * - `rx_pp_alloc_slow`
     - Number of slow path order-0 allocations.
     - Informative

   * - `rx_pp_alloc_slow_high_order`
     - Number of slow path high order allocations.
     - Informative

   * - `rx_pp_alloc_empty`
     - Counter is incremented when ptr ring is empty, so a slow path allocation
       was forced.
     - Informative

   * - `rx_pp_alloc_refill`
     - Counter is incremented when an allocation triggered a refill of the
       cache.
     - Informative

   * - `rx_pp_alloc_waive`
     - Counter is incremented when pages obtained from the ptr ring cannot be
       added to the cache due to a NUMA mismatch.
     - Informative

   * - `rx_pp_recycle_cached`
     - Counter is incremented when recycling placed a page in the page pool
       cache.
     - Informative

   * - `rx_pp_recycle_cache_full`
     - Counter is incremented when page pool cache was full.
     - Informative

   * - `rx_pp_recycle_ring`
     - Counter is incremented when a page is placed into the ptr ring.
     - Informative

   * - `rx_pp_recycle_ring_full`
     - Counter is incremented when a page is released from the page pool
       because the ptr ring was full.
     - Informative

   * - `rx_pp_recycle_released_ref`
     - Counter is incremented when a page is released (and not recycled)
       because refcnt > 1.
     - Informative

   * - `rx[i]_xsk_buff_alloc_err`
     - The number of times allocating an skb or XSK buffer failed in the XSK RQ
       context.
     - Error

   * - `rx[i]_xsk_arfs_err`
     - aRFS (accelerated Receive Flow Steering) does not occur in the XSK RQ
       context, so this counter should never increment.
     - Error

   * - `rx[i]_xdp_tx_xmit`
     - The number of packets forwarded back to the port due to XDP program
       `XDP_TX` action (bouncing). These packets are not counted by other
       software counters. These packets are counted by physical port and vPort
       counters.
     - Informative

   * - `rx[i]_xdp_tx_mpwqe`
     - Number of multi-packet WQEs transmitted by the netdev and `XDP_TX`-ed by
       the netdev during the RQ context.
     - Acceleration

   * - `rx[i]_xdp_tx_inlnw`
     - Number of WQE data segments transmitted where the data could be inlined
       in the WQE and then `XDP_TX`-ed during the RQ context.
     - Acceleration

   * - `rx[i]_xdp_tx_nops`
     - Number of NOP WQEBBs (WQE building blocks) posted to the XDP SQ.
     - Acceleration

   * - `rx[i]_xdp_tx_full`
     - The number of packets that should have been forwarded back to the port
       due to `XDP_TX` action but were dropped due to full tx queue. These packets
       are not counted by other software counters. These packets are counted by
       physical port and vPort counters. You may open more rx queues and spread
       rx traffic over all queues and/or increase rx ring size.
     - Error

   * - `rx[i]_xdp_tx_err`
     - The number of times an `XDP_TX` error such as frame too long or frame
       too short occurred on the `XDP_TX` ring of the RX ring.
     - Error

   * - `rx[i]_xdp_tx_cqes` / `rx_xdp_tx_cqe` [#ring_global]_
     - The number of completions received on the CQ of the `XDP_TX` ring.
     - Informative

   * - `rx[i]_xdp_drop`
     - The number of packets dropped due to XDP program `XDP_DROP` action. These
       packets are not counted by other software counters. These packets are
       counted by physical port and vPort counters.
     - Informative

   * - `rx[i]_xdp_redirect`
     - The number of times an XDP redirect action was triggered on ring i.
     - Acceleration

   * - `tx[i]_xdp_xmit`
     - The number of packets redirected to the interface (due to XDP redirect).
       These packets are not counted by other software counters. These packets
       are counted by physical port and vPort counters.
     - Informative

   * - `tx[i]_xdp_full`
     - The number of packets redirected to the interface (due to XDP redirect)
       that were dropped due to full tx queue. These packets are not counted by
       other software counters. You may enlarge tx queues.
     - Informative

   * - `tx[i]_xdp_mpwqe`
     - Number of multi-packet WQEs offloaded onto the NIC that were
       `XDP_REDIRECT`-ed from other netdevs.
     - Acceleration

   * - `tx[i]_xdp_inlnw`
     - Number of WQE data segments where the data could be inlined in the WQE
       where the data segments were `XDP_REDIRECT`-ed from other netdevs.
     - Acceleration

   * - `tx[i]_xdp_nops`
     - Number of NOP WQEBBs (WQE building blocks) posted to the SQ that were
       `XDP_REDIRECT`-ed from other netdevs.
     - Acceleration

   * - `tx[i]_xdp_err`
     - The number of packets redirected to the interface (due to XDP redirect)
       but were dropped due to error such as frame too long and frame too short.
     - Error

   * - `tx[i]_xdp_cqes`
     - The number of completions received for packets redirected to the
       interface (due to XDP redirect) on the CQ.
     - Informative

   * - `tx[i]_xsk_xmit`
     - The number of packets transmitted using XSK zerocopy functionality.
     - Acceleration

   * - `tx[i]_xsk_mpwqe`
     - Number of multi-packet WQEs offloaded onto the NIC that were
       `XDP_REDIRECT`-ed from other netdevs.
     - Acceleration

   * - `tx[i]_xsk_inlnw`
     - Number of WQE data segments where the data could be inlined in the WQE
       that are transmitted using XSK zerocopy.
     - Acceleration

   * - `tx[i]_xsk_full`
     - Number of times doorbell is rung in XSK zerocopy mode when SQ is full.
     - Error

   * - `tx[i]_xsk_err`
     - Number of errors that occurred in XSK zerocopy mode such as if the data
       size is larger than the MTU size.
     - Error

   * - `tx[i]_xsk_cqes`
     - Number of CQEs processed in XSK zerocopy mode.
     - Acceleration

   * - `tx_tls_ctx`
     - Number of TLS TX HW offload contexts added to device for encryption.
     - Acceleration

   * - `tx_tls_del`
     - Number of TLS TX HW offload contexts removed from device (connection
       closed).
     - Acceleration

   * - `tx_tls_pool_alloc`
     - Number of times a unit of work is successfully allocated in the TLS HW
       offload pool.
     - Acceleration

   * - `tx_tls_pool_free`
     - Number of times a unit of work is freed in the TLS HW offload pool.
     - Acceleration

   * - `rx_tls_ctx`
     - Number of TLS RX HW offload contexts added to device for decryption.
     - Acceleration

   * - `rx_tls_del`
     - Number of TLS RX HW offload contexts deleted from device (connection has
       finished).
     - Acceleration

   * - `rx[i]_tls_decrypted_packets`
     - Number of successfully decrypted RX packets which were part of a TLS
       stream.
     - Acceleration

   * - `rx[i]_tls_decrypted_bytes`
     - Number of TLS payload bytes in RX packets which were successfully
       decrypted.
     - Acceleration

   * - `rx[i]_tls_resync_req_pkt`
     - Number of received TLS packets with a resync request.
     - Acceleration

   * - `rx[i]_tls_resync_req_start`
     - Number of times the TLS async resync request was started.
     - Acceleration

   * - `rx[i]_tls_resync_req_end`
     - Number of times the TLS async resync request properly ended with
       providing the HW tracked tcp-seq.
     - Acceleration

   * - `rx[i]_tls_resync_req_skip`
     - Number of times the TLS async resync request procedure was started but
       not properly ended.
     - Error

   * - `rx[i]_tls_resync_res_ok`
     - Number of times the TLS resync response call to the driver was
       successfully handled.
     - Acceleration

   * - `rx[i]_tls_resync_res_retry`
     - Number of times the TLS resync response call to the driver was
       reattempted when ICOSQ is full.
     - Error

   * - `rx[i]_tls_resync_res_skip`
     - Number of times the TLS resync response call to the driver was terminated
       unsuccessfully.
     - Error

   * - `rx[i]_tls_err`
     - Number of times when CQE TLS offload was problematic.
     - Error

   * - `tx[i]_tls_encrypted_packets`
     - The number of send packets that are TLS encrypted by the kernel.
     - Acceleration

   * - `tx[i]_tls_encrypted_bytes`
     - The number of send bytes that are TLS encrypted by the kernel.
     - Acceleration

   * - `tx[i]_tls_ooo`
     - Number of times out of order TLS SQE fragments were handled on ring i.
     - Acceleration

   * - `tx[i]_tls_dump_packets`
     - Number of TLS decrypted packets copied over from NIC over DMA.
     - Acceleration

   * - `tx[i]_tls_dump_bytes`
     - Number of TLS decrypted bytes copied over from NIC over DMA.
     - Acceleration

   * - `tx[i]_tls_resync_bytes`
     - Number of TLS bytes requested to be resynchronized in order to be
       decrypted.
     - Acceleration

   * - `tx[i]_tls_skip_no_sync_data`
     - Number of TLS send data that can safely be skipped / do not need to be
       decrypted.
     - Acceleration

   * - `tx[i]_tls_drop_no_sync_data`
     - Number of TLS send data that were dropped due to retransmission of TLS
       data.
     - Acceleration

   * - `ptp_cq[i]_abort`
     - Number of times a CQE has to be skipped in precision time protocol due to
       a skew between the port timestamp and CQE timestamp being greater than
       128 seconds.
     - Error

   * - `ptp_cq[i]_abort_abs_diff_ns`
     - Accumulation of time differences between the port timestamp and CQE
       timestamp when the difference is greater than 128 seconds in precision
       time protocol.
     - Error

.. [#ring_global] The corresponding ring and global counters do not share the
                  same name (i.e. do not follow the common naming scheme).
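
A script that derives the global counter name from the ring counter name has
to special-case the rows marked with [#ring_global]_. The Python sketch below
lists only the pairs given in the table above. The names
`GLOBAL_NAME_EXCEPTIONS` and `global_name` are hypothetical and chosen for
illustration.

```python
# Ring/global pairs taken from the rows marked [#ring_global]_ in the table.
GLOBAL_NAME_EXCEPTIONS = {
    "tx[i]_stopped": "tx_queue_stopped",
    "tx[i]_wake": "tx_queue_wake",
    "tx[i]_dropped": "tx_queue_dropped",
    "rx[i]_xdp_tx_cqes": "rx_xdp_tx_cqe",
}


def global_name(template):
    """Return the global counter name for a ring counter template."""
    # Fall back to the common scheme: remove the `[i]` index.
    return GLOBAL_NAME_EXCEPTIONS.get(template, template.replace("[i]", ""))


if __name__ == "__main__":
    for template in ("tx[i]_stopped", "rx[i]_packets"):
        print(template, "->", global_name(template))
```

Counters that follow the common scheme need no entry in the mapping.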

vPort Counters
--------------
Counters on the NIC port that is connected to an eSwitch.

.. flat-table:: vPort Counter Table
   :widths: 2 3 1

   * - Counter
     - Description
     - Type

   * - `rx_vport_unicast_packets`
     - Unicast packets received, steered to a port including Raw Ethernet
       QP/DPDK traffic, excluding RDMA traffic.
     - Informative

   * - `rx_vport_unicast_bytes`
     - Unicast bytes received, steered to a port including Raw Ethernet QP/DPDK
       traffic, excluding RDMA traffic.
     - Informative

   * - `tx_vport_unicast_packets`
     - Unicast packets transmitted, steered from a port including Raw Ethernet
       QP/DPDK traffic, excluding RDMA traffic.
     - Informative

   * - `tx_vport_unicast_bytes`
     - Unicast bytes transmitted, steered from a port including Raw Ethernet
       QP/DPDK traffic, excluding RDMA traffic.
     - Informative

   * - `rx_vport_multicast_packets`
     - Multicast packets received, steered to a port including Raw Ethernet
       QP/DPDK traffic, excluding RDMA traffic.
     - Informative

   * - `rx_vport_multicast_bytes`
     - Multicast bytes received, steered to a port including Raw Ethernet
       QP/DPDK traffic, excluding RDMA traffic.
     - Informative

   * - `tx_vport_multicast_packets`
     - Multicast packets transmitted, steered from a port including Raw Ethernet
       QP/DPDK traffic, excluding RDMA traffic.
     - Informative

   * - `tx_vport_multicast_bytes`
     - Multicast bytes transmitted, steered from a port including Raw Ethernet
       QP/DPDK traffic, excluding RDMA traffic.
     - Informative

   * - `rx_vport_broadcast_packets`
     - Broadcast packets received, steered to a port including Raw Ethernet
       QP/DPDK traffic, excluding RDMA traffic.
     - Informative

   * - `rx_vport_broadcast_bytes`
     - Broadcast bytes received, steered to a port including Raw Ethernet
       QP/DPDK traffic, excluding RDMA traffic.
     - Informative

   * - `tx_vport_broadcast_packets`
     - Broadcast packets transmitted, steered from a port including Raw Ethernet
       QP/DPDK traffic, excluding RDMA traffic.
     - Informative

   * - `tx_vport_broadcast_bytes`
     - Broadcast bytes transmitted, steered from a port including Raw Ethernet
       QP/DPDK traffic, excluding RDMA traffic.
     - Informative

   * - `rx_vport_rdma_unicast_packets`
     - RDMA unicast packets received, steered to a port (counter counts
       RoCE/UD/RC traffic) [#accel]_.
     - Acceleration

   * - `rx_vport_rdma_unicast_bytes`
     - RDMA unicast bytes received, steered to a port (counter counts
       RoCE/UD/RC traffic) [#accel]_.
     - Acceleration

   * - `tx_vport_rdma_unicast_packets`
     - RDMA unicast packets transmitted, steered from a port (counter counts
       RoCE/UD/RC traffic) [#accel]_.
     - Acceleration

   * - `tx_vport_rdma_unicast_bytes`
     - RDMA unicast bytes transmitted, steered from a port (counter counts
       RoCE/UD/RC traffic) [#accel]_.
     - Acceleration

   * - `rx_vport_rdma_multicast_packets`
     - RDMA multicast packets received, steered to a port (counter counts
       RoCE/UD/RC traffic) [#accel]_.
     - Acceleration

   * - `rx_vport_rdma_multicast_bytes`
     - RDMA multicast bytes received, steered to a port (counter counts
       RoCE/UD/RC traffic) [#accel]_.
     - Acceleration

   * - `tx_vport_rdma_multicast_packets`
     - RDMA multicast packets transmitted, steered from a port (counter counts
       RoCE/UD/RC traffic) [#accel]_.
     - Acceleration

   * - `tx_vport_rdma_multicast_bytes`
     - RDMA multicast bytes transmitted, steered from a port (counter counts
       RoCE/UD/RC traffic) [#accel]_.
     - Acceleration

   * - `rx_steer_missed_packets`
     - Number of packets that were received by the NIC but discarded because
       they did not match any flow in the NIC flow table.
     - Error

   * - `rx_packets`
     - Representor only: packets received, that were handled by the hypervisor.
     - Informative

   * - `rx_bytes`
     - Representor only: bytes received, that were handled by the hypervisor.
     - Informative

   * - `tx_packets`
     - Representor only: packets transmitted, that were handled by the
       hypervisor.
     - Informative

   * - `tx_bytes`
     - Representor only: bytes transmitted, that were handled by the hypervisor.
     - Informative

   * - `dev_internal_queue_oob`
     - The number of dropped packets due to lack of receive WQEs for an internal
       device RQ.
     - Error
852
853Physical Port Counters
854----------------------
855The physical port counters are the counters on the external port connecting the
856adapter to the network. This measuring point holds information on standardized
857counters like IEEE 802.3, RFC 2863, RFC 2819, RFC 3635 and additional counters
858like flow control, FEC and more.
859
860.. flat-table:: Physical Port Counter Table
861   :widths: 2 3 1
862
863   * - Counter
864     - Description
865     - Type
866
867   * - `rx_packets_phy`
868     - The number of packets received on the physical port. This counter does not
869       include packets that were discarded due to FCS, frame size and similar
870       errors.
871     - Informative
872
873   * - `tx_packets_phy`
874     - The number of packets transmitted on the physical port.
875     - Informative
876
877   * - `rx_bytes_phy`
878     - The number of bytes received on the physical port, including Ethernet
879       header and FCS.
880     - Informative
881
882   * - `tx_bytes_phy`
883     - The number of bytes transmitted on the physical port.
884     - Informative
885
886   * - `rx_multicast_phy`
887     - The number of multicast packets received on the physical port.
888     - Informative
889
890   * - `tx_multicast_phy`
891     - The number of multicast packets transmitted on the physical port.
892     - Informative
893
894   * - `rx_broadcast_phy`
895     - The number of broadcast packets received on the physical port.
896     - Informative
897
898   * - `tx_broadcast_phy`
899     - The number of broadcast packets transmitted on the physical port.
900     - Informative
901
902   * - `rx_crc_errors_phy`
903     - The number of dropped received packets due to FCS (Frame Check Sequence)
904       error on the physical port. If this counter is increasing at a high rate,
905       check the link quality using the `rx_symbol_err_phy` and
906       `rx_corrected_bits_phy` counters below.
907     - Error
908
909   * - `rx_in_range_len_errors_phy`
910     - The number of received packets dropped due to length/type errors on a
911       physical port.
912     - Error
913
914   * - `rx_out_of_range_len_phy`
915     - The number of received packets dropped due to length greater than allowed
916       on a physical port. If this counter is increasing, it implies that the
917       peer connected to the adapter has a larger MTU configured. Using the same
918       MTU configuration should resolve this issue.
919     - Error
920
921   * - `rx_oversize_pkts_phy`
922     - The number of received packets dropped due to a length which exceeds the
923       MTU size on a physical port. If this counter is increasing, it implies
924       that the peer connected to the adapter has a larger MTU configured. Using
925       the same MTU configuration should resolve this issue.
926     - Error
927
928   * - `rx_symbol_err_phy`
929     - The number of received packets dropped due to physical coding errors
930       (symbol errors) on a physical port.
931     - Error
932
933   * - `rx_mac_control_phy`
934     - The number of MAC control packets received on the physical port.
935     - Informative
936
937   * - `tx_mac_control_phy`
938     - The number of MAC control packets transmitted on the physical port.
939     - Informative
940
941   * - `rx_pause_ctrl_phy`
942     - The number of link layer pause packets received on a physical port. If
943       this counter is increasing, it implies that the network is congested and
944       cannot absorb the traffic coming from the adapter.
945     - Informative
946
947   * - `tx_pause_ctrl_phy`
948     - The number of link layer pause packets transmitted on a physical port. If
949       this counter is increasing, it implies that the NIC is congested and
950       cannot absorb the traffic coming from the network.
951     - Informative
952
953   * - `rx_unsupported_op_phy`
954     - The number of MAC control packets received with unsupported opcode on a
955       physical port.
956     - Error
957
958   * - `rx_discards_phy`
959     - The number of received packets dropped due to lack of buffers on a
960       physical port. If this counter is increasing, it implies that the adapter
961       is congested and cannot absorb the traffic coming from the network.
962     - Error
963
964   * - `tx_discards_phy`
965     - The number of packets which were discarded on transmission, even though
966       no errors were detected. The drop might occur due to the link being down,
967       head of line drop, pause from the network, etc.
968     - Error
969
970   * - `tx_errors_phy`
971     - The number of transmitted packets dropped due to a length which exceeds
972       the MTU size on a physical port.
973     - Error
974
975   * - `rx_undersize_pkts_phy`
976     - The number of received packets dropped due to a length which is shorter
977       than 64 bytes on a physical port. If this counter is increasing, it
978       implies that the peer connected to the adapter has a non-standard MTU
979       configured or a malformed packet has arrived.
980     - Error
981
982   * - `rx_fragments_phy`
983     - The number of received packets dropped due to a length which is shorter
984       than 64 bytes and an FCS error on a physical port. If this counter is
985       increasing, it implies that the peer connected to the adapter has a
986       non-standard MTU configured.
987     - Error
988
989   * - `rx_jabbers_phy`
990     - The number of received packets dropped due to a length which is longer
991       than 64 bytes and an FCS error on a physical port.
992     - Error
993
994   * - `rx_64_bytes_phy`
995     - The number of packets received on the physical port with size of 64 bytes.
996     - Informative
997
998   * - `rx_65_to_127_bytes_phy`
999     - The number of packets received on the physical port with size of 65 to
1000       127 bytes.
1001     - Informative
1002
1003   * - `rx_128_to_255_bytes_phy`
1004     - The number of packets received on the physical port with size of 128 to
1005       255 bytes.
1006     - Informative
1007
1008   * - `rx_256_to_511_bytes_phy`
1009     - The number of packets received on the physical port with size of 256 to
1010       511 bytes.
1011     - Informative
1012
1013   * - `rx_512_to_1023_bytes_phy`
1014     - The number of packets received on the physical port with size of 512 to
1015       1023 bytes.
1016     - Informative
1017
1018   * - `rx_1024_to_1518_bytes_phy`
1019     - The number of packets received on the physical port with size of 1024 to
1020       1518 bytes.
1021     - Informative
1022
1023   * - `rx_1519_to_2047_bytes_phy`
1024     - The number of packets received on the physical port with size of 1519 to
1025       2047 bytes.
1026     - Informative
1027
1028   * - `rx_2048_to_4095_bytes_phy`
1029     - The number of packets received on the physical port with size of 2048 to
1030       4095 bytes.
1031     - Informative
1032
1033   * - `rx_4096_to_8191_bytes_phy`
1034     - The number of packets received on the physical port with size of 4096 to
1035       8191 bytes.
1036     - Informative
1037
1038   * - `rx_8192_to_10239_bytes_phy`
1039     - The number of packets received on the physical port with size of 8192 to
1040       10239 bytes.
1041     - Informative
1042
1043   * - `link_down_events_phy`
1044     - The number of times the link operative state changed to down. If this
1045       counter is increasing, it may indicate port flapping. You may need to
1046       replace the cable/transceiver.
1047     - Error
1048
1049   * - `rx_out_of_buffer`
1050     - Number of times the receive queue had no software buffers allocated for
1051       adapter's incoming traffic.
1052     - Error
1053
1054   * - `module_bus_stuck`
1055     - The number of times that module's I\ :sup:`2`\C bus (data or clock)
1056       short-wire was detected. You may need to replace the cable/transceiver.
1057     - Error
1058
1059   * - `module_high_temp`
1060     - The number of times that the module temperature was too high. If this
1061       issue persists, you may need to check the ambient temperature or replace
1062       the cable/transceiver module.
1063     - Error
1064
1065   * - `module_bad_shorted`
1066     - The number of times that the module cables were shorted. You may need to
1067       replace the cable/transceiver module.
1068     - Error
1069
1070   * - `module_unplug`
1071     - The number of times that the module was ejected.
1072     - Informative
1073
1074   * - `rx_buffer_passed_thres_phy`
1075     - The number of events where the port receive buffer was over 85% full.
1076     - Informative
1077
1078   * - `tx_pause_storm_warning_events`
1079     - The number of times the device was sending pauses for a long period of
1080       time.
1081     - Informative
1082
1083   * - `tx_pause_storm_error_events`
1084     - The number of times the device was sending pauses for a long period of
1085       time, reaching the timeout and disabling transmission of pause frames.
1086       During the period where pause frames were disabled, drops could have
1087       occurred.
1088     - Error
1089
1090   * - `rx[i]_buff_alloc_err`
1091     - Failed to allocate a buffer for a received packet (or SKB) on ring i.
1092     - Error
1093
1094   * - `rx_bits_phy`
1095     - This counter provides information on the total amount of traffic that
1096       could have been received and can be used as a guideline to measure the
1097       ratio of errored traffic in `rx_pcs_symbol_err_phy` and
1098       `rx_corrected_bits_phy`.
1099     - Informative
1100
1101   * - `rx_pcs_symbol_err_phy`
1102     - This counter counts the number of symbol errors that were not corrected
1103       by the FEC algorithm, or that occurred while FEC was not active on this
1104       interface. If this counter is increasing, it implies that the link
1105       between the NIC and the network is suffering from high BER, and that
1106       traffic is lost. You may need to replace the cable/transceiver. The error
1107       rate is the number of `rx_pcs_symbol_err_phy` divided by the number of
1108       `rx_bits_phy` over a specific time frame.
1109     - Error
1110
1111   * - `rx_corrected_bits_phy`
1112     - The number of corrected bits on this port according to active FEC
1113       (RS/FC). If this counter is increasing, it implies that the link between
1114       the NIC and the network is suffering from high BER. The corrected bit
1115       rate is the number of `rx_corrected_bits_phy` divided by the number of
1116       `rx_bits_phy` over a specific time frame.
1117     - Error
1118
1119   * - `rx_err_lane_[l]_phy`
1120     - This counter counts the number of physical raw errors per lane, where l
1121       is the lane index. Errors are counted before FEC corrections. If this
1122       counter is increasing, it implies that the link between the NIC and the
1123       network is suffering from high BER, and that traffic might be lost. You
1124       may need to replace the cable/transceiver. Check this counter together
1125       with `rx_corrected_bits_phy`.
1126     - Error
1127
1128   * - `rx_global_pause`
1129     - The number of pause packets received on the physical port. If this
1130       counter is increasing, it implies that the network is congested and
1131       cannot absorb the traffic coming from the adapter. Note: This counter is
1132       only enabled when global pause mode is enabled.
1133     - Informative
1134
1135   * - `rx_global_pause_duration`
1136     - The duration of pause received (in microseconds) on the physical port.
1137       The counter represents the time the port did not send any traffic. If
1138       this counter is increasing, it implies that the network is congested and
1139       cannot absorb the traffic coming from the adapter. Note: This counter is
1140       only enabled when global pause mode is enabled.
1141     - Informative
1142
1143   * - `tx_global_pause`
1144     - The number of pause packets transmitted on a physical port. If this
1145       counter is increasing, it implies that the adapter is congested and
1146       cannot absorb the traffic coming from the network. Note: This counter is
1147       only enabled when global pause mode is enabled.
1148     - Informative
1149
1150   * - `tx_global_pause_duration`
1151     - The duration of pause transmitted (in microseconds) on the physical port.
1152       Note: This counter is only enabled when global pause mode is enabled.
1153     - Informative
1154
1155   * - `rx_global_pause_transition`
1156     - The number of times a transition from Xoff to Xon on the physical port
1157       has occurred. Note: This counter is only enabled when global pause mode
1158       is enabled.
1159     - Informative
1160
1161   * - `rx_if_down_packets`
1162     - The number of received packets that were dropped due to interface down.
1163     - Informative
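
The error-rate descriptions for `rx_pcs_symbol_err_phy` and
`rx_corrected_bits_phy` both divide a counter delta by the `rx_bits_phy` delta
over the same time frame. The following is a minimal sketch of that arithmetic,
assuming the counters are read as the text output of ``ethtool -S <ifname>``
("name: value" lines). The helper names `parse_ethtool_stats` and `rate`, and
the sample values, are hypothetical and only illustrate the calculation.

```python
import re


def parse_ethtool_stats(text):
    """Parse "name: value" lines (as printed by `ethtool -S`) into a dict."""
    stats = {}
    for line in text.splitlines():
        match = re.match(r"^\s*([^:\s]+):\s*(\d+)\s*$", line)
        if match:
            stats[match.group(1)] = int(match.group(2))
    return stats


def rate(before, after, errors, total="rx_bits_phy"):
    """Return delta(errors) / delta(total) between two snapshots, or None."""
    delta_total = after[total] - before[total]
    if delta_total <= 0:
        return None  # no traffic observed, or the counters were reset
    return (after[errors] - before[errors]) / delta_total


# Hypothetical snapshots taken some time apart (illustrative values only).
snap_t0 = parse_ethtool_stats("""NIC statistics:
     rx_bits_phy: 1000000
     rx_pcs_symbol_err_phy: 2
     rx_corrected_bits_phy: 10
""")
snap_t1 = parse_ethtool_stats("""NIC statistics:
     rx_bits_phy: 3000000
     rx_pcs_symbol_err_phy: 6
     rx_corrected_bits_phy: 210
""")

symbol_error_rate = rate(snap_t0, snap_t1, "rx_pcs_symbol_err_phy")
corrected_bit_rate = rate(snap_t0, snap_t1, "rx_corrected_bits_phy")
print(symbol_error_rate, corrected_bit_rate)
```

Working on deltas between two snapshots, rather than on absolute values, keeps
the ratio tied to the chosen time frame, since the counters accumulate from
whenever they were last reset.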
1164
1165Priority Port Counters
1166----------------------
1167The following counters are physical port counters that are counted per L2
1168priority (0-7).
1169
1170**Note:** `p` in the counter name represents the priority.
1171
1172.. flat-table:: Priority Port Counter Table
1173   :widths: 2 3 1
1174
1175   * - Counter
1176     - Description
1177     - Type
1178
1179   * - `rx_prio[p]_bytes`
1180     - The number of bytes received with priority p on the physical port.
1181     - Informative
1182
1183   * - `rx_prio[p]_packets`
1184     - The number of packets received with priority p on the physical port.
1185     - Informative
1186
1187   * - `tx_prio[p]_bytes`
1188     - The number of bytes transmitted on priority p on the physical port.
1189     - Informative
1190
1191   * - `tx_prio[p]_packets`
1192     - The number of packets transmitted on priority p on the physical port.
1193     - Informative
1194
1195   * - `rx_prio[p]_pause`
1196     - The number of pause packets received with priority p on a physical port.
1197       If this counter is increasing, it implies that the network is congested
1198       and cannot absorb the traffic coming from the adapter. Note: This counter
1199       is available only if PFC was enabled on priority p.
1200     - Informative
1201
1202   * - `rx_prio[p]_pause_duration`
1203     - The duration of pause received (in microseconds) on priority p on the
1204       physical port. The counter represents the time the port did not send any
1205       traffic on this priority. If this counter is increasing, it implies that
1206       the network is congested and cannot absorb the traffic coming from the
1207       adapter. Note: This counter is available only if PFC was enabled on
1208       priority p.
1209     - Informative
1210
1211   * - `rx_prio[p]_pause_transition`
1212     - The number of times a transition from Xoff to Xon on priority p on the
1213       physical port has occurred. Note: This counter is available only if PFC
1214       was enabled on priority p.
1215     - Informative
1216
1217   * - `tx_prio[p]_pause`
1218     - The number of pause packets transmitted on priority p on a physical port.
1219       If this counter is increasing, it implies that the adapter is congested
1220       and cannot absorb the traffic coming from the network. Note: This counter
1221       is available only if PFC was enabled on priority p.
1222     - Informative
1223
1224   * - `tx_prio[p]_pause_duration`
1225     - The duration of pause transmitted (in microseconds) on priority p on the
1226       physical port. Note: This counter is available only if PFC was enabled on
1227       priority p.
1228     - Informative
1229
1230   * - `rx_prio[p]_buf_discard`
1231     - The number of packets discarded by the device due to lack of per-host
1232       receive buffers.
1233     - Informative
1234
1235   * - `rx_prio[p]_cong_discard`
1236     - The number of packets discarded by the device due to per-host congestion.
1237     - Informative
1238
1239   * - `rx_prio[p]_marked`
1240     - The number of packets ECN-marked by the device due to per-host congestion.
1241     - Informative
1242
1243   * - `rx_prio[p]_discards`
1244     - The number of packets discarded by the device due to lack of receive buffers.
1245     - Informative
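
As a rough illustration of the `[p]` placeholder, the sketch below expands a
counter template for priorities 0-7 and reports the priorities whose pause
counter grew between two snapshots. It assumes that the concrete counter names
replace `[p]` with the priority digit (for example `rx_prio0_pause`) and that
the snapshots are plain dictionaries of counter values. The function names are
hypothetical.

```python
def prio_counter_names(template, priorities=range(8)):
    """Expand a template such as "rx_prio[p]_pause" for each priority p."""
    return [template.replace("[p]", str(p)) for p in priorities]


def paused_priorities(before, after, template="rx_prio[p]_pause"):
    """Return the priorities whose counter grew between two snapshots."""
    grew = []
    for p, name in enumerate(prio_counter_names(template)):
        # Counters missing from a snapshot are treated as zero.
        if after.get(name, 0) > before.get(name, 0):
            grew.append(p)
    return grew


# Hypothetical snapshots (illustrative values only).
before = {"rx_prio3_pause": 1, "rx_prio5_pause": 7}
after = {"rx_prio3_pause": 5, "rx_prio5_pause": 7}
print(paused_priorities(before, after))
```

A growing `rx_prio[p]_pause` value points at congestion in the network for
that priority, while a growing `tx_prio[p]_pause` value (pass it as the
`template` argument) points at congestion in the adapter.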
1246
1247Device Counters
1248---------------
1249.. flat-table:: Device Counter Table
1250   :widths: 2 3 1
1251
1252   * - Counter
1253     - Description
1254     - Type
1255
1256   * - `rx_pci_signal_integrity`
1257     - Counts physical layer PCIe signal integrity errors, the number of
1258       transitions to recovery due to Framing errors and CRC (dlp and tlp). If
1259       this counter is rising, try moving the adapter card to a different slot
1260       to rule out a bad PCI slot. Validate that you are running with the latest
1261       firmware available and latest server BIOS version.
1262     - Error
1263
1264   * - `tx_pci_signal_integrity`
1265     - Counts physical layer PCIe signal integrity errors, the number of
1266       transitions to recovery initiated by the other side (moving to recovery
1267       due to getting TS/EIEOS). If this counter is rising, try moving the
1268       adapter card to a different slot to rule out a bad PCI slot. Validate
1269       that you are running with the latest firmware available and latest server
1270       BIOS version.
1271     - Error
1272
1273   * - `outbound_pci_buffer_overflow`
1274     - The number of packets dropped due to PCI buffer overflow. If this counter
1275       is rising at a high rate, it might indicate that the receive traffic rate
1276       for a host exceeds the PCIe bus bandwidth, so congestion occurs.
1277     - Informative
1278
1279   * - `outbound_pci_stalled_rd`
1280     - The percentage (in the range 0...100) of time within the last second that
1281       the NIC had outbound non-posted read requests but could not perform the
1282       operation due to insufficient posted credits.
1283     - Informative
1284
1285   * - `outbound_pci_stalled_wr`
1286     - The percentage (in the range 0...100) of time within the last second that
1287       the NIC had outbound posted write requests but could not perform the
1288       operation due to insufficient posted credits.
1289     - Informative
1290
1291   * - `outbound_pci_stalled_rd_events`
1292     - The number of seconds where `outbound_pci_stalled_rd` was above 30%.
1293     - Informative
1294
1295   * - `outbound_pci_stalled_wr_events`
1296     - The number of seconds where `outbound_pci_stalled_wr` was above 30%.
1297     - Informative
1298
1299   * - `dev_out_of_buffer`
1300     - The number of times the device-owned queue did not have enough buffers
1301       allocated.
1302     - Error
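
The relationship between `outbound_pci_stalled_rd` and
`outbound_pci_stalled_rd_events` (and likewise for the `wr` pair) can be
sketched as follows. This only mirrors the description in the table, seconds in
which the stall percentage was above 30%, and is not the device implementation.
The function name and the sample values are hypothetical.

```python
def count_stall_events(samples, threshold=30):
    """Count one-second samples whose stall percentage is above threshold."""
    for pct in samples:
        if not 0 <= pct <= 100:
            raise ValueError(f"stall percentage out of range: {pct}")
    return sum(1 for pct in samples if pct > threshold)


# Hypothetical per-second readings of outbound_pci_stalled_rd (percent).
readings = [0, 31, 30, 100, 12]
print(count_stall_events(readings))
```

Note that a reading of exactly 30 is not counted, because the description says
"above 30%".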
1303