Revision tags: v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45, v6.1.44, v6.1.43
# 34a79876 | 01-Aug-2023 | Dragos Tatulea <dtatulea@nvidia.com>
net/mlx5e: XDP, Fix fifo overrun on XDP_REDIRECT
Before this fix, running high rate traffic through XDP_REDIRECT with multibuf could overrun the fifo used to release the xdp frames after tx completion. This resulted in corrupted data being consumed on the free side.
The culprit was a miscalculation of the fifo size: the maximum ratio of fifo entries to data segments was incorrect. This ratio is used to calculate the maximum fifo size for a full SQ where each packet uses the worst-case number of entries in the fifo.
This patch fixes the formula and names the constant. It also makes sure that future values will use a power of 2 number of entries for the fifo mask to work.
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Fixes: 3f734b8c594b ("net/mlx5e: XDP, Use multiple single-entry objects in xdpi_fifo")
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
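To illustrate the sizing rule, a minimal self-contained C sketch of a mask-based release fifo follows; the names and the entries-per-segment ratio are hypothetical, not the driver's actual constants. The point is that the entry count must cover the worst case for a full SQ and must be a power of 2 for the mask to wrap correctly:

    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical worst-case ratio: fifo entries consumed per data segment. */
    #define ENTRIES_PER_DS 4

    struct xdpi_fifo {
        uint64_t *xi;
        uint32_t  pc, cc; /* producer / consumer counters, free-running */
        uint32_t  mask;   /* entries - 1; only valid if entries is a power of 2 */
    };

    static void fifo_init(struct xdpi_fifo *f, uint32_t max_ds)
    {
        /* Size for a full SQ where every packet uses the worst-case number
         * of fifo entries, then round up to a power of 2 so that
         * "counter & mask" wraps correctly. */
        uint32_t entries = 1;
        while (entries < max_ds * ENTRIES_PER_DS)
            entries <<= 1;
        assert((entries & (entries - 1)) == 0); /* power of 2 by construction */
        f->xi = calloc(entries, sizeof(*f->xi));
        f->pc = f->cc = 0;
        f->mask = entries - 1;
    }

    static void fifo_push(struct xdpi_fifo *f, uint64_t v)
    {
        f->xi[f->pc++ & f->mask] = v; /* overruns silently if sized too small */
    }

    static uint64_t fifo_pop(struct xdpi_fifo *f)
    {
        return f->xi[f->cc++ & f->mask];
    }

    int main(void)
    {
        struct xdpi_fifo f;
        fifo_init(&f, 1024); /* e.g. an SQ holding 1024 data segments */
        fifo_push(&f, 42);
        printf("%llu\n", (unsigned long long)fifo_pop(&f));
        return 0;
    }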
Revision tags: v6.1.42, v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35, v6.1.34, v6.1.33, v6.1.32, v6.1.31, v6.1.30, v6.1.29, v6.1.28, v6.1.27, v6.1.26, v6.3, v6.1.25
# 3f734b8c | 17-Apr-2023 | Tariq Toukan <tariqt@nvidia.com>
net/mlx5e: XDP, Use multiple single-entry objects in xdpi_fifo
Here we fix the current abuse of wi->num_pkts, which was used to indicate multiple xdpi entries in the xdpi_fifo.
Instead, reduce mlx5e_xdp_info to the size of a single field, making it a union of unions. Per packet, use as many instances as needed to provide the information needed at the time of completion.
The sequence of xdpi instances pushed is well defined, derived from the xmit_mode.
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
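A hedged model of the single-field-slot idea (hypothetical names and modes, not the driver's actual layout): the first slot of each sequence tags the xmit mode, and the consumer pops exactly the slots that the mode implies:

    #include <stdio.h>

    /* Every fifo slot is exactly one pointer-sized union; a packet pushes
     * a well-defined sequence of slots derived from its xmit mode. */
    enum xmit_mode { MODE_FRAME = 1, MODE_PAGE = 2 };

    union xdp_info {
        enum xmit_mode mode; /* slot 0 of each sequence: the mode tag */
        void *frame;         /* MODE_FRAME: the frame to free on completion */
        void *page;          /* MODE_PAGE: one slot per page to recycle */
    };

    #define FIFO_SZ 16 /* power of 2, so index & (FIFO_SZ - 1) wraps */
    static union xdp_info fifo[FIFO_SZ];
    static unsigned pc, cc;

    static void push(union xdp_info v) { fifo[pc++ & (FIFO_SZ - 1)] = v; }
    static union xdp_info pop(void)    { return fifo[cc++ & (FIFO_SZ - 1)]; }

    int main(void)
    {
        static int page_a, page_b; /* stand-ins for two pages */

        /* Transmit side: mode tag first, then one slot per resource. */
        push((union xdp_info){ .mode = MODE_PAGE });
        push((union xdp_info){ .page = &page_a });
        push((union xdp_info){ .page = &page_b });

        /* Completion side: pop the mode, then the slots it implies. */
        if (pop().mode == MODE_PAGE) {
            union xdp_info a = pop(), b = pop();
            printf("recycle %p then %p\n", a.page, b.page);
        }
        return 0;
    }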
# eb9b9fdc | 17-Apr-2023 | Tariq Toukan <tariqt@nvidia.com>
net/mlx5e: Introduce extended version for mlx5e_xmit_data
Introduce struct mlx5e_xmit_data_frags to be used for non-linear xmit buffers. Let it include sinfo pointer.
Take one bit from the len field to indicate if the descriptor has fragments and can be cast up into the extended version.
Zero-init to make sure has_frags, and potentially future fields, are zero when not explicitly assigned.
Another field will be added in a downstream patch to indicate and point to dma addresses of the different frags, for redirect-in requests.
This simplifies the parameters of the mlx5e_xmit_xdp_frame/mlx5e_xmit_xdp_frame_mpwqe functions.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
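A hedged sketch of the cast-up pattern under simplified, hypothetical field layouts; the flag bit stolen from the length word tells the consumer whether the descriptor may be cast to the extended type:

    #include <stddef.h>
    #include <stdio.h>

    /* One bit of the length word marks descriptors that are really the
     * extended, fragmented kind. Field layout is illustrative. */
    struct xmit_data {
        void    *data;
        unsigned len : 31;
        unsigned has_frags : 1;
    };

    struct xmit_data_frags {
        struct xmit_data xd;    /* must be first so a cast-up is valid */
        void            *sinfo; /* e.g. shared info describing the frags */
    };

    static void xmit(struct xmit_data *xd)
    {
        if (xd->has_frags) {
            /* Safe only because xmit_data is the first member. */
            struct xmit_data_frags *xdf = (struct xmit_data_frags *)xd;
            printf("frags descriptor, sinfo=%p\n", xdf->sinfo);
        } else {
            printf("linear descriptor, len=%u\n", (unsigned)xd->len);
        }
    }

    int main(void)
    {
        _Static_assert(offsetof(struct xmit_data_frags, xd) == 0,
                       "cast-up requires the base at offset 0");

        /* Zero-init guarantees has_frags (and future bits) default to 0. */
        struct xmit_data linear = { .data = NULL, .len = 64 };
        struct xmit_data_frags frag = { .xd = { .len = 256, .has_frags = 1 },
                                        .sinfo = &linear };
        xmit(&linear);
        xmit(&frag.xd);
        return 0;
    }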
# aebc62d3 | 17-Apr-2023 | Tariq Toukan <tariqt@nvidia.com>
net/mlx5e: Move XDP struct and enum to XDP header
Move struct mlx5e_xdp_info and enum mlx5e_xdp_xmit_mode from the generic en.h to the XDP header, where they belong.
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Revision tags: v6.1.24, v6.1.23, v6.1.22, v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2
# 9da5294e | 15-Feb-2023 | Tariq Toukan <tariqt@nvidia.com>
net/mlx5e: Remove redundant page argument in mlx5e_xdp_handle()
Remove the page parameter, it can be derived from the xdp_buff member of mlx5e_xdp_buff.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Revision tags: v6.1.12, v6.1.11, v6.1.10, v6.1.9, v6.1.8
# bc8d405b | 19-Jan-2023 | Toke Høiland-Jørgensen <toke@redhat.com>
net/mlx5e: Support RX XDP metadata
Support RX hash and timestamp metadata kfuncs. We need to pass in the cqe pointer to the mlx5e_skb_from* functions so it can be retrieved from the XDP ctx to do this.
Cc: Tariq Toukan <tariqt@nvidia.com>
Cc: Saeed Mahameed <saeedm@nvidia.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Anatoly Burakov <anatoly.burakov@intel.com>
Cc: Alexander Lobakin <alexandr.lobakin@intel.com>
Cc: Magnus Karlsson <magnus.karlsson@gmail.com>
Cc: Maryam Tahhan <mtahhan@redhat.com>
Cc: xdp-hints@xdp-project.net
Cc: netdev@vger.kernel.org
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20230119221536.3349901-17-sdf@google.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
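A hedged userspace model of how such getters can work once the CQE pointer travels with the xdp_buff (see the wrapper commit below); the types and names are simplified stand-ins, not the kernel's actual kfunc plumbing:

    #include <stdint.h>
    #include <stdio.h>

    /* Simplified stand-ins for the HW completion entry and the xdp_buff. */
    struct cqe { uint32_t rss_hash; uint64_t hw_timestamp; };
    struct xdp_buff { void *data; void *data_end; };

    /* Driver wrapper: the cqe rides along with the xdp_buff. */
    struct drv_xdp_buff {
        struct xdp_buff xdp; /* must stay first for the cast below */
        const struct cqe *cqe;
    };

    /* Model of the metadata getters: given the ctx (really the wrapped
     * xdp_buff), recover the wrapper and read fields out of the CQE. */
    static int rx_hash(const struct xdp_buff *ctx, uint32_t *hash)
    {
        const struct drv_xdp_buff *b = (const struct drv_xdp_buff *)ctx;
        *hash = b->cqe->rss_hash;
        return 0;
    }

    static int rx_timestamp(const struct xdp_buff *ctx, uint64_t *ts)
    {
        const struct drv_xdp_buff *b = (const struct drv_xdp_buff *)ctx;
        *ts = b->cqe->hw_timestamp;
        return 0;
    }

    int main(void)
    {
        struct cqe c = { .rss_hash = 0xdeadbeef, .hw_timestamp = 123456789 };
        struct drv_xdp_buff b = { .cqe = &c };
        uint32_t h; uint64_t t;
        rx_hash(&b.xdp, &h);
        rx_timestamp(&b.xdp, &t);
        printf("hash=%#x ts=%llu\n", (unsigned)h, (unsigned long long)t);
        return 0;
    }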
# 384a13ca | 19-Jan-2023 | Toke Høiland-Jørgensen <toke@redhat.com>
net/mlx5e: Introduce wrapper for xdp_buff
Preparation for implementing HW metadata kfuncs. No functional change.
Cc: Tariq Toukan <tariqt@nvidia.com>
Cc: Saeed Mahameed <saeedm@nvidia.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Anatoly Burakov <anatoly.burakov@intel.com>
Cc: Alexander Lobakin <alexandr.lobakin@intel.com>
Cc: Magnus Karlsson <magnus.karlsson@gmail.com>
Cc: Maryam Tahhan <mtahhan@redhat.com>
Cc: xdp-hints@xdp-project.net
Cc: netdev@vger.kernel.org
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20230119221536.3349901-16-sdf@google.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
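A minimal sketch of the wrapper pattern, assuming a hypothetical driver-private field; container_of-style recovery works regardless of where the embedded xdp_buff sits in the wrapper:

    #include <stddef.h>
    #include <stdio.h>

    #define container_of(ptr, type, member) \
        ((type *)((char *)(ptr) - offsetof(type, member)))

    struct xdp_buff { void *data; void *data_end; };

    /* Hypothetical wrapper shape: extra driver state travels with the
     * buff without changing any XDP-facing interfaces. */
    struct drv_xdp_buff {
        int             rq_ix; /* example driver-private field */
        struct xdp_buff xdp;   /* the buff handed to the XDP program */
    };

    static void on_completion(struct xdp_buff *xdp)
    {
        struct drv_xdp_buff *b = container_of(xdp, struct drv_xdp_buff, xdp);
        printf("back on rq %d\n", b->rq_ix);
    }

    int main(void)
    {
        struct drv_xdp_buff b = { .rq_ix = 3 };
        on_completion(&b.xdp); /* prints: back on rq 3 */
        return 0;
    }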
Revision tags: v6.1.7, v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12, v6.0.11, v6.0.10, v5.15.80, v6.0.9, v5.15.79, v6.0.8, v5.15.78, v6.0.7, v5.15.77, v5.15.76, v6.0.6, v6.0.5, v5.15.75, v6.0.4, v6.0.3, v6.0.2, v5.15.74, v5.15.73, v6.0.1, v5.15.72, v6.0, v5.15.71
# e3c4c496 | 27-Sep-2022 | Maxim Mikityanskiy <maximmi@nvidia.com>
net/mlx5e: Fix a typo in mlx5e_xdp_mpwqe_is_full
Fix a typo in the function name: mpqwe -> mpwqe (stands for multi-packet work queue element).
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Revision tags: v5.15.70, v5.15.69, v5.15.68, v5.15.67, v5.15.66, v5.15.65, v5.15.64, v5.15.63, v5.15.62, v5.15.61, v5.15.60, v5.15.59, v5.19, v5.15.58, v5.15.57, v5.15.56, v5.15.55, v5.15.54, v5.15.53, v5.15.52, v5.15.51, v5.15.50, v5.15.49, v5.15.48, v5.15.47, v5.15.46, v5.15.45, v5.15.44, v5.15.43, v5.15.42, v5.18, v5.15.41, v5.15.40, v5.15.39, v5.15.38, v5.15.37, v5.15.36, v5.15.35, v5.15.34, v5.15.33, v5.15.32, v5.15.31, v5.17, v5.15.30, v5.15.29, v5.15.28, v5.15.27, v5.15.26, v5.15.25, v5.15.24
# 08c34e95 | 15-Feb-2022 | Maxim Mikityanskiy <maximmi@nvidia.com>
net/mlx5e: Remove MLX5E_XDP_TX_DS_COUNT
After introducing multi-buffer XDP_TX, the MLX5E_XDP_TX_DS_COUNT define became misleading. It's no longer the DS count of an XDP_TX WQE, since such a WQE can now be longer because of fragments.
As this define is only used at one place in mlx5e_open_xdpsq(), it's also not very useful anymore. This commit removes the define and puts the calculation of ds_count for prefilled single-fragment WQEs inline.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Revision tags: v5.15.23, v5.15.22, v5.15.21, v5.15.20, v5.15.19
# 39a1665d | 31-Jan-2022 | Maxim Mikityanskiy <maximmi@nvidia.com>
net/mlx5e: Implement sending multi buffer XDP frames
xmit_xdp_frame is extended to support sending fragmented XDP frames. The next commit will start using this functionality.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Revision tags: v5.15.18
# 49529a17 | 28-Jan-2022 | Maxim Mikityanskiy <maximmi@nvidia.com>
net/mlx5e: Move mlx5e_xdpi_fifo_push out of xmit_xdp_frame
The implementations of xmit_xdp_frame get the xdpi parameter of type struct mlx5e_xdp_info for the sole purpose of calling mlx5e_xdpi_fifo_push() on success.
This commit moves this call outside of xmit_xdp_frame, shifting this responsibility to the caller. It will allow more fine-grained handling of XDP info for cases when an xdp_frame is fragmented.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
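A hedged, self-contained model of the new calling convention (hypothetical names): the xmit routine only reports success, and the caller pushes one completion entry per resource, which is what makes fragmented frames expressible:

    #include <stdbool.h>
    #include <stdio.h>

    static int fifo[8];
    static unsigned pc;

    static void fifo_push(int v) { fifo[pc++ & 7] = v; }

    static bool xmit_xdp_frame(int wqebbs_needed)
    {
        return wqebbs_needed <= 4; /* pretend the SQ has 4 slots left */
    }

    int main(void)
    {
        int frags[2] = { 101, 102 };

        if (xmit_xdp_frame(1 + 2)) {   /* linear part + 2 fragments */
            fifo_push(100);            /* entry for the main buffer */
            for (int i = 0; i < 2; i++)
                fifo_push(frags[i]);   /* one entry per fragment */
            printf("pushed %u completion entries\n", pc);
        }
        return 0;
    }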
# ddc87e7d | 28-Jan-2022 | Maxim Mikityanskiy <maximmi@nvidia.com>
net/mlx5e: Store DMA address inside struct page
Use page_pool_set_dma_addr() to store the DMA address of a page inside struct page, in order to avoid passing struct mlx5e_dma_info to XDP handlers. Previously, struct mlx5e_dma_info was used to pass both the DMA address and the page, and it worked well for the single-fragment case.
When XDP multi buffer is in use and a fragmented xdp_frame has to be transmitted, the driver needs to know the DMA addresses of the fragments; however, the array of fragments in struct skb_shared_info doesn't contain them. In order to pass the DMA addresses, the driver puts them into struct page itself, which is accessible from the array of fragments in struct skb_shared_info. The existing XDP handlers are modified to remove the dependency on struct mlx5e_dma_info.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
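A userspace model of the idea, with a hypothetical two-field struct page stand-in (in the kernel the address lives in a union inside the real struct page, via page_pool_set_dma_addr()):

    #include <stdint.h>
    #include <stdio.h>

    /* Keep the DMA address inside the per-page bookkeeping struct itself,
     * so anyone holding the page can recover it without a side-channel
     * struct like mlx5e_dma_info. */
    struct page {
        uintptr_t dma_addr;
    };

    static void page_set_dma_addr(struct page *p, uintptr_t a) { p->dma_addr = a; }
    static uintptr_t page_get_dma_addr(const struct page *p)   { return p->dma_addr; }

    struct frag { struct page *page; unsigned off, len; };

    int main(void)
    {
        struct page pg;
        page_set_dma_addr(&pg, (uintptr_t)0x1000); /* once, at map time */

        /* A fragment in shared info carries only the page... */
        struct frag f = { .page = &pg, .off = 0, .len = 256 };

        /* ...yet the TX path can still build the DMA descriptor from it. */
        printf("dma=%#lx len=%u\n",
               (unsigned long)(page_get_dma_addr(f.page) + f.off), f.len);
        return 0;
    }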
# 064990d0 | 27-Jan-2022 | Maxim Mikityanskiy <maximmi@nvidia.com>
net/mlx5e: Drop the len output parameter from mlx5e_xdp_handle
The len parameter of mlx5e_xdp_handle is used to output the new packet length after XDP has processed the packet and returned XDP_PASS. However, this value can be calculated at the call site, as the caller knows whether it was an XDP_PASS.
This commit drops the len parameter and moves the calculation to the caller, reducing the number of parameters passed to the function and preparing for XDP support in non-linear legacy RQ.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
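A minimal sketch of the caller-side calculation, assuming simplified types: after XDP_PASS the caller recomputes the length from the possibly adjusted data pointers, so no output parameter is needed:

    #include <stdio.h>

    struct xdp_buff { char *data; char *data_end; };

    int main(void)
    {
        char pkt[128];
        struct xdp_buff xdp = { .data = pkt, .data_end = pkt + 100 };

        /* Suppose the XDP program trimmed 4 bytes off the head (XDP_PASS). */
        xdp.data += 4;

        /* The caller recovers the new length itself: */
        unsigned len = (unsigned)(xdp.data_end - xdp.data);
        printf("len after XDP = %u\n", len); /* 96 */
        return 0;
    }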
Revision tags: v5.15.17, v5.4.173, v5.15.16
# e26eceb9 | 19-Jan-2022 | Tariq Toukan <tariqt@nvidia.com>
net/mlx5e: RX, Test the XDP program existence out of the handler
Instead of an early return inside mlx5e_xdp_handle(), let the caller check if an XDP program is loaded. This saves a few unnecessary function calls and calculations when no program is attached (!prog).

Performance test: single core, drop packets in iptables
Before: 3,872,504 pps
After: 3,975,628 pps (+2.66%)
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
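A small sketch of the hoisted check (simplified types): the short-circuit skips the handler call, and its argument setup, entirely on the common !prog path:

    #include <stdbool.h>
    #include <stdio.h>

    struct bpf_prog { int dummy; };

    static bool xdp_handle(const struct bpf_prog *prog)
    {
        /* No early "if (!prog) return false;" needed anymore. */
        (void)prog;
        return true; /* pretend the packet was consumed */
    }

    int main(void)
    {
        const struct bpf_prog *prog = NULL; /* no XDP program attached */

        if (prog && xdp_handle(prog))       /* short-circuits when !prog */
            printf("consumed by XDP\n");
        else
            printf("to the regular stack\n");
        return 0;
    }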
Revision tags: v5.15.15, v5.16, v5.15.10, v5.15.9, v5.15.8, v5.15.7, v5.15.6, v5.15.5, v5.15.4, v5.15.3, v5.15.2, v5.15.1, v5.15, v5.14.14, v5.14.13, v5.14.12, v5.14.11, v5.14.10, v5.14.9, v5.14.8, v5.14.7, v5.14.6, v5.10.67, v5.10.66, v5.14.5, v5.14.4, v5.10.65, v5.14.3, v5.10.64, v5.14.2, v5.10.63, v5.14.1, v5.10.62, v5.14, v5.10.61, v5.10.60, v5.10.53, v5.10.52, v5.10.51, v5.10.50, v5.10.49, v5.13, v5.10.46, v5.10.43, v5.10.42, v5.10.41, v5.10.40, v5.10.39, v5.4.119, v5.10.36
# 76c31e5f | 10-May-2021 | Aya Levin <ayal@nvidia.com>
net/mlx5e: Use FW limitation for max MPW WQEBBs
Calculate the maximal count of MPW WQEBBs on SQ creation and store it there. Remove MLX5E_TX_MPW_MAX_NUM_DS and MLX5E_TX_MPW_MAX_WQEBBS. Update mlx5e_tx_mpwqe_is_full() and mlx5e_xdp_mpqwe_is_full().
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Revision tags: v5.10.35, v5.10.34, v5.4.116, v5.10.33, v5.12, v5.10.32, v5.10.31, v5.10.30, v5.10.27, v5.10.26, v5.10.25, v5.10.24, v5.10.23, v5.10.22, v5.10.21, v5.10.20, v5.10.19, v5.4.101, v5.10.18, v5.10.17, v5.11, v5.10.16, v5.10.15, v5.10.14
# 4d6e6b0c | 01-Feb-2021 | Maxim Mikityanskiy <maximmi@mellanox.com>
net/mlx5e: Replace synchronize_rcu with synchronize_net
The commit cited below switched from using napi_synchronize to synchronize_rcu to have a guarantee that it will finish in finite time. However, on average, synchronize_rcu takes more time than napi_synchronize. Given that it's called multiple times per channel on deactivation, it accumulates to a significant amount, which causes timeouts in some applications (for example, when using bonding with NetworkManager).
This commit replaces synchronize_rcu with synchronize_net, which is faster when called under rtnl_lock, speeding up the described flow.
Fixes: 9c25a22dfb00 ("net/mlx5e: Use synchronize_rcu to sync with NAPI")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
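Why it's faster: under rtnl_lock, synchronize_net() can use the expedited RCU grace period. Its core logic is essentially the following (paraphrased from net/core/dev.c; check the current tree for the exact form):

    void synchronize_net(void)
    {
        might_sleep();
        if (rtnl_is_locked())
            synchronize_rcu_expedited();
        else
            synchronize_rcu();
    }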
# 08b42b6f | 01-Feb-2021 | Maxim Mikityanskiy <maximmi@mellanox.com>
net/mlx5e: Replace synchronize_rcu with synchronize_net
[ Upstream commit 4d6e6b0c6d4bed8a7128500701354e2dc6098fa3 ]
The commit cited below switched from using napi_synchronize to synchronize_rcu to have a guarantee that it will finish in finite time. However, on average, synchronize_rcu takes more time than napi_synchronize. Given that it's called multiple times per channel on deactivation, it accumulates to a significant amount, which causes timeouts in some applications (for example, when using bonding with NetworkManager).
This commit replaces synchronize_rcu with synchronize_net, which is faster when called under rtnl_lock, speeding up the described flow.
Fixes: 9c25a22dfb00 ("net/mlx5e: Use synchronize_rcu to sync with NAPI")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Revision tags: v5.10, v5.8.17, v5.8.16, v5.8.15, v5.9, v5.8.14, v5.8.13, v5.8.12, v5.8.11, v5.8.10, v5.8.9, v5.8.8, v5.8.7, v5.8.6, v5.4.62, v5.8.5, v5.8.4, v5.4.61, v5.8.3, v5.4.60, v5.8.2, v5.4.59, v5.8.1, v5.4.58, v5.4.57, v5.4.56, v5.8, v5.7.12, v5.4.55, v5.7.11, v5.4.54, v5.7.10, v5.4.53, v5.4.52, v5.7.9, v5.7.8, v5.4.51
# 5af75c74 | 02-Jul-2020 | Maxim Mikityanskiy <maximmi@mellanox.com>
net/mlx5e: Enhanced TX MPWQE for SKBs
This commit adds support for Enhanced TX MPWQE feature in the regular (SKB) data path. A MPWQE (multi-packet work queue element) can serve multiple packets, reducing the PCI bandwidth on control traffic.
Two new stats (tx*_mpwqe_blks and tx*_mpwqe_pkts) are added. The feature is on by default and controlled by the skb_tx_mpwqe private flag.
In a MPWQE, eseg is shared among all packets, so eseg-based offloads (IPSEC, GENEVE, checksum) run on a separate eseg that is compared to the eseg of the current MPWQE session to decide if the new packet can be added to the same session.
MPWQE is not compatible with certain offloads and features, such as TLS offload, TSO, nonlinear SKBs. If such incompatible features are in use, the driver gracefully falls back to non-MPWQE.
This change has no performance impact in TCP single stream test and XDP_TX single stream test.
UDP pktgen, 64-byte packets, single stream, MPWQE off:
Packet rate: 16.96 Mpps (±0.12 Mpps) -> 17.01 Mpps (±0.20 Mpps)
Instructions per packet: 421 -> 429
Cycles per packet: 156 -> 161
Instructions per cycle: 2.70 -> 2.67

UDP pktgen, 64-byte packets, single stream, MPWQE on:
Packet rate: 16.96 Mpps (±0.12 Mpps) -> 20.94 Mpps (±0.33 Mpps)
Instructions per packet: 421 -> 329
Cycles per packet: 156 -> 123
Instructions per cycle: 2.70 -> 2.67
Enabling MPWQE can reduce PCI bandwidth:
PCI Gen2, pktgen at fixed rate of 36864000 pps on 24 CPU cores:
Inbound PCI utilization with MPWQE off: 80.3%
Inbound PCI utilization with MPWQE on: 59.0%
PCI Gen3, pktgen at fixed rate of 56064000 pps on 24 CPU cores:
Inbound PCI utilization with MPWQE off: 65.4%
Inbound PCI utilization with MPWQE on: 49.3%

Enabling MPWQE can also reduce CPU load, increasing the packet rate in case of CPU bottleneck:
PCI Gen2, pktgen at full rate on 24 CPU cores:
Packet rate with MPWQE off: 37.5 Mpps
Packet rate with MPWQE on: 49.0 Mpps
PCI Gen3, pktgen at full rate on 24 CPU cores:
Packet rate with MPWQE off: 57.0 Mpps
Packet rate with MPWQE on: 66.8 Mpps
Burst size in all pktgen tests is 32.
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz (x86_64)
NIC: Mellanox ConnectX-6 Dx
GCC 10.2.0
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
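A hedged model of the eseg-sharing rule (illustrative fields, not the real mlx5_wqe_eth_seg): a packet can join the open session only if its required eseg matches the session's byte for byte:

    #include <stdbool.h>
    #include <stdio.h>
    #include <string.h>

    /* The Ethernet segment (eseg) is shared by all packets in an MPWQE
     * session; a packet whose offloads need a different eseg cannot join
     * the open session and forces it to close. */
    struct eseg { unsigned csum_flags; unsigned tunnel_flags; };

    static bool same_eseg(const struct eseg *session, const struct eseg *pkt)
    {
        return !memcmp(session, pkt, sizeof(*session));
    }

    int main(void)
    {
        struct eseg session = { .csum_flags = 1, .tunnel_flags = 0 };
        struct eseg pkt_a   = { .csum_flags = 1, .tunnel_flags = 0 };
        struct eseg pkt_b   = { .csum_flags = 0, .tunnel_flags = 0 };

        printf("pkt_a joins session: %s\n", same_eseg(&session, &pkt_a) ? "yes" : "no");
        printf("pkt_b joins session: %s\n", same_eseg(&session, &pkt_b) ? "yes" : "no");
        return 0;
    }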
Revision tags: v5.4.50, v5.7.7, v5.4.49, v5.7.6, v5.7.5, v5.4.48, v5.7.4, v5.7.3, v5.4.47, v5.4.46, v5.7.2, v5.4.45, v5.7.1, v5.4.44, v5.7, v5.4.43, v5.4.42, v5.4.41, v5.4.40, v5.4.39, v5.4.38, v5.4.37, v5.4.36, v5.4.35, v5.4.34, v5.4.33
# b39fe61e | 16-Apr-2020 | Maxim Mikityanskiy <maximmi@mellanox.com>
net/mlx5e: Rename xmit-related structs to generalize them
As preparation for the upcoming TX MPWQE support for SKBs, rename struct mlx5e_xdp_mpwqe to mlx5e_tx_mpwqe and move it above struct mlx5e_txqsq. This structure will be reused in the regular SQ and in the regular TX data path. Also rename mlx5e_xdp_xmit_data to mlx5e_xmit_data - it will be used in the upcoming TX MPWQE flow.
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 530d5ce2 | 04-Jun-2020 | Maxim Mikityanskiy <maximmi@mellanox.com>
net/mlx5e: Generalize TX MPWQE checks for full session
As preparation for the upcoming TX MPWQE for SKBs, create a function (mlx5e_tx_mpwqe_is_full) to check whether an MPWQE session is full. This function will be shared by MPWQE code for XDP and for SKBs. Defines are renamed and moved to make them not XDP-specific.
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 97e3afd6 | 09-Jul-2020 | Maxim Mikityanskiy <maximmi@mellanox.com>
net/mlx5e: Unify constants for WQE_EMPTY_DS_COUNT
A constant for the number of DS in an empty WQE (i.e. a WQE without data segments) is needed in multiple places (normal TX data path, MPWQE in XDP), but currently we have a constant for XDP and an inline formula in normal TX. This patch introduces a common constant.
Additionally, mlx5e_xdp_mpwqe_session_start is converted to use struct assignment, because the code nearby is touched.
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
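A hedged arithmetic sketch with illustrative segment sizes (the real mlx5 segment layouts differ): the shared constant is simply the size of a data-segment-free WQE divided by the DS unit size:

    #include <stdio.h>

    #define SEND_WQE_DS 16u /* bytes per data segment (DS) unit */

    struct ctrl_seg { unsigned char b[16]; }; /* 1 DS, illustrative */
    struct eth_seg  { unsigned char b[32]; }; /* 2 DS, illustrative */
    struct tx_wqe   { struct ctrl_seg ctrl; struct eth_seg eth; };

    /* One shared constant replaces the per-path inline formulas. */
    #define TX_WQE_EMPTY_DS_COUNT (sizeof(struct tx_wqe) / SEND_WQE_DS)

    int main(void)
    {
        printf("empty DS count = %zu\n", TX_WQE_EMPTY_DS_COUNT); /* 3 here */
        return 0;
    }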
# 388a2b56 | 30-Jul-2020 | Maxim Mikityanskiy <maximmi@mellanox.com>
net/mlx5e: Small improvements for XDP TX MPWQE logic
Use MLX5E_XDP_MPW_MAX_WQEBBS to reserve space for a MPWQE, because it's actually the maximal size a MPWQE can take.
Reorganize the logic that checks when to close the MPWQE session:
1. Put all checks into a single function.
2. When inline is on, make only one comparison: if it's false, the less strict one will also be false. The compiler probably optimizes the other away anyway, but it's clearer to also reflect this in the code.
The MLX5E_XDP_INLINE_WQE_* defines are also changed to make the calculations logically correct. Though MLX5E_XDP_INLINE_WQE_MAX_DS_CNT used to be 16 and keeps that value, the calculation used to be DIV_ROUND_UP(max inline packet size, MLX5_SEND_WQE_DS), whereas the numerator should have included sizeof(struct mlx5_wqe_inline_seg).
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
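A worked sketch of the corrected formula with illustrative values (the 4-byte inline header matches sizeof(struct mlx5_wqe_inline_seg); the threshold value is assumed): including the inline header in the numerator keeps the result at 16 here while making the derivation logically sound:

    #include <stdio.h>

    #define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

    #define SEND_WQE_DS        16u  /* bytes per data segment unit */
    #define INLINE_SEG_SZ       4u  /* e.g. sizeof(struct mlx5_wqe_inline_seg) */
    #define MAX_INLINE_PKT_SZ 252u  /* illustrative inline threshold */

    int main(void)
    {
        unsigned old_calc = DIV_ROUND_UP(MAX_INLINE_PKT_SZ, SEND_WQE_DS);
        unsigned new_calc = DIV_ROUND_UP(MAX_INLINE_PKT_SZ + INLINE_SEG_SZ,
                                         SEND_WQE_DS);
        printf("without inline seg: %u DS\n", old_calc); /* 16 */
        printf("with inline seg:    %u DS\n", new_calc); /* 16 here, too */
        return 0;
    }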
# 93761ca1 | 30-Apr-2020 | Tariq Toukan <tariqt@mellanox.com>
net/mlx5e: XDP, Avoid indirect call in TX flow
Use INDIRECT_CALL_2() helper to avoid the cost of the indirect call when/if CONFIG_RETPOLINE=y.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
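A userspace rendering of the INDIRECT_CALL_2() idea (the kernel macro in include/linux/indirect_call_wrapper.h is equivalent in spirit): compare the pointer against the known candidates and branch to direct calls, leaving the retpoline-thunked indirect call as the last resort:

    #include <stdio.h>

    #define INDIRECT_CALL_2(f, f2, f1, ...) \
        ((f) == (f2) ? (f2)(__VA_ARGS__)    \
         : (f) == (f1) ? (f1)(__VA_ARGS__)  \
         : (f)(__VA_ARGS__))

    static int xmit_frame(int n)       { printf("plain x%d\n", n); return n; }
    static int xmit_frame_mpwqe(int n) { printf("mpwqe x%d\n", n); return n; }

    int main(void)
    {
        int (*xmit)(int) = xmit_frame_mpwqe; /* chosen once at SQ setup */

        /* Dispatch compiles to two compares plus direct calls: */
        return INDIRECT_CALL_2(xmit, xmit_frame_mpwqe, xmit_frame, 1) ? 0 : 1;
    }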
# 39d6443c | 20-May-2020 | Björn Töpel <bjorn.topel@intel.com>
mlx5, xsk: Migrate to new MEM_TYPE_XSK_BUFF_POOL
Use the new MEM_TYPE_XSK_BUFF_POOL API in lieu of MEM_TYPE_ZERO_COPY in mlx5e. It allows dropping a lot of code from the driver (which is now common in the AF_XDP core and was related to XSK RX frame allocation, DMA mapping, etc.) and slightly improves performance (RX +0.8 Mpps, TX +0.4 Mpps).
rfc->v1: Put back the sanity check for XSK params, use XSK API to get the total headroom size. (Maxim)
v1->v2: Fix DMA address handling, set XDP metadata to invalid. (Maxim)
v2->v3: Handle frame_sz, use xsk_buff_xdp_get_frame_dma, use xsk_buff API for DMA sync on TX, add performance numbers. (Maxim)
v3->v4: Remove unused variable num_xsk_frames. (Jakub)
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200520192103.355233-12-bjorn.topel@gmail.com
Revision tags: v5.4.32, v5.4.31, v5.4.30, v5.4.29
# 5ffb4d85 | 30-Mar-2020 | Maxim Mikityanskiy <maximmi@mellanox.com>
net/mlx5e: Calculate SQ stop room in a robust way
Currently, different formulas are used to estimate the space that may be taken by WQEs in the SQ during a single packet transmit. This space is called stop room, and it's checked at the end of packet transmit to find out whether the next packet could overflow the SQ. If it could, the driver tells the kernel to stop sending further packets.
Many factors affect the stop room:
1. Padding with NOPs to avoid WQEs spanning over page boundaries.
2. Enabled and disabled offloads (TLS, upcoming MPWQE).
3. The maximum size of a WQE.
The padding is performed before every WQE if it doesn't fit in the current page.
The current formula assumes that only one padding will be required per packet, and it doesn't take into account that the WQEs posted during the transmission of a single packet might exceed the page size in very rare circumstances. For example, to hit this condition with 4096-byte pages, TLS offload will have to interrupt an almost-full MPWQE session, be in the resync flow and try to transmit a near to maximum amount of data.
To avoid SQ overflows in such rare cases after MPWQE is added, this patch introduces a more robust formula to estimate the stop room. The new formula uses the fact that a WQE of size X will not require more than X-1 WQEBBs of padding. More exact estimations are possible, but they result in much more complex and error-prone code for little gain.
Before this patch, the TLS stop room included space for both INNOVA and ConnectX TLS offloads that couldn't run at the same time anyway, so this patch accounts only for the active one.
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
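A hedged sketch of the robust per-WQE estimate: a WQE of X WQEBBs can be preceded by at most X-1 WQEBBs of NOP padding (any more and the WQE would have fit before the page boundary), so its stop-room contribution is 2X - 1:

    #include <stdio.h>

    static unsigned stop_room_for_wqe(unsigned wqebbs)
    {
        return 2 * wqebbs - 1; /* wqebbs for the WQE + (wqebbs - 1) padding */
    }

    int main(void)
    {
        /* Illustrative value: a maximum-size WQE of 16 WQEBBs. */
        printf("stop room: %u WQEBBs\n", stop_room_for_wqe(16)); /* 31 */
        return 0;
    }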