Revision tags: v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7 |
|
#
26835308 |
| 12-Dec-2023 |
Leon Romanovsky <leonro@nvidia.com> |
net/mlx5e: Allow software parsing when IPsec crypto is enabled
[ Upstream commit 20f5468a7988dedd94a57ba8acd65ebda6a59723 ]
All ConnectX devices have software parsing capability enabled, but it is more correct to set allow_swp only if capability exists, which for IPsec means that crypto offload is supported.
Fixes: 2451da081a34 ("net/mlx5: Unify device IPsec capabilities check") Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
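A rough illustration of the gating described in this message; the helper and variable names below are placeholders, not the driver's actual fields:

	/* Sketch only: enable software parsing (SWP) in the SQ context only when
	 * some feature that needs it is actually supported; for IPsec that means
	 * the crypto offload capability, not merely being a ConnectX device.
	 */
	static bool sketch_allow_swp(bool geneve_tx_allowed, bool ipsec_crypto_offload)
	{
		return geneve_tx_allowed || ipsec_crypto_offload;
	}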
|
Revision tags: v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45, v6.1.44 |
|
#
a9ca9f9c |
| 04-Aug-2023 |
Yunsheng Lin <linyunsheng@huawei.com> |
page_pool: split types and declarations from page_pool.h
Split types and pure function declarations from page_pool.h and add them in page_pool/types.h, so that C sources can include page_pool.h and headers should generally only include page_pool/types.h, as suggested by Jakub. Rename page_pool.h to page_pool/helpers.h to have both in one place.
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://lore.kernel.org/r/20230804180529.2483231-2-aleksander.lobakin@intel.com [Jakub: change microsoft/mana, fix kdoc paths in Documentation] Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
Revision tags: v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35, v6.1.34 |
|
#
4e7401fc |
| 14-Jun-2023 |
Maxim Mikityanskiy <maxtram95@gmail.com> |
net/mlx5e: XDP, Allow growing tail for XDP multi buffer
The cited commits missed passing frag_size to __xdp_rxq_info_reg, which is required by bpf_xdp_adjust_tail to support growing the tail pointer in fragmented packets. Pass the missing parameter when the current RQ mode allows XDP multi buffer.
Fixes: ea5d49bdae8b ("net/mlx5e: Add XDP multi buffer support to the non-linear legacy RQ") Fixes: 9cb9482ef10e ("net/mlx5e: Use fragments of the same size in non-linear legacy RQ with XDP") Signed-off-by: Maxim Mikityanskiy <maxtram95@gmail.com> Cc: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
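A minimal sketch of the described fix, assuming placeholder names for the RQ fields and the multi-buffer condition; only __xdp_rxq_info_reg() and its frag_size parameter are from the kernel API:

	static int sketch_reg_xdp_rxq(struct xdp_rxq_info *xdp_rxq,
				      struct net_device *netdev, u32 queue_idx,
				      unsigned int napi_id, u32 frag_buf_size,
				      bool xdp_multi_buf)
	{
		/* A zero frag_size means "single buffer"; pass the real per-fragment
		 * buffer size whenever multi-buffer frames are possible, so
		 * bpf_xdp_adjust_tail() can grow the tail inside a fragment.
		 */
		return __xdp_rxq_info_reg(xdp_rxq, netdev, queue_idx, napi_id,
					  xdp_multi_buf ? frag_buf_size : 0);
	}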
|
Revision tags: v6.1.33, v6.1.32, v6.1.31, v6.1.30, v6.1.29, v6.1.28, v6.1.27, v6.1.26, v6.3, v6.1.25 |
|
#
f52ac702 |
| 17-Apr-2023 |
Tariq Toukan <tariqt@nvidia.com> |
net/mlx5e: RX, Add XDP multi-buffer support in Striding RQ
Here we add support for multi-buffer XDP handling in Striding RQ, which is our default out-of-the-box RQ type. Before this series, loading such an XDP program would fail, until you switch to the legacy RQ (by unsetting the rx_striding_rq priv-flag).
To overcome the lack of headroom and tailroom between the strides, we allocate a side page to be used for the descriptor (xdp_buff / skb) and the linear part. When an XDP program is attached, we structure the xdp_buff so that it contains no data in the linear part, and the whole packet resides in the fragments.
In case of XDP_PASS, where an SKB still needs to be created, we copy up to 256 bytes to its linear part, to match the current behavior, and satisfy functions that assume finding the packet headers in the SKB linear part (like eth_type_trans).
Performance testing:
Packet rate test, 64 bytes, 32 channels, MTU 9000 bytes. CPU: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz. NIC: ConnectX-6 Dx, at 100 Gbps.
+----------+-------------+-------------+---------+
| Test     | Legacy RQ   | Striding RQ | Speedup |
+----------+-------------+-------------+---------+
| XDP_DROP | 101,615,544 | 117,191,020 | +15%    |
+----------+-------------+-------------+---------+
| XDP_TX   |  95,608,169 | 117,043,422 | +22%    |
+----------+-------------+-------------+---------+
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
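A sketch of the XDP_PASS handling described above, using placeholder function and constant names (the real mlx5e code differs); the skb helpers are standard kernel ones, and the first fragment is assumed to hold at least the copied header bytes:

	#define SKETCH_MAX_HEADLEN 256	/* bytes copied into the skb linear part */

	static struct sk_buff *sketch_skb_from_frags(struct napi_struct *napi,
						     struct net_device *dev,
						     skb_frag_t *frags,
						     unsigned int pkt_len)
	{
		unsigned int headlen = min_t(unsigned int, pkt_len, SKETCH_MAX_HEADLEN);
		struct sk_buff *skb;

		skb = napi_alloc_skb(napi, headlen);
		if (unlikely(!skb))
			return NULL;

		/* The xdp_buff linear part is empty: copy the packet headers out of
		 * the first fragment so eth_type_trans() and friends keep working.
		 */
		skb_put_data(skb, skb_frag_address(&frags[0]), headlen);

		/* Remaining payload would be attached as skb frags (omitted here). */
		skb->protocol = eth_type_trans(skb, dev);
		return skb;
	}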
|
#
20409abe |
| 17-Apr-2023 |
Tariq Toukan <tariqt@nvidia.com> |
net/mlx5e: XDP, Consider large multi-buffer packets in Striding RQ params calculations
Function mlx5e_rx_get_linear_stride_sz() returns PAGE_SIZE immediately in case an XDP program is attached. The more accurate formula is ALIGN(sz, PAGE_SIZE), to prevent two packets from residing on the same page.
The assumption behind the current code is that sz <= PAGE_SIZE holds for all cases with XDP program set.
This is true because it is being called from: - 3 times from Striding RQ flows, in which XDP is not supported for such large packets. - 1 time from Legacy RQ flow, under the condition mlx5e_rx_is_linear_skb().
No functional change here, just removing the implied assumption in preparation for supporting XDP multi-buffer in Striding RQ.
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
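The sizing rule described above, as a minimal sketch (the function name and the non-XDP branch are placeholders, not the driver's actual mlx5e_rx_get_linear_stride_sz()):

	static u32 sketch_linear_stride_sz(u32 sz, bool xdp_attached)
	{
		/* With XDP attached, round up to whole pages so two packets can
		 * never share a page; this stays correct even when sz exceeds
		 * PAGE_SIZE for multi-buffer packets.
		 */
		if (xdp_attached)
			return ALIGN(sz, PAGE_SIZE);

		return sz;	/* placeholder: the real driver applies its own rounding */
	}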
|
Revision tags: v6.1.24 |
|
#
ed4b0661 |
| 10-Apr-2023 |
Avihai Horon <avihaih@nvidia.com> |
RDMA/mlx5: Remove pcie_relaxed_ordering_enabled() check for RO write
pcie_relaxed_ordering_enabled() check was added to avoid a syndrome when creating a MKey with relaxed ordering (RO) enabled when the driver's relaxed_ordering_{read,write} HCA capabilities are out of sync with FW.
While this can happen with relaxed_ordering_read, it can't happen with relaxed_ordering_write as it's set if the device supports RO write, regardless of RO in PCI config space, and thus can't change during runtime.
Therefore, drop the pcie_relaxed_ordering_enabled() check for relaxed_ordering_write while keeping it for relaxed_ordering_read. Doing so will also allow the usage of RO write in VFs and VMs (where RO in PCI config space is not reported/emulated properly).
Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Link: https://lore.kernel.org/r/7e8f55e31572c1702d69cae015a395d3a824a38a.1681131553.git.leon@kernel.org Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
Revision tags: v6.1.23, v6.1.22, v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13 |
|
#
cd640b05 |
| 21-Feb-2023 |
Dragos Tatulea <dtatulea@nvidia.com> |
net/mlx5e: RX, Break the wqe bulk refill in smaller chunks
To avoid overflowing the page pool's cache, don't release the whole bulk which is usually larger than the cache refill size. Group release+alloc instead into cache refill units that allow releasing to the cache and then allocating from the cache.
A refill_unit variable is added as an iteration unit over the wqe_bulk when doing release+alloc.
For a single ring, single core, default MTU (1500) TCP stream test the number of pages allocated from the cache directly (rx_pp_recycle_cached) increases from 0% to 52%:
+---------------------------------------------+
| Page Pool stats (/sec)  | Before  | After   |
+-------------------------+---------+---------+
|rx_pp_alloc_fast         | 2145422 | 2193802 |
|rx_pp_alloc_slow         | 2       | 0       |
|rx_pp_alloc_empty        | 2       | 0       |
|rx_pp_alloc_refill       | 34059   | 16634   |
|rx_pp_alloc_waive        | 0       | 0       |
|rx_pp_recycle_cached     | 0       | 1145818 |
|rx_pp_recycle_cache_full | 0       | 0       |
|rx_pp_recycle_ring       | 2179361 | 1064616 |
|rx_pp_recycle_ring_full  | 121     | 0       |
+---------------------------------------------+
With this patch, the performance for legacy rq for the above test is back to baseline.
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
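A sketch of the release+alloc bulking described above, with placeholder types and helpers (the real mlx5e WQE and page_pool plumbing is omitted):

	struct sketch_rq;	/* placeholder for the driver's RQ type */

	void sketch_release_wqes(struct sketch_rq *rq, int first, int count);
	int sketch_alloc_wqes(struct sketch_rq *rq, int first, int count);

	static int sketch_refill_bulk(struct sketch_rq *rq, int bulk, int refill_unit)
	{
		int done = 0;

		/* Walk the bulk in cache-refill-sized chunks: release one chunk so
		 * its pages land in the page_pool cache, then allocate the same
		 * chunk so those pages are reused from the cache right away.
		 */
		while (done < bulk) {
			int chunk = min(refill_unit, bulk - done);

			sketch_release_wqes(rq, done, chunk);
			if (sketch_alloc_wqes(rq, done, chunk))
				break;
			done += chunk;
		}

		return done;
	}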
|
#
4ba2b498 |
| 21-Feb-2023 |
Dragos Tatulea <dtatulea@nvidia.com> |
net/mlx5e: RX, Increase WQE bulk size for legacy rq
Deferred page release was added to legacy rq but its desired effect (driver releases last fragment to page pool cache) is not yet visible due to the WQE bulks being too small.
This patch increases the WQE bulk size to span 512 KB and clips it to one quarter of the rx queue size.
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
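The sizing described above as a rough formula sketch (names and clamping details are illustrative, not the driver's exact code):

	#define SKETCH_BULK_SPAN_BYTES (512 * 1024)

	static u32 sketch_wqe_bulk(u32 bytes_per_wqe, u32 rq_size)
	{
		/* Enough WQEs to span ~512 KB of packet buffers, but never more than
		 * a quarter of the RQ, so the bulk stays a small fraction of the
		 * ring while still letting deferred release hit the page_pool cache.
		 */
		u32 bulk = SKETCH_BULK_SPAN_BYTES / bytes_per_wqe;

		return clamp_t(u32, bulk, 1, rq_size / 4);
	}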
|
#
028522e2 |
| 14-Mar-2023 |
Gal Pressman <gal@nvidia.com> |
net/mlx5: Move needed PTYS functions to core layer
Downstream patches require devlink params to access the PTYS register, move the needed functions from mlx5e to the core layer.
Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20230314054234.267365-11-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
Revision tags: v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9, v6.1.8, v6.1.7 |
|
#
b80ae281 |
| 15-Jan-2023 |
Adham Faris <afaris@nvidia.com> |
net/mlx5e: Add warning when log WQE size is smaller than log stride size
Add warning macro in the function mlx5e_mpwqe_get_log_num_strides() when log WQE size is smaller than log stride size. Theoretically this should not happen.
Signed-off-by: Adham Faris <afaris@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
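A sketch of the guard described above (the helper name and the exact warning macro are illustrative):

	static u8 sketch_log_num_strides(u8 log_wqe_sz, u8 log_stride_sz)
	{
		/* Theoretically impossible; warn loudly instead of silently
		 * returning a bogus (underflowed) stride count.
		 */
		if (WARN_ON_ONCE(log_wqe_sz < log_stride_sz))
			return 0;

		return log_wqe_sz - log_stride_sz;
	}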
|
Revision tags: v6.1.6, v6.1.5, v6.0.19 |
|
#
130b1207 |
| 08-Jan-2023 |
Adham Faris <afaris@nvidia.com> |
net/mlx5e: Fail with messages when params are not valid for XSK
Current XSK prerequisites validation implementation (setup.c/mlx5e_validate_xsk_param()) fails silently when xsk prerequisites are not fulfilled. Add error messages to the kernel log to help the user understand what went wrong when params are not valid for XSK.
Signed-off-by: Adham Faris <afaris@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
#
6624bfee |
| 12-Jan-2023 |
Adham Faris <afaris@nvidia.com> |
net/mlx5e: Remove redundant xsk pointer check in mlx5e_mpwrq_validate_xsk
This validation function is relevant only for XSK cases, hence it is assumed to be called only with xsk != NULL. Thus checking for an invalid xsk pointer is redundant and misleads static code analyzers. This commit removes the redundant xsk pointer check.
This solves the following smatch warning: drivers/net/ethernet/mellanox/mlx5/core/en/params.c:481 mlx5e_mpwrq_validate_xsk() error: we previously assumed 'xsk' could be null (see line 478)
Fixes: 6470d2e7e8ed ("net/mlx5e: xsk: Use KSM for unaligned XSK") Signed-off-by: Adham Faris <afaris@nvidia.com> Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
Revision tags: v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12, v6.0.11, v6.0.10, v5.15.80, v6.0.9, v5.15.79, v6.0.8, v5.15.78, v6.0.7, v5.15.77 |
|
#
02648b4b |
| 31-Oct-2022 |
Tariq Toukan <tariqt@nvidia.com> |
net/mlx5: Generalize name of UMR alignment definition
Per the device spec, MLX5_UMR_MTT_ALIGNMENT is good not only for UMR MTT entries, but for all other entries as well, like KLMs and KSMs.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
Revision tags: v5.15.76, v6.0.6, v6.0.5, v5.15.75, v6.0.4, v6.0.3, v6.0.2, v5.15.74, v5.15.73, v6.0.1, v5.15.72, v6.0, v5.15.71, v5.15.70, v5.15.69, v5.15.68, v5.15.67, v5.15.66, v5.15.65, v5.15.64, v5.15.63, v5.15.62, v5.15.61, v5.15.60, v5.15.59, v5.19, v5.15.58, v5.15.57, v5.15.56, v5.15.55, v5.15.54, v5.15.53, v5.15.52, v5.15.51, v5.15.50, v5.15.49, v5.15.48, v5.15.47, v5.15.46, v5.15.45, v5.15.44, v5.15.43, v5.15.42, v5.18 |
|
#
38438d39 |
| 18-May-2022 |
Tariq Toukan <tariqt@nvidia.com> |
net/mlx5e: Move params kernel log print to probe function
Params info print was meant to be printed on load. With time, new calls to mlx5e_init_rq_type_params and mlx5e_build_rq_params were added, mistakenly printing the params once again.
Move the print to where it belongs, in mlx5e_probe.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
Revision tags: v5.15.41, v5.15.40, v5.15.39, v5.15.38, v5.15.37, v5.15.36, v5.15.35, v5.15.34, v5.15.33, v5.15.32, v5.15.31, v5.17, v5.15.30, v5.15.29, v5.15.28, v5.15.27, v5.15.26, v5.15.25, v5.15.24, v5.15.23, v5.15.22, v5.15.21, v5.15.20, v5.15.19, v5.15.18, v5.15.17, v5.4.173, v5.15.16, v5.15.15, v5.16, v5.15.10, v5.15.9, v5.15.8, v5.15.7, v5.15.6, v5.15.5, v5.15.4, v5.15.3, v5.15.2, v5.15.1, v5.15, v5.14.14, v5.14.13, v5.14.12, v5.14.11, v5.14.10, v5.14.9, v5.14.8, v5.14.7, v5.14.6, v5.10.67, v5.10.66, v5.14.5, v5.14.4, v5.10.65, v5.14.3, v5.10.64, v5.14.2, v5.10.63, v5.14.1, v5.10.62, v5.14, v5.10.61, v5.10.60, v5.10.53, v5.10.52, v5.10.51, v5.10.50, v5.10.49, v5.13, v5.10.46, v5.10.43, v5.10.42, v5.10.41, v5.10.40, v5.10.39, v5.4.119, v5.10.36, v5.10.35, v5.10.34, v5.4.116, v5.10.33, v5.12, v5.10.32, v5.10.31, v5.10.30, v5.10.27, v5.10.26, v5.10.25, v5.10.24, v5.10.23, v5.10.22, v5.10.21, v5.10.20, v5.10.19, v5.4.101, v5.10.18, v5.10.17, v5.11, v5.10.16, v5.10.15 |
|
#
2c925db0 |
| 09-Feb-2021 |
Ofer Levi <oferle@nvidia.com> |
net/mlx5e: Support enhanced CQE compression
The CQE compression feature improves performance by reducing the PCI bandwidth bottleneck on CQE writes. Enhanced CQE compression, introduced in ConnectX-6, aims to reduce the CPU utilization of software-side packet decompression: the ownership bit, whose rewrite is likely to cost a cache miss, is replaced by a validity byte handled solely by hardware. Another advantage of the enhanced feature is that session packets are available to software as soon as a single CQE slot is filled, instead of waiting for the session to close; this improves packet latency from NIC to host.
Performance: Following are the tested scenarios and results comparing basic and enhanced CQE compression.
setup: IXIA 100GbE connected directly to port 0 and port 1 of ConnectX-6 Dx 100GbE dual port.
Case #1 RX only, single flow goes to single queue: IRQ rate reduced by ~ 30%, CPU utilization improved by 2%.
Case #2 IP forwarding from port 1 to port 0 single flow goes to single queue: Avg latency improved from 60us to 21us, frame loss improved from 0.5% to 0.0%.
Case #3 IP forwarding from port 1 to port 0 Max Throughput IXIA sends 100%, 8192 UDP flows, goes to 24 queues: Enhanced is equal or slightly better than basic.
Testing the basic compression feature with this patch shows no performance degradation of the basic compression feature.
Signed-off-by: Ofer Levi <oferle@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
#
c2c9e31d |
| 01-Oct-2022 |
Maxim Mikityanskiy <maximmi@nvidia.com> |
net/mlx5e: xsk: Optimize for unaligned mode with 3072-byte frames
When XSK frame size is 3072 (or another power of two multiplied by 3), KLM mechanism for NIC virtual memory page mapping can be optimized by replacing it with KSM.
Before this change, two KLM entries were needed to map an XSK frame that is not a power of two: one entry maps the UMEM memory up to the frame length, the other maps the rest of the stride to the garbage page.
When the frame length divided by 3 is a power of two, it can be mapped using 3 KSM entries, and the fourth will map the rest of the stride to the garbage page. All 4 KSM entries are of the same size, which allows for a much faster lookup.
Frame size 3072 is useful in certain use cases, because it allows packing 4 frames into 3 pages. Generally speaking, other frame sizes equal to PAGE_SIZE minus a power of two can be optimized in a similar way, but it will require many more KSMs per frame, which slows down UMRs a little bit, but more importantly may hit the limit for the maximum number of KSM entries.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
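The arithmetic behind the optimization, as a small sketch (the helper name is illustrative; is_power_of_2() comes from linux/log2.h):

	static bool sketch_xsk_frame_fits_3_ksm(u32 frame_sz)
	{
		/* Frame sizes of the form 3 * 2^n (e.g. 3072) can be mapped with
		 * three equal-size KSM entries for the frame plus one for the
		 * padding up to the stride, instead of two variable-size KLMs.
		 */
		return frame_sz % 3 == 0 && is_power_of_2(frame_sz / 3);
	}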
|
#
13921345 |
| 01-Oct-2022 |
Maxim Mikityanskiy <maximmi@nvidia.com> |
net/mlx5e: xsk: Use KLM to protect frame overrun in unaligned mode
XSK RQs support striding RQ linear mode, but the stride size may be bigger than the XSK frame size, because:
1. The stride size must be a power of two.
2. The stride size must be equal to the UMR page size. Each XSK frame is treated as a separate page, because they aren't necessarily adjacent in physical memory, so the driver can't put more than one stride per page.
3. The minimal MTT page size is 4096 on older firmware.
That means that if XSK frame size is 2048 or not a power of two, the strides may be bigger than XSK frames. Normally, it's not a problem if the hardware enforces the MTU. However, traffic between vports skips the hardware MTU check, and oversized packets may be received.
If an oversized packet is bigger than the XSK frame but not bigger than the stride, it will cause overwriting of the adjacent UMEM region. If the packet takes more than one stride, they can be recycled for reuse, so it's not a problem when the XSK frame size matches the stride size.
Work around the above issue by leveraging KLM to make a more fine-grained mapping. The beginning of each stride is mapped to the frame memory, and the padding up to the closest power of two is mapped to the overflow page that doesn't belong to UMEM. This way, application data corruption won't happen upon receiving packets bigger than MTU.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
9f123f74 |
| 01-Oct-2022 |
Maxim Mikityanskiy <maximmi@nvidia.com> |
net/mlx5e: Improve MTT/KSM alignment
Make mlx5e_mpwrq_mtts_per_wqe take into account that KSM requires smaller alignment than MTT.
Ensure that there is always an even amount of MTTs in a UMR WQE, so that complete octwords are formed, and no garbage is mapped.
Drop extra alignment in MLX5_MTT_OCTW that may cause setting too big ucseg->xlt_octowords, also leading to mapping garbage.
Generalize some calculations by introducing the MLX5_OCTWORD constant.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
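The alignment rule described above, in sketch form (constants shown for illustration; the driver uses MLX5_OCTWORD and its own entry sizes):

	#define SKETCH_OCTWORD_SZ	16	/* bytes per octword */
	#define SKETCH_MTT_ENTRY_SZ	8	/* bytes per MTT entry */

	static u32 sketch_mtts_per_wqe(u32 pages_per_wqe)
	{
		/* One MTT per page, rounded up to an even count so the UMR WQE is
		 * filled with complete octwords and no garbage entries get mapped.
		 */
		return ALIGN(pages_per_wqe, SKETCH_OCTWORD_SZ / SKETCH_MTT_ENTRY_SZ);
	}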
|
#
168723c1 |
| 01-Oct-2022 |
Maxim Mikityanskiy <maximmi@nvidia.com> |
net/mlx5e: xsk: Use umr_mode to calculate striding RQ parameters
Instead of passing the unaligned flag, pass an enum that indicates the UMR mode. The next commit will add the third mode (KLM for certain configurations of XSK), which will be added to this enum instead of adding another bool flag everywhere.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
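A sketch of the refactor (the actual enumerator names in the driver may differ):

	enum sketch_umr_mode {
		SKETCH_UMR_MODE_ALIGNED,	/* regular RQs and aligned XSK: MTT */
		SKETCH_UMR_MODE_UNALIGNED,	/* unaligned XSK: KSM */
		SKETCH_UMR_MODE_OVERSIZED,	/* reserved for the KLM mode the next commit adds */
	};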
|
#
5758c314 |
| 30-Sep-2022 |
Maxim Mikityanskiy <maximmi@nvidia.com> |
net/mlx5e: Make the wqe_index_mask calculation more exact
The old calculation of wqe_index_mask may give false positives, i.e. request bulking of pairs of WQEs when not strictly needed, for example, when the first fragment size is equal to the PAGE_SIZE, bulking is not needed, even if the number of fragments is odd.
Make the calculation more exact to cut false positives.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
a064c609 |
| 30-Sep-2022 |
Maxim Mikityanskiy <maximmi@nvidia.com> |
net/mlx5e: Introduce wqe_index_mask for legacy RQ
When fragments of different WQEs share the same page, mlx5e_post_rx_wqes must wait until the old WQE stops using the page, only then the new WQE can allocate the new page. Essentially, it means that if WQE index i is still in use, the allocation must stop before `i % bulk`, where bulk is the number of WQEs that may share the same page.
As bulk is always a power of two, `i % bulk = i & (bulk - 1)`, and the new wqe_index_mask field will be equal to `bulk - 1`.
At the same time, wqe_bulk remains for optimization purposes and stores `max(bulk, 8)`, which allows skipping the allocation until we have at least 8 WQEs free.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
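The arithmetic above in sketch form (struct and field names are placeholders for the driver's frag-info bookkeeping):

	struct sketch_frags_info {
		u32 wqe_index_mask;	/* i & mask == i % bulk, since bulk is a power of two */
		u32 wqe_bulk;		/* allocation batch size, kept >= 8 for efficiency */
	};

	static void sketch_set_bulk_params(struct sketch_frags_info *info, u32 bulk)
	{
		info->wqe_index_mask = bulk - 1;
		info->wqe_bulk = max_t(u32, bulk, 8);
	}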
|
#
0b9c86c7 |
| 29-Sep-2022 |
Maxim Mikityanskiy <maximmi@nvidia.com> |
net/mlx5e: Fix calculations for ICOSQ size
WQEs must not cross page boundaries, they are padded with NOPs if they don't fit the page. mlx5e_mpwrq_total_umr_wqebbs doesn't take into account this padding, risking reserving not enough space.
The padding is not straightforward to add to this calculation, because WQEs of different sizes may be mixed together in the queue. If each page ends with a big WQE that doesn't fit and requires at most its size minus 1 WQEBB of padding, the total space can be much bigger than in case when smaller WQEs take advantage of this padding.
Replace the wrong exact calculation by the following estimation. Each padding can be at most the size of the maximum WQE used in the queue minus one WQEBB. Let's call the rest of the page "useful space". If we divide the total size of all needed WQEs by this useful space, rounding up, we'll get the number of pages, which is enough to contain all these WQEs. It's correct, because every WQE that appeared on the boundary between two blocks of useful space would start in the useful space of one page and end in the padding of the same page, while our estimation reserved space for its tail in the next space, making the estimation not smaller than the real space occupied in the queue.
The code actually uses a looser estimation: instead of taking the maximum size of all used WQE types minus 1 WQEBB, it takes the maximum hardware size of a WQE. It's made for simplicity and extensibility.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
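The page estimation described above, written out as a sketch (names are placeholders; the driver works in WQEBB units):

	static u32 sketch_icosq_pages_needed(u32 total_wqebbs, u32 wqebbs_per_page,
					     u32 max_wqe_wqebbs)
	{
		/* Assume each page can lose at most (max WQE size - 1) WQEBBs to
		 * end-of-page NOP padding; the rest is "useful space". Dividing
		 * the total footprint by the useful space, rounding up, gives a
		 * page count that is never too small.
		 */
		u32 useful_wqebbs = wqebbs_per_page - (max_wqe_wqebbs - 1);

		return DIV_ROUND_UP(total_wqebbs, useful_wqebbs);
	}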
|
#
6470d2e7 |
| 29-Sep-2022 |
Maxim Mikityanskiy <maximmi@nvidia.com> |
net/mlx5e: xsk: Use KSM for unaligned XSK
UMR MTTs used in striding RQ have certain alignment requirements. While it's guaranteed to work when UMR pages are aligned to the UMR page size, in practice it works when UMR pages are aligned to 8 bytes. However, it's still not enough flexibility for the unaligned mode of XSK. This patch leverages KSM to map UMR pages without alignment requirements when unaligned XSK is active. The downside is that KSM entries are twice as big as MTTs, which limits the maximum WQE size, so regular RQs and aligned XSK continue using MTTs.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
fa557335 |
| 29-Sep-2022 |
Maxim Mikityanskiy <maximmi@nvidia.com> |
net/mlx5e: xsk: Use XSK frame size as striding RQ page size
XSK RQs support striding RQ linear mode, but the stride size is always set to PAGE_SIZE. It may be larger than the XSK frame size, unnecessarily reducing the useful space in a WQE, but more importantly causing UMEM data corruption in certain cases.
Normally, stride size bigger than XSK frame size is not a problem if the hardware enforces the MTU. However, traffic between vports skips the hardware MTU check, and oversized packets may be received.
If an oversized packet is bigger than the XSK frame but not bigger than the stride, it will cause overwriting of the adjacent UMEM region. If the packet takes more than one stride, they can be recycled for reuse so it's not a problem when the XSK frame size matches the stride size.
To reduce the impact of the above issue, attempt to use the MTT page size for striding RQ that matches the XSK frame size, allowing to safely use 2048-byte frames on an up-to-date firmware.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
e5a3cc83 |
| 29-Sep-2022 |
Maxim Mikityanskiy <maximmi@nvidia.com> |
net/mlx5e: Use runtime page_shift for striding RQ
This commit allows striding RQ to determine MTT page size at runtime, instead of sticking to the compile-time PAGE_SIZE. This functionality will be used by a following commit that adjusts the MTT page size to the XSK frame size.
Stick with PAGE_SIZE for XSK on legacy RQ, as frag_stride is not used in data path, it only helps calculate how pages are partitioned into fragments, and PAGE_SIZE will ensure each fragment starts at the beginning of a new allocation unit (XSK frame).
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|