Revision tags: v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45, v6.1.44, v6.1.43, v6.1.42 |
|
#
e49bad78 |
| 25-Jul-2023 |
Krzysztof Czurylo <krzysztof.czurylo@intel.com> |
RDMA/irdma: Add table based lookup for CQ pointer during an event
Add a CQ table based loookup to allow quick search for CQ pointer having CQ ID in case of CQ related asynchrononous event. The table
RDMA/irdma: Add table based lookup for CQ pointer during an event
Add a CQ table based loookup to allow quick search for CQ pointer having CQ ID in case of CQ related asynchrononous event. The table is implemented in a similar fashion to QP table.
Also add a reference counters for CQ. This is to prevent destroying CQ while an asynchronous event is being processed.
The memory resource table size is sized higher with this update, and this table doesn't need to be physically contiguous, so use a vzalloc vs kzalloc to allocate the table.
Signed-off-by: Krzysztof Czurylo <krzysztof.czurylo@intel.com> Signed-off-by: Sindhu Devale <sindhu.devale@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230725155505.1069-4-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
show more ...
|
Revision tags: v6.1.41, v6.1.40, v6.1.39 |
|
#
f0842bb3 |
| 11-Jul-2023 |
Shiraz Saleem <shiraz.saleem@intel.com> |
RDMA/irdma: Fix data race on CQP request done
KCSAN detects a data race on cqp_request->request_done memory location which is accessed locklessly in irdma_handle_cqp_op while being updated in irdma_
RDMA/irdma: Fix data race on CQP request done
KCSAN detects a data race on cqp_request->request_done memory location which is accessed locklessly in irdma_handle_cqp_op while being updated in irdma_cqp_ce_handler.
Annotate lockless intent with READ_ONCE/WRITE_ONCE to avoid any compiler optimizations like load fusing and/or KCSAN warning.
[222808.417128] BUG: KCSAN: data-race in irdma_cqp_ce_handler [irdma] / irdma_wait_event [irdma]
[222808.417532] write to 0xffff8e44107019dc of 1 bytes by task 29658 on cpu 5: [222808.417610] irdma_cqp_ce_handler+0x21e/0x270 [irdma] [222808.417725] cqp_compl_worker+0x1b/0x20 [irdma] [222808.417827] process_one_work+0x4d1/0xa40 [222808.417835] worker_thread+0x319/0x700 [222808.417842] kthread+0x180/0x1b0 [222808.417852] ret_from_fork+0x22/0x30
[222808.417918] read to 0xffff8e44107019dc of 1 bytes by task 29688 on cpu 1: [222808.417995] irdma_wait_event+0x1e2/0x2c0 [irdma] [222808.418099] irdma_handle_cqp_op+0xae/0x170 [irdma] [222808.418202] irdma_cqp_cq_destroy_cmd+0x70/0x90 [irdma] [222808.418308] irdma_puda_dele_rsrc+0x46d/0x4d0 [irdma] [222808.418411] irdma_rt_deinit_hw+0x179/0x1d0 [irdma] [222808.418514] irdma_ib_dealloc_device+0x11/0x40 [irdma] [222808.418618] ib_dealloc_device+0x2a/0x120 [ib_core] [222808.418823] __ib_unregister_device+0xde/0x100 [ib_core] [222808.418981] ib_unregister_device+0x22/0x40 [ib_core] [222808.419142] irdma_ib_unregister_device+0x70/0x90 [irdma] [222808.419248] i40iw_close+0x6f/0xc0 [irdma] [222808.419352] i40e_client_device_unregister+0x14a/0x180 [i40e] [222808.419450] i40iw_remove+0x21/0x30 [irdma] [222808.419554] auxiliary_bus_remove+0x31/0x50 [222808.419563] device_remove+0x69/0xb0 [222808.419572] device_release_driver_internal+0x293/0x360 [222808.419582] driver_detach+0x7c/0xf0 [222808.419592] bus_remove_driver+0x8c/0x150 [222808.419600] driver_unregister+0x45/0x70 [222808.419610] auxiliary_driver_unregister+0x16/0x30 [222808.419618] irdma_exit_module+0x18/0x1e [irdma] [222808.419733] __do_sys_delete_module.constprop.0+0x1e2/0x310 [222808.419745] __x64_sys_delete_module+0x1b/0x30 [222808.419755] do_syscall_64+0x39/0x90 [222808.419763] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[222808.419829] value changed: 0x01 -> 0x03
Fixes: 915cc7ac0f8e ("RDMA/irdma: Add miscellaneous utility definitions") Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230711175253.1289-4-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
show more ...
|
#
f2c30378 |
| 11-Jul-2023 |
Shiraz Saleem <shiraz.saleem@intel.com> |
RDMA/irdma: Fix data race on CQP completion stats
CQP completion statistics is read lockesly in irdma_wait_event and irdma_check_cqp_progress while it can be updated in the completion thread irdma_s
RDMA/irdma: Fix data race on CQP completion stats
CQP completion statistics is read lockesly in irdma_wait_event and irdma_check_cqp_progress while it can be updated in the completion thread irdma_sc_ccq_get_cqe_info on another CPU as KCSAN reports.
Make completion statistics an atomic variable to reflect coherent updates to it. This will also avoid load/store tearing logic bug potentially possible by compiler optimizations.
[77346.170861] BUG: KCSAN: data-race in irdma_handle_cqp_op [irdma] / irdma_sc_ccq_get_cqe_info [irdma]
[77346.171383] write to 0xffff8a3250b108e0 of 8 bytes by task 9544 on cpu 4: [77346.171483] irdma_sc_ccq_get_cqe_info+0x27a/0x370 [irdma] [77346.171658] irdma_cqp_ce_handler+0x164/0x270 [irdma] [77346.171835] cqp_compl_worker+0x1b/0x20 [irdma] [77346.172009] process_one_work+0x4d1/0xa40 [77346.172024] worker_thread+0x319/0x700 [77346.172037] kthread+0x180/0x1b0 [77346.172054] ret_from_fork+0x22/0x30
[77346.172136] read to 0xffff8a3250b108e0 of 8 bytes by task 9838 on cpu 2: [77346.172234] irdma_handle_cqp_op+0xf4/0x4b0 [irdma] [77346.172413] irdma_cqp_aeq_cmd+0x75/0xa0 [irdma] [77346.172592] irdma_create_aeq+0x390/0x45a [irdma] [77346.172769] irdma_rt_init_hw.cold+0x212/0x85d [irdma] [77346.172944] irdma_probe+0x54f/0x620 [irdma] [77346.173122] auxiliary_bus_probe+0x66/0xa0 [77346.173137] really_probe+0x140/0x540 [77346.173154] __driver_probe_device+0xc7/0x220 [77346.173173] driver_probe_device+0x5f/0x140 [77346.173190] __driver_attach+0xf0/0x2c0 [77346.173208] bus_for_each_dev+0xa8/0xf0 [77346.173225] driver_attach+0x29/0x30 [77346.173240] bus_add_driver+0x29c/0x2f0 [77346.173255] driver_register+0x10f/0x1a0 [77346.173272] __auxiliary_driver_register+0xbc/0x140 [77346.173287] irdma_init_module+0x55/0x1000 [irdma] [77346.173460] do_one_initcall+0x7d/0x410 [77346.173475] do_init_module+0x81/0x2c0 [77346.173491] load_module+0x1232/0x12c0 [77346.173506] __do_sys_finit_module+0x101/0x180 [77346.173522] __x64_sys_finit_module+0x3c/0x50 [77346.173538] do_syscall_64+0x39/0x90 [77346.173553] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[77346.173634] value changed: 0x0000000000000094 -> 0x0000000000000095
Fixes: 915cc7ac0f8e ("RDMA/irdma: Add miscellaneous utility definitions") Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230711175253.1289-3-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
show more ...
|
Revision tags: v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35, v6.1.34, v6.1.33, v6.1.32, v6.1.31, v6.1.30, v6.1.29, v6.1.28, v6.1.27, v6.1.26, v6.3, v6.1.25, v6.1.24, v6.1.23, v6.1.22, v6.1.21, v6.1.20 |
|
#
30ed9ee9 |
| 15-Mar-2023 |
Mustafa Ismail <mustafa.ismail@intel.com> |
RDMA/irdma: Do not generate SW completions for NOPs
Currently, artificial SW completions are generated for NOP wqes which can generate unexpected completions with wr_id = 0. Skip the generation of a
RDMA/irdma: Do not generate SW completions for NOPs
Currently, artificial SW completions are generated for NOP wqes which can generate unexpected completions with wr_id = 0. Skip the generation of artificial completions for NOPs.
Fixes: 81091d7696ae ("RDMA/irdma: Add SW mechanism to generate completions on error") Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230315145231.931-2-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
show more ...
|
#
5a711e58 |
| 15-Mar-2023 |
Krzysztof Czurylo <krzysztof.czurylo@intel.com> |
RDMA/irdma: Refactor HW statistics
Refactor HW statistics which,
- Unifies HW statistics support for all HW generations.
- Unifies support of 32- and 64-bit counters.
- Removes duplicated code an
RDMA/irdma: Refactor HW statistics
Refactor HW statistics which,
- Unifies HW statistics support for all HW generations.
- Unifies support of 32- and 64-bit counters.
- Removes duplicated code and simplifies implementation.
- Fixes roll-over handling.
- Removes unneeded last_hw_stats.
With new implementation, there is no separate handling and no separate arrays for 32- and 64-bit counters (offsets, regs, values). Instead, there is a HW stats map array for each HW revision, which defines HW-specific width and location of each counter in the statistics buffer.
Once the statistics are gathered (either via CQP op, or by reading HW registers), counter values are extracted from the statistics buffer using the stats map and the delta between the last and new values is computed. Finally, the counter values in rdma_hw_stats are incremented by those deltas.
From the OS perspective, all the counters are 64-bit and their order in rdma_hw_stats->value[] array, as well as in irdma_hw_stat_names[], is the same for all HW gens. New statistics should always be added at the end.
Signed-off-by: Krzysztof Czurylo <krzysztof.czurylo@intel.com> Signed-off-by: Youvaraj Sagar <youvaraj.sagar@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230315145305.955-2-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
show more ...
|
Revision tags: v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9, v6.1.8, v6.1.7, v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12, v6.0.11, v6.0.10, v5.15.80, v6.0.9, v5.15.79 |
|
#
24419777 |
| 14-Nov-2022 |
Mustafa Ismail <mustafa.ismail@intel.com> |
RDMA/irdma: Fix RQ completion opcode
The opcode written by HW, in the RQ CQE, is the RoCEv2/iWARP protocol opcode from the received packet and not the SW opcode as currently assumed. Fix this by ret
RDMA/irdma: Fix RQ completion opcode
The opcode written by HW, in the RQ CQE, is the RoCEv2/iWARP protocol opcode from the received packet and not the SW opcode as currently assumed. Fix this by returning the raw operation type and queue type in the CQE to irdma_process_cqe and add 2 helpers set_ib_wc_op_sq set_ib_wc_op_rq to map IRDMA HW op types to IB op types.
Note that for iWARP, only Write with Immediate is supported so the opcode can only be IB_WC_RECV_RDMA_WITH_IMM when there is immediate data present.
Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs") Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20221115011701.1379-3-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
show more ...
|
Revision tags: v6.0.8, v5.15.78, v6.0.7, v5.15.77, v5.15.76, v6.0.6, v6.0.5, v5.15.75, v6.0.4, v6.0.3, v6.0.2, v5.15.74, v5.15.73, v6.0.1, v5.15.72, v6.0, v5.15.71, v5.15.70, v5.15.69, v5.15.68, v5.15.67, v5.15.66 |
|
#
7f51a961 |
| 07-Sep-2022 |
Sindhu-Devale <sindhu.devale@intel.com> |
RDMA/irdma: Align AE id codes to correct flush code and event
A number of asynchronous event (AE) ids were not aligned to the correct flush_code and event_type. Fix these up so that the correct IBV
RDMA/irdma: Align AE id codes to correct flush code and event
A number of asynchronous event (AE) ids were not aligned to the correct flush_code and event_type. Fix these up so that the correct IBV error and event codes are returned to application.
Also, add handling for new AE ids like IRDMA_AE_INVALID_REQUEST to return the correct WC error code.
Fixes: 44d9e52977a1 ("RDMA/irdma: Implement device initialization definitions") Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20220907191324.1173-2-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
show more ...
|
#
6b227bd3 |
| 06-Sep-2022 |
Sindhu-Devale <sindhu.devale@intel.com> |
RDMA/irdma: Return error on MR deregister CQP failure
The MR deregister CQP can fail if an MW is bound to it. Return an appropriate error for this case.
Fixes: b48c24c2d710 ("RDMA/irdma: Implement
RDMA/irdma: Return error on MR deregister CQP failure
The MR deregister CQP can fail if an MW is bound to it. Return an appropriate error for this case.
Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs") Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20220906223244.1119-3-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
show more ...
|
Revision tags: v5.15.65, v5.15.64, v5.15.63 |
|
#
ead54ced |
| 24-Aug-2022 |
Shiraz Saleem <shiraz.saleem@intel.com> |
RDMA/irdma: Fix drain SQ hang with no completion
SW generated completions for outstanding WRs posted on SQ after QP is in error target the wrong CQ. This causes the ib_drain_sq to hang with no compl
RDMA/irdma: Fix drain SQ hang with no completion
SW generated completions for outstanding WRs posted on SQ after QP is in error target the wrong CQ. This causes the ib_drain_sq to hang with no completion.
Fix this to generate completions on the right CQ.
[ 863.969340] INFO: task kworker/u52:2:671 blocked for more than 122 seconds. [ 863.979224] Not tainted 5.14.0-130.el9.x86_64 #1 [ 863.986588] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 863.996997] task:kworker/u52:2 state:D stack: 0 pid: 671 ppid: 2 flags:0x00004000 [ 864.007272] Workqueue: xprtiod xprt_autoclose [sunrpc] [ 864.014056] Call Trace: [ 864.017575] __schedule+0x206/0x580 [ 864.022296] schedule+0x43/0xa0 [ 864.026736] schedule_timeout+0x115/0x150 [ 864.032185] __wait_for_common+0x93/0x1d0 [ 864.037717] ? usleep_range_state+0x90/0x90 [ 864.043368] __ib_drain_sq+0xf6/0x170 [ib_core] [ 864.049371] ? __rdma_block_iter_next+0x80/0x80 [ib_core] [ 864.056240] ib_drain_sq+0x66/0x70 [ib_core] [ 864.062003] rpcrdma_xprt_disconnect+0x82/0x3b0 [rpcrdma] [ 864.069365] ? xprt_prepare_transmit+0x5d/0xc0 [sunrpc] [ 864.076386] xprt_rdma_close+0xe/0x30 [rpcrdma] [ 864.082593] xprt_autoclose+0x52/0x100 [sunrpc] [ 864.088718] process_one_work+0x1e8/0x3c0 [ 864.094170] worker_thread+0x50/0x3b0 [ 864.099109] ? rescuer_thread+0x370/0x370 [ 864.104473] kthread+0x149/0x170 [ 864.109022] ? set_kthread_struct+0x40/0x40 [ 864.114713] ret_from_fork+0x22/0x30
Fixes: 81091d7696ae ("RDMA/irdma: Add SW mechanism to generate completions on error") Link: https://lore.kernel.org/r/20220824154358.117-1-shiraz.saleem@intel.com Reported-by: Kamal Heib <kamalheib1@gmail.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
show more ...
|
Revision tags: v5.15.62, v5.15.61, v5.15.60, v5.15.59, v5.19, v5.15.58, v5.15.57, v5.15.56, v5.15.55, v5.15.54, v5.15.53 |
|
#
36a26d12 |
| 05-Jul-2022 |
Mustafa Ismail <mustafa.ismail@intel.com> |
RDMA/irdma: Make CQP invalid state error non-critical
The invalid state error returned by the Control Queue-Pair (CQP) is not a critical error.
Add it to the irdma_noncrit_err_list and drop reporti
RDMA/irdma: Make CQP invalid state error non-critical
The invalid state error returned by the Control Queue-Pair (CQP) is not a critical error.
Add it to the irdma_noncrit_err_list and drop reporting it as device error message.
Link: https://lore.kernel.org/r/20220705230815.265-4-shiraz.saleem@intel.com Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
show more ...
|
Revision tags: v5.15.52, v5.15.51, v5.15.50, v5.15.49, v5.15.48, v5.15.47, v5.15.46, v5.15.45, v5.15.44, v5.15.43, v5.15.42, v5.18, v5.15.41, v5.15.40, v5.15.39, v5.15.38, v5.15.37, v5.15.36 |
|
#
81091d76 |
| 25-Apr-2022 |
Mustafa Ismail <mustafa.ismail@intel.com> |
RDMA/irdma: Add SW mechanism to generate completions on error
HW flushes after QP in error state is not reliable. This can lead to application hang waiting on a completion for outstanding WRs. I
RDMA/irdma: Add SW mechanism to generate completions on error
HW flushes after QP in error state is not reliable. This can lead to application hang waiting on a completion for outstanding WRs. Implement a SW mechanism to generate completions for any outstanding WR's after the QP is modified to error.
This is accomplished by starting a delayed worker after the QP is modified to error and the HW flush is performed. The worker will generate completions that will be returned to the application when it polls the CQ. This mechanism only applies to Kernel applications.
Link: https://lore.kernel.org/r/20220425181624.1617-1-shiraz.saleem@intel.com Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
show more ...
|
#
1c9043ae |
| 25-Apr-2022 |
Mustafa Ismail <mustafa.ismail@intel.com> |
RDMA/irdma: Fix possible crash due to NULL netdev in notifier
For some net events in irdma_net_event notifier, the netdev can be NULL which will cause a crash in rdma_vlan_dev_real_dev. Fix this by
RDMA/irdma: Fix possible crash due to NULL netdev in notifier
For some net events in irdma_net_event notifier, the netdev can be NULL which will cause a crash in rdma_vlan_dev_real_dev. Fix this by moving all processing to the NETEVENT_NEIGH_UPDATE case where the netdev is guaranteed to not be NULL.
Fixes: 6702bc147448 ("RDMA/irdma: Fix netdev notifications for vlan's") Link: https://lore.kernel.org/r/20220425181703.1634-4-shiraz.saleem@intel.com Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
show more ...
|
Revision tags: v5.15.35, v5.15.34, v5.15.33, v5.15.32, v5.15.31, v5.17, v5.15.30, v5.15.29, v5.15.28, v5.15.27, v5.15.26 |
|
#
6702bc14 |
| 25-Feb-2022 |
Mustafa Ismail <mustafa.ismail@intel.com> |
RDMA/irdma: Fix netdev notifications for vlan's
Currently, events on vlan netdevs are being ignored. Fix this by finding the real netdev and processing the notifications for vlan netdevs.
Fixes: 91
RDMA/irdma: Fix netdev notifications for vlan's
Currently, events on vlan netdevs are being ignored. Fix this by finding the real netdev and processing the notifications for vlan netdevs.
Fixes: 915cc7ac0f8e ("RDMA/irdma: Add miscellaneous utility definitions") Link: https://lore.kernel.org/r/20220225163211.127-2-shiraz.saleem@intel.com Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
show more ...
|
Revision tags: v5.15.25 |
|
#
2c4b14ea |
| 17-Feb-2022 |
Shiraz Saleem <shiraz.saleem@intel.com> |
RDMA/irdma: Remove enum irdma_status_code
Replace use of custom irdma_status_code with linux error codes.
Remove enum irdma_status_code and header in which its defined.
Link: https://lore.kernel.o
RDMA/irdma: Remove enum irdma_status_code
Replace use of custom irdma_status_code with linux error codes.
Remove enum irdma_status_code and header in which its defined.
Link: https://lore.kernel.org/r/20220217151851.1518-2-shiraz.saleem@intel.com Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
show more ...
|
Revision tags: v5.15.24, v5.15.23, v5.15.22, v5.15.21, v5.15.20, v5.15.19, v5.15.18, v5.15.17, v5.4.173, v5.15.16, v5.15.15, v5.16, v5.15.10, v5.15.9, v5.15.8, v5.15.7 |
|
#
10467ce0 |
| 01-Dec-2021 |
Tatyana Nikolova <tatyana.e.nikolova@intel.com> |
RDMA/irdma: Don't arm the CQ more than two times if no CE for this CQ
Completion events (CEs) are lost if the application is allowed to arm the CQ more than two times when no new CE for this CQ has
RDMA/irdma: Don't arm the CQ more than two times if no CE for this CQ
Completion events (CEs) are lost if the application is allowed to arm the CQ more than two times when no new CE for this CQ has been generated by the HW.
Check if arming has been done for the CQ and if not, arm the CQ for any event otherwise promote to arm the CQ for any event only when the last arm event was solicited.
Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs") Link: https://lore.kernel.org/r/20211201231509.1930-2-shiraz.saleem@intel.com Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
show more ...
|
#
117697cc |
| 05-Dec-2021 |
Christophe JAILLET <christophe.jaillet@wanadoo.fr> |
RDMA/irdma: Fix a potential memory allocation issue in 'irdma_prm_add_pble_mem()'
'pchunk->bitmapbuf' is a bitmap. Its size (in number of bits) is stored in 'pchunk->sizeofbitmap'.
When it is alloc
RDMA/irdma: Fix a potential memory allocation issue in 'irdma_prm_add_pble_mem()'
'pchunk->bitmapbuf' is a bitmap. Its size (in number of bits) is stored in 'pchunk->sizeofbitmap'.
When it is allocated, the size (in bytes) is computed by: size_in_bits >> 3
There are 2 issues (numbers bellow assume that longs are 64 bits): - there is no guarantee here that 'pchunk->bitmapmem.size' is modulo BITS_PER_LONG but bitmaps are stored as longs (sizeofbitmap=8 bits will only allocate 1 byte, instead of 8 (1 long))
- the number of bytes is computed with a shift, not a round up, so we may allocate less memory than needed (sizeofbitmap=65 bits will only allocate 8 bytes (i.e. 1 long), when 2 longs are needed = 16 bytes)
Fix both issues by using 'bitmap_zalloc()' and remove the useless 'bitmapmem' from 'struct irdma_chunk'.
While at it, remove some useless NULL test before calling kfree/bitmap_free.
Fixes: 915cc7ac0f8e ("RDMA/irdma: Add miscellaneous utility definitions") Link: https://lore.kernel.org/r/5e670b640508e14b1869c3e8e4fb970d78cbe997.1638692171.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
show more ...
|
Revision tags: v5.15.6, v5.15.5, v5.15.4, v5.15.3, v5.15.2, v5.15.1, v5.15, v5.14.14 |
|
#
fd92213e |
| 19-Oct-2021 |
Jakub Kicinski <kuba@kernel.org> |
RDMA: Constify netdev->dev_addr accesses
netdev->dev_addr will become const soon, make sure drivers propagate the qualifier.
Link: https://lore.kernel.org/r/20211019182604.1441387-4-kuba@kernel.org
RDMA: Constify netdev->dev_addr accesses
netdev->dev_addr will become const soon, make sure drivers propagate the qualifier.
Link: https://lore.kernel.org/r/20211019182604.1441387-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
show more ...
|
Revision tags: v5.14.13, v5.14.12 |
|
#
9d8f247c |
| 11-Oct-2021 |
Zhu Yanjun <yanjun.zhu@linux.dev> |
RDMA/irdma: Remove irdma_cqp_up_map_cmd()
The function irdma_cqp_up_map_cmd() is not used. So remove it.
Link: https://lore.kernel.org/r/20211011110128.4057-5-yanjun.zhu@linux.dev Signed-off-by: Zh
RDMA/irdma: Remove irdma_cqp_up_map_cmd()
The function irdma_cqp_up_map_cmd() is not used. So remove it.
Link: https://lore.kernel.org/r/20211011110128.4057-5-yanjun.zhu@linux.dev Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
show more ...
|
#
16ddcfca |
| 11-Oct-2021 |
Zhu Yanjun <yanjun.zhu@linux.dev> |
RDMA/irdma: Remove irdma_get_hw_addr()
The function irdma_get_hw_addr() is not used. So remove it.
Link: https://lore.kernel.org/r/20211011110128.4057-4-yanjun.zhu@linux.dev Signed-off-by: Zhu Yanj
RDMA/irdma: Remove irdma_get_hw_addr()
The function irdma_get_hw_addr() is not used. So remove it.
Link: https://lore.kernel.org/r/20211011110128.4057-4-yanjun.zhu@linux.dev Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
show more ...
|
#
dc9e4ef6 |
| 07-Sep-2022 |
Sindhu-Devale <sindhu.devale@intel.com> |
RDMA/irdma: Align AE id codes to correct flush code and event
[ Upstream commit 7f51a961f8c6b84752a48e950074a8c4a0808d91 ]
A number of asynchronous event (AE) ids were not aligned to the correct fl
RDMA/irdma: Align AE id codes to correct flush code and event
[ Upstream commit 7f51a961f8c6b84752a48e950074a8c4a0808d91 ]
A number of asynchronous event (AE) ids were not aligned to the correct flush_code and event_type. Fix these up so that the correct IBV error and event codes are returned to application.
Also, add handling for new AE ids like IRDMA_AE_INVALID_REQUEST to return the correct WC error code.
Fixes: 44d9e52977a1 ("RDMA/irdma: Implement device initialization definitions") Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20220907191324.1173-2-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
a61a5582 |
| 25-Apr-2022 |
Mustafa Ismail <mustafa.ismail@intel.com> |
RDMA/irdma: Fix possible crash due to NULL netdev in notifier
commit 1c9043ae0667a43bd87beeebbdd4bed674713629 upstream.
For some net events in irdma_net_event notifier, the netdev can be NULL which
RDMA/irdma: Fix possible crash due to NULL netdev in notifier
commit 1c9043ae0667a43bd87beeebbdd4bed674713629 upstream.
For some net events in irdma_net_event notifier, the netdev can be NULL which will cause a crash in rdma_vlan_dev_real_dev. Fix this by moving all processing to the NETEVENT_NEIGH_UPDATE case where the netdev is guaranteed to not be NULL.
Fixes: 6702bc147448 ("RDMA/irdma: Fix netdev notifications for vlan's") Link: https://lore.kernel.org/r/20220425181703.1634-4-shiraz.saleem@intel.com Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
5e96bb45 |
| 25-Feb-2022 |
Mustafa Ismail <mustafa.ismail@intel.com> |
RDMA/irdma: Fix netdev notifications for vlan's
[ Upstream commit 6702bc14744847842a87fed21a795b6e8bab6965 ]
Currently, events on vlan netdevs are being ignored. Fix this by finding the real netdev
RDMA/irdma: Fix netdev notifications for vlan's
[ Upstream commit 6702bc14744847842a87fed21a795b6e8bab6965 ]
Currently, events on vlan netdevs are being ignored. Fix this by finding the real netdev and processing the notifications for vlan netdevs.
Fixes: 915cc7ac0f8e ("RDMA/irdma: Add miscellaneous utility definitions") Link: https://lore.kernel.org/r/20220225163211.127-2-shiraz.saleem@intel.com Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
76db1531 |
| 01-Dec-2021 |
Tatyana Nikolova <tatyana.e.nikolova@intel.com> |
RDMA/irdma: Don't arm the CQ more than two times if no CE for this CQ
[ Upstream commit 10467ce09fefa2e74359f5b2ab1efb8909402f19 ]
Completion events (CEs) are lost if the application is allowed to
RDMA/irdma: Don't arm the CQ more than two times if no CE for this CQ
[ Upstream commit 10467ce09fefa2e74359f5b2ab1efb8909402f19 ]
Completion events (CEs) are lost if the application is allowed to arm the CQ more than two times when no new CE for this CQ has been generated by the HW.
Check if arming has been done for the CQ and if not, arm the CQ for any event otherwise promote to arm the CQ for any event only when the last arm event was solicited.
Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs") Link: https://lore.kernel.org/r/20211201231509.1930-2-shiraz.saleem@intel.com Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
b260dfed |
| 05-Dec-2021 |
Christophe JAILLET <christophe.jaillet@wanadoo.fr> |
RDMA/irdma: Fix a potential memory allocation issue in 'irdma_prm_add_pble_mem()'
[ Upstream commit 117697cc935b0ab04ec66274d8e64ccfebd7d0d2 ]
'pchunk->bitmapbuf' is a bitmap. Its size (in number o
RDMA/irdma: Fix a potential memory allocation issue in 'irdma_prm_add_pble_mem()'
[ Upstream commit 117697cc935b0ab04ec66274d8e64ccfebd7d0d2 ]
'pchunk->bitmapbuf' is a bitmap. Its size (in number of bits) is stored in 'pchunk->sizeofbitmap'.
When it is allocated, the size (in bytes) is computed by: size_in_bits >> 3
There are 2 issues (numbers bellow assume that longs are 64 bits): - there is no guarantee here that 'pchunk->bitmapmem.size' is modulo BITS_PER_LONG but bitmaps are stored as longs (sizeofbitmap=8 bits will only allocate 1 byte, instead of 8 (1 long))
- the number of bytes is computed with a shift, not a round up, so we may allocate less memory than needed (sizeofbitmap=65 bits will only allocate 8 bytes (i.e. 1 long), when 2 longs are needed = 16 bytes)
Fix both issues by using 'bitmap_zalloc()' and remove the useless 'bitmapmem' from 'struct irdma_chunk'.
While at it, remove some useless NULL test before calling kfree/bitmap_free.
Fixes: 915cc7ac0f8e ("RDMA/irdma: Add miscellaneous utility definitions") Link: https://lore.kernel.org/r/5e670b640508e14b1869c3e8e4fb970d78cbe997.1638692171.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
Revision tags: v5.14.11, v5.14.10, v5.14.9, v5.14.8, v5.14.7, v5.14.6, v5.10.67 |
|
#
5b1e985f |
| 16-Sep-2021 |
Sindhu Devale <sindhu.devale@intel.com> |
RDMA/irdma: Skip CQP ring during a reset
Due to duplicate reset flags, CQP commands are processed during reset.
This leads CQP failures such as below:
irdma0: [Delete Local MAC Entry Cmd Error][o
RDMA/irdma: Skip CQP ring during a reset
Due to duplicate reset flags, CQP commands are processed during reset.
This leads CQP failures such as below:
irdma0: [Delete Local MAC Entry Cmd Error][op_code=49] status=-27 waiting=1 completion_err=0 maj=0x0 min=0x0
Remove the redundant flag and set the correct reset flag so CPQ is paused during reset
Fixes: 8498a30e1b94 ("RDMA/irdma: Register auxiliary driver and implement private channel OPs") Link: https://lore.kernel.org/r/20210916191222.824-2-shiraz.saleem@intel.com Reported-by: LiLiang <liali@redhat.com> Signed-off-by: Sindhu Devale <sindhu.devale@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
show more ...
|