Revision tags: v5.15.41, v5.15.40, v5.15.39, v5.15.38, v5.15.37, v5.15.36, v5.15.35, v5.15.34, v5.15.33, v5.15.32, v5.15.31, v5.15.30, v5.15.29, v5.15.28, v5.15.27, v5.15.26, v5.15.25, v5.15.24, v5.15.23, v5.15.22, v5.15.21, v5.15.20, v5.15.19, v5.15.18, v5.15.17, v5.4.173, v5.15.16, v5.15.15, v5.15.10, v5.15.9, v5.15.8, v5.15.7, v5.15.6, v5.15.5, v5.15.4, v5.15.3, v5.15.2, v5.15.1, v5.15, v5.14.14, v5.14.13, v5.14.12, v5.14.11, v5.14.10, v5.14.9, v5.14.8, v5.14.7, v5.14.6, v5.10.67, v5.10.66, v5.14.5, v5.14.4, v5.10.65, v5.14.3, v5.10.64, v5.14.2, v5.10.63, v5.14.1, v5.10.62, v5.14, v5.10.61, v5.10.60, v5.10.53 |
|
#
44da3730 |
| 23-Jul-2021 |
Leon Romanovsky <leonro@nvidia.com> |
RDMA/rdmavt: Decouple QP and SGE lists allocations
The rdmavt QP has fields that are needed for both the control path and the data path. This mixed declaration led to a very specific allocation flow, with kzalloc_node and the SGE list embedded in struct rvt_qp.
This patch splits QP creation in two: regular memory allocation for the control path and specific code for the SGE list, while access to the latter is performed through a dereferenced pointer.
Such a pointer and its context are expected to be in the cache, so the performance difference is expected to be negligible, if any exists.
Link: https://lore.kernel.org/r/f66c1e20ccefba0db3c69c58ca9c897f062b4d1c.1627040189.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
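The split described above can be illustrated with a minimal user-space sketch. It is not the rdmavt code: `struct qp`, `struct sge`, `qp_create()` and `qp_destroy()` are hypothetical stand-ins, and `calloc` replaces the kernel allocators. It only shows the control-path object and the SGE list being allocated separately and linked through a pointer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the real rdmavt types. */
struct sge {
	void *vaddr;
	size_t length;
};

struct qp {
	unsigned int qpn;       /* control-path state */
	unsigned int max_sge;
	struct sge *r_sg_list;  /* data-path SGE list, allocated separately */
};

/* Allocate the QP and its SGE list in two steps; returns NULL on failure. */
static struct qp *qp_create(unsigned int qpn, unsigned int max_sge)
{
	struct qp *qp;

	assert(max_sge > 0);

	qp = calloc(1, sizeof(*qp));
	if (!qp)
		return NULL;

	qp->r_sg_list = calloc(max_sge, sizeof(*qp->r_sg_list));
	if (!qp->r_sg_list) {
		free(qp);
		return NULL;
	}

	qp->qpn = qpn;
	qp->max_sge = max_sge;
	return qp;
}

/* Release both allocations; accepts NULL. */
static void qp_destroy(struct qp *qp)
{
	if (!qp)
		return;
	free(qp->r_sg_list);
	free(qp);
}
```

The cost of this layout is one extra pointer dereference on the data path, which is the trade-off the commit message argues is negligible.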
|
Revision tags: v5.10.52, v5.10.51, v5.10.50, v5.10.49, v5.13, v5.10.46, v5.10.43, v5.10.42, v5.10.41, v5.10.40, v5.10.39, v5.4.119, v5.10.36, v5.10.35, v5.10.34, v5.4.116, v5.10.33, v5.12, v5.10.32, v5.10.31, v5.10.30, v5.10.27, v5.10.26, v5.10.25, v5.10.24, v5.10.23, v5.10.22, v5.10.21, v5.10.20, v5.10.19, v5.4.101, v5.10.18, v5.10.17, v5.11, v5.10.16, v5.10.15, v5.10.14, v5.10, v5.8.17, v5.8.16, v5.8.15, v5.9, v5.8.14, v5.8.13, v5.8.12, v5.8.11, v5.8.10, v5.8.9, v5.8.8, v5.8.7, v5.8.6, v5.4.62, v5.8.5, v5.8.4, v5.4.61, v5.8.3, v5.4.60, v5.8.2, v5.4.59, v5.8.1, v5.4.58, v5.4.57, v5.4.56, v5.8, v5.7.12, v5.4.55, v5.7.11, v5.4.54 |
|
#
54a485e9 |
| 28-Jul-2020 |
Mike Marciniszyn <mike.marciniszyn@intel.com> |
IB/rdmavt: Fix RQ counting issues causing use of an invalid RWQE
The lookaside count is improperly initialized to the size of the Receive Queue with the additional +1. In the traces below, the RQ size is 384, so the count was set to 385.
The lookaside count is then rarely refreshed. Note the high and incorrect count in the trace below:
rvt_get_rwqe: [hfi1_0] wqe ffffc900078e9008 wr_id 55c7206d75a0 qpn c qpt 2 pid 3018 num_sge 1 head 1 tail 0, count 385
rvt_get_rwqe: (hfi1_rc_rcv+0x4eb/0x1480 [hfi1] <- rvt_get_rwqe) ret=0x1
The head and tail values indicate that only one RWQE is posted, although the count says 385, and we correctly return element 0.
The next call to rvt_get_rwqe with the decremented count:
rvt_get_rwqe: [hfi1_0] wqe ffffc900078e9058 wr_id 0 qpn c qpt 2 pid 3018 num_sge 0 head 1 tail 1, count 384
rvt_get_rwqe: (hfi1_rc_rcv+0x4eb/0x1480 [hfi1] <- rvt_get_rwqe) ret=0x1
Note that the RQ is empty (head == tail) yet we return the RWQE at tail 1, which is not valid because of the bogus high count.
Best case, the RWQE has never been posted and the rc logic sees an RWQE that is too small (all zeros) and puts the QP into an error state.
In the worst case, a server slow at posting receive buffers might fool rvt_get_rwqe() into fetching an old RWQE and corrupt memory.
Fix by deleting the faulty initialization code, creating an inline to fetch the posted count, and converting all callers to use the new inline.
Fixes: f592ae3c999f ("IB/rdmavt: Fracture single lock used for posting and processing RWQEs") Link: https://lore.kernel.org/r/20200728183848.22226.29132.stgit@awfm-01.aw.intel.com Reported-by: Zhaojuan Guo <zguo@redhat.com> Cc: <stable@vger.kernel.org> # 5.4.x Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Tested-by: Honggang Li <honli@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
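A rough sketch of the kind of inline the fix describes: derive the posted count from the head and tail indices instead of trusting a stale cached value. The name `rq_posted_count()` and its signature are assumptions made for illustration, not the kernel's actual helper.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of posted entries in a circular queue with `size` slots,
 * computed only from the producer index (head) and the consumer
 * index (tail). Hypothetical helper, not the rdmavt inline.
 */
static uint32_t rq_posted_count(uint32_t size, uint32_t head, uint32_t tail)
{
	uint32_t count;

	assert(size > 0 && head < size && tail < size);

	count = head - tail;         /* wraps modulo 2^32 when head < tail */
	if ((int32_t)count < 0)
		count += size;       /* undo the wrap for a ring of `size` slots */
	return count;
}
```

With the values from the traces above, `head 1 tail 0` gives a count of 1 and `head 1 tail 1` gives 0, so the second call would see an empty queue rather than a bogus count of 384.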
|
Revision tags: v5.7.10, v5.4.53 |
|
#
6bf9d8f6 |
| 19-Jul-2020 |
Leon Romanovsky <leonro@mellanox.com> |
RDMA/include: Replace license text with SPDX tags
The header files in the RDMA subsystem are dual licensed and can be described by a simple SPDX tag, so replace all of them at once, together with making them use the same coding style for header guard defines.
Link: https://lore.kernel.org/r/20200719072521.135260-1-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
Revision tags: v5.4.52, v5.7.9, v5.7.8, v5.4.51, v5.4.50, v5.7.7, v5.4.49, v5.7.6, v5.7.5, v5.4.48, v5.7.4, v5.7.3, v5.4.47, v5.4.46, v5.7.2, v5.4.45, v5.7.1, v5.4.44, v5.7, v5.4.43, v5.4.42, v5.4.41 |
|
#
84e3b19a |
| 11-May-2020 |
Gary Leshner <Gary.S.Leshner@intel.com> |
IB/hfi1: Remove module parameter for KDETH qpns
The module parameter for KDETH qpns is being removed in favor of always using the default value of 0x80 as the qpn prefix. Defines have been added for various KDETH values including the prefix of 0x80. The reserved range now starts at the base value for KDETH qpns (0x80) and extends up to and including the last qpn for other reserved QP prefixed types. Adjust other QP prefixed define names to match KDETH defined names.
Link: https://lore.kernel.org/r/20200511160600.173205.27508.stgit@awfm-01.aw.intel.com Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Gary Leshner <Gary.S.Leshner@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
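To make the prefix idea concrete, here is a small sketch of how a fixed 0x80 prefix could map to a base QP number and a range test. Only the prefix value 0x80 comes from the commit message. The macro names, the 16-bit shift, and the 24-bit QPN width are assumptions for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define KDETH_QP_PREFIX       0x80u  /* from the commit message */
#define KDETH_QP_PREFIX_SHIFT 16     /* assumption: prefix is the top byte of a 24-bit QPN */
#define KDETH_QP_BASE         (KDETH_QP_PREFIX << KDETH_QP_PREFIX_SHIFT)

/* True when the top byte of a 24-bit QPN equals the KDETH prefix. */
static bool qpn_has_kdeth_prefix(uint32_t qpn)
{
	assert(qpn <= 0xFFFFFFu);
	return (qpn >> KDETH_QP_PREFIX_SHIFT) == KDETH_QP_PREFIX;
}
```

Under these assumptions the KDETH range starts at `KDETH_QP_BASE` (0x800000), and a reserved-range check becomes a shift and compare rather than a lookup of a module parameter.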
|
Revision tags: v5.4.40 |
|
#
0cb9e4f9 |
| 07-May-2020 |
Gustavo A. R. Silva <gustavoars@kernel.org> |
IB/rdmavt: Replace zero-length array with flexible-array
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99:
struct foo {
        int stuff;
        struct boo array[];
};
By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on.
Also, notice that dynamic memory allocations won't be affected by this change:
"Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1]
sizeof(flexible-array-member) triggers a warning because flexible array members have incomplete type[1]. There are some instances of code in which the sizeof operator is being erroneously applied to zero-length arrays and the result is zero. Such instances may be hiding some bugs. So, this work (flexible-array member conversions) will also help to get rid of those sorts of issues completely.
This issue was found with the help of Coccinelle.
[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")
Link: https://lore.kernel.org/r/20200507185342.GA14476@embeddedor Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
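As a user-space illustration of why allocations are unaffected, the sketch below allocates the commit message's `struct foo` with a C99 flexible array member. The `value` field of `struct boo` and the `foo_alloc()` helper are hypothetical additions for the example; `calloc` stands in for the kernel allocators.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct boo {
	int value;  /* hypothetical payload */
};

struct foo {
	int stuff;
	struct boo array[];  /* flexible array member: must be the last field */
};

/* The flexible array member adds nothing beyond its starting offset. */
static_assert(sizeof(struct foo) >= offsetof(struct foo, array),
	      "array starts inside or at the end of struct foo");

/* Allocate a struct foo with room for n trailing elements. */
static struct foo *foo_alloc(size_t n)
{
	struct foo *f = calloc(1, sizeof(*f) + n * sizeof(f->array[0]));

	if (f)
		f->stuff = (int)n;
	return f;
}
```

The size expression is the same one a zero-length array version would use, so existing allocation sites keep working after the conversion.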
|
Revision tags: v5.4.39, v5.4.38, v5.4.37, v5.4.36, v5.4.35, v5.4.34, v5.4.33, v5.4.32, v5.4.31, v5.4.30, v5.4.29, v5.6, v5.4.28, v5.4.27, v5.4.26, v5.4.25, v5.4.24, v5.4.23, v5.4.22, v5.4.21, v5.4.20 |
|
#
5b361328 |
| 12-Feb-2020 |
Gustavo A. R. Silva <gustavo@embeddedor.com> |
RDMA: Replace zero-length array with flexible-array member
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99:
struct foo {
        int stuff;
        struct boo array[];
};
By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on.
Also, notice that dynamic memory allocations won't be affected by this change:
"Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1]
This issue was found with the help of Coccinelle.
[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")
Link: https://lore.kernel.org/r/20200213010425.GA13068@embeddedor.com Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> # added a few more
|
Revision tags: v5.4.19, v5.4.18, v5.4.17, v5.4.16, v5.5, v5.4.15, v5.4.14, v5.4.13, v5.4.12, v5.4.11, v5.4.10, v5.4.9, v5.4.8, v5.4.7, v5.4.6 |
|
#
4ad6429d |
| 19-Dec-2019 |
Mike Marciniszyn <mike.marciniszyn@intel.com> |
IB/rdmavt: Correct comments in rdmavt_qp.h header
Comments need to be with the definition of rvt_restart_sge().
Other comments were duplicated in sw/rdmavt/rc.c and were removed.
Fixes: 385156c5f2a6 ("IB/hfi: Move RC functions into a header file") Link: https://lore.kernel.org/r/20191219211934.58387.88014.stgit@awfm-01.aw.intel.com Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
Revision tags: v5.4.5, v5.4.4, v5.4.3, v5.3.15, v5.4.2, v5.4.1, v5.3.14, v5.4, v5.3.13, v5.3.12, v5.3.11, v5.3.10, v5.3.9, v5.3.8, v5.3.7, v5.3.6, v5.3.5, v5.3.4, v5.3.3, v5.3.2, v5.3.1, v5.3 |
|
#
71994354 |
| 11-Sep-2019 |
Kaike Wan <kaike.wan@intel.com> |
IB/{rdmavt, hfi1, qib}: Add a counter for credit waits
This patch adds a counter for credit waits to assist field debugging.
Link: https://lore.kernel.org/r/20190911113047.126040.10857.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
Revision tags: v5.2.14, v5.3-rc8, v5.2.13, v5.2.12, v5.2.11, v5.2.10, v5.2.9, v5.2.8, v5.2.7, v5.2.6, v5.2.5, v5.2.4, v5.2.3, v5.2.2 |
|
#
2b74c878 |
| 15-Jul-2019 |
Kaike Wan <kaike.wan@intel.com> |
IB/hfi1: Unreserve a flushed OPFN request
When an OPFN request is flushed, the request is completed without unreserving itself from the send queue. Subsequently, when a new request is post sent, the following warning will be triggered:
WARNING: CPU: 4 PID: 8130 at rdmavt/qp.c:1761 rvt_post_send+0x72a/0x880 [rdmavt]
Call Trace:
[<ffffffffbbb61e41>] dump_stack+0x19/0x1b
[<ffffffffbb497688>] __warn+0xd8/0x100
[<ffffffffbb4977cd>] warn_slowpath_null+0x1d/0x20
[<ffffffffc01c941a>] rvt_post_send+0x72a/0x880 [rdmavt]
[<ffffffffbb4dcabe>] ? account_entity_dequeue+0xae/0xd0
[<ffffffffbb61d645>] ? __kmalloc+0x55/0x230
[<ffffffffc04e1a4c>] ib_uverbs_post_send+0x37c/0x5d0 [ib_uverbs]
[<ffffffffc04e5e36>] ? rdma_lookup_put_uobject+0x26/0x60 [ib_uverbs]
[<ffffffffc04dbce6>] ib_uverbs_write+0x286/0x460 [ib_uverbs]
[<ffffffffbb6f9457>] ? security_file_permission+0x27/0xa0
[<ffffffffbb641650>] vfs_write+0xc0/0x1f0
[<ffffffffbb64246f>] SyS_write+0x7f/0xf0
[<ffffffffbbb74ddb>] system_call_fastpath+0x22/0x27
This patch fixes the problem by moving rvt_qp_wqe_unreserve() into rvt_qp_complete_swqe() to simplify the code and make it less error-prone.
Fixes: ca95f802ef51 ("IB/hfi1: Unreserve a reserved request when it is completed") Link: https://lore.kernel.org/r/20190715164528.74174.31364.stgit@awfm-01.aw.intel.com Cc: <stable@vger.kernel.org> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
Revision tags: v5.2.1, v5.2, v5.1.16 |
|
#
2b0ad2da |
| 28-Jun-2019 |
Michael J. Ruhl <michael.j.ruhl@intel.com> |
IB/{rdmavt, hfi1, qib}: Add helpers to hide SWQE WR details
Add some helper functions to hide struct rvt_swqe details.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
d310c4bf |
| 28-Jun-2019 |
Michael J. Ruhl <michael.j.ruhl@intel.com> |
IB/{rdmavt, hfi1, qib}: Remove AH refcount for UD QPs
Historically rdmavt destroy_ah() has returned an -EBUSY when the AH has a non-zero reference count. IBTA 11.2.2 notes no such return value or error case:
Output Modifiers:
- Verb results:
  - Operation completed successfully.
  - Invalid HCA handle.
  - Invalid address handle.
ULPs never test for this error and this will leak memory.
The reference count exists to allow for driver independent progress mechanisms to process UD SWQEs in parallel with post sends. The SWQE will hold a reference count until the UD SWQE completes and then drops the reference.
Fix by removing need to reference count the AH. Add a UD specific allocation to each SWQE entry to cache the necessary information for independent progress. Copy the information during the post send processing.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
5136bfea |
| 28-Jun-2019 |
Kamenee Arumugam <kamenee.arumugam@intel.com> |
IB/{hfi1, qib, rdmavt}: Put qp in error state when cq is full
When a completion queue is full, the associated queue pairs are not put into the error state. According to the IBTA specification, this is a violation.
Quote from IBTA spec: C9-218: A Requester Class F error occurs when the CQ is inaccessible or full and an attempt is made to complete a WQE. The Affected QP shall be moved to the error state and affiliated asynchronous errors generated as described in 11.6.3.1 Affiliated Asynchronous Events on page 678. The current WQE and any subsequent WQEs are left in an unknown state.
C11-37: The CI shall generate a CQ Error when a CQ overrun is detected. This condition will result in an Affiliated Asynchronous Error for any associated Work Queues when they attempt to use that CQ. Completions can no longer be added to the CQ. It is not guaranteed that completions present in the CQ at the time the error occurred can be retrieved. Possible causes include a CQ overrun or a CQ protection error.
Put the qp in error state when cq is full. Implement a state called full to continue to put other associated QPs in error state.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Kamenee Arumugam <kamenee.arumugam@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
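The behaviour described above can be sketched roughly as follows. This is a simplified model, not the rdmavt implementation: the types, the `cq_enter()` name, and the sticky `full` flag are assumptions chosen to mirror the "state called full" mentioned in the message, and the asynchronous error events are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum qp_state { QP_RTS, QP_ERR };

/* Hypothetical, simplified completion queue and queue pair. */
struct cq {
	uint32_t head;  /* next slot to write */
	uint32_t tail;  /* next slot the consumer polls */
	uint32_t size;  /* number of slots; one is kept empty */
	bool full;      /* sticky: set on the first overrun */
};

struct qp {
	enum qp_state state;
};

/*
 * Try to add a completion for `qp`. On overrun, or once the CQ has
 * been marked full, the completion is dropped and the QP is moved to
 * the error state. Returns true if the completion was queued.
 */
static bool cq_enter(struct cq *cq, struct qp *qp)
{
	uint32_t next;

	assert(cq->size > 1);
	next = (cq->head + 1) % cq->size;

	if (cq->full || next == cq->tail) {
		cq->full = true;
		qp->state = QP_ERR;
		return false;
	}
	cq->head = next;
	return true;
}
```

Because the flag is sticky, every other QP that later tries to complete on the same CQ is also moved to the error state, which matches the intent stated in the last sentence of the message.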
|
#
f592ae3c |
| 28-Jun-2019 |
Kamenee Arumugam <kamenee.arumugam@intel.com> |
IB/rdmavt: Fracture single lock used for posting and processing RWQEs
Use of a single lock prevents posting and processing of receive work queue entries from progressing simultaneously and hurts overall performance.
Fracture the single lock used for posting and processing Receive Work Queue Entries (RWQEs) to allow the circular buffer to be filled and emptied at the same time. Two new spinlocks, one for the producers and one for the consumers, are used for posting and processing RWQEs simultaneously, and the two indices are defined on two different cache lines. The threshold count is used to avoid reading the other index, which sits in a different cache line, every time.
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Kamenee Arumugam <kamenee.arumugam@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
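The caching idea in the last sentence can be sketched as below. This is a single-threaded simplification with the two spinlocks and the cache-line alignment omitted. `struct rq`, `rq_post()`, `rq_get()` and the `refreshes` counter are hypothetical names, and the sketch refreshes the cached count only when it reaches zero, which is a simplification of whatever threshold the driver uses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical ring; in a real split-lock design head and tail would
 * live on separate cache lines, each guarded by its own lock. */
struct rq {
	uint32_t size;       /* number of slots; one is kept empty */
	uint32_t head;       /* producer index */
	uint32_t tail;       /* consumer index */
	uint32_t count;      /* consumer's cached count of posted entries */
	unsigned refreshes;  /* how often the consumer had to read head */
};

/* Producer side: returns 0 on success, -1 if the ring is full. */
static int rq_post(struct rq *rq)
{
	uint32_t next = (rq->head + 1) % rq->size;

	if (next == rq->tail)
		return -1;
	rq->head = next;
	return 0;
}

/* Consumer side: returns 1 and the slot index, or 0 if the ring is empty. */
static int rq_get(struct rq *rq, uint32_t *slot)
{
	if (rq->count == 0) {
		/* Only here does the consumer touch the producer's index. */
		rq->count = (rq->head + rq->size - rq->tail) % rq->size;
		rq->refreshes++;
		if (rq->count == 0)
			return 0;
	}
	*slot = rq->tail;
	rq->tail = (rq->tail + 1) % rq->size;
	rq->count--;
	return 1;
}
```

Consuming three posted entries reads the producer index once rather than three times. The cached count must always be derived from the indices, never preset to the queue size.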
|
#
dabac6e4 |
| 28-Jun-2019 |
Kamenee Arumugam <kamenee.arumugam@intel.com> |
IB/hfi1: Move receive work queue struct into uapi directory
The rvt_rwqe and rvt_rwq struct elements are shared between rdmavt and the providers but are not in uapi directory. As per the comment in https://marc.info/?l=linux-rdma&m=152296522708522&w=2, The hfi1 driver and the rdma core driver are not using shared structures in the uapi directory.
Move rvt_rwqe and rvt_rwq struct into rvt-abi.h header in uapi directory.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Kamenee Arumugam <kamenee.arumugam@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
239b0e52 |
| 28-Jun-2019 |
Kamenee Arumugam <kamenee.arumugam@intel.com> |
IB/hfi1: Move rvt_cq_wc struct into uapi directory
The rvt_cq_wc struct elements are shared between rdmavt and the providers but not in uapi directory. As per the comment in https://marc.info/?l=linux-rdma&m=152296522708522&w=2 The hfi1 driver and the rdma core driver are not using shared structures in the uapi directory.
In that case, move rvt_cq_wc struct into the rvt-abi.h header file and create a rvt_k_cq_w for the kernel completion queue.
Signed-off-by: Kamenee Arumugam <kamenee.arumugam@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
Revision tags: v5.1.15, v5.1.14, v5.1.13, v5.1.12, v5.1.11, v5.1.10 |
|
#
4a9ceb7d |
| 13-Jun-2019 |
Mike Marciniszyn <mike.marciniszyn@intel.com> |
IB/{rdmavt, qib, hfi1}: Convert to new completion API
Convert all completions to use the new completion routine that fixes a race between post send and completion where fields from a SWQE can be read after SWQE has been freed.
This patch also addresses issues reported in https://marc.info/?l=linux-kernel&m=155656897409107&w=2.
The reserved operation path has no need for any barrier.
The barrier for the other path is addressed by the smp_load_acquire() barrier.
Cc: Andrea Parri <andrea.parri@amarulasolutions.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
f56044d6 |
| 13-Jun-2019 |
Mike Marciniszyn <mike.marciniszyn@intel.com> |
IB/rdmavt: Add new completion inline
There is open-coded send completion logic throughout the drivers.
We need to convert to this routine to enforce ordering for completions. This routine fixes an ordering issue where the read of the SWQE fields necessary for creating the completion can race with a post send if the post send catches a send queue at the edge of being full. It is possible in that situation to read SWQE fields that are being written.
This new routine ensures that SWQE fields are read prior to advancing the index that post send uses to determine queue fullness.
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
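The ordering rule can be sketched with C11 atomics: copy the fields needed for the completion out of the entry first, then publish the advanced index with a release store. This is a user-space illustration under assumed names (`struct sq`, `sq_complete()`, the `last` index); the kernel uses its own barrier primitives rather than `<stdatomic.h>`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SQ_SIZE 4u

/* Hypothetical send work queue entry and work completion. */
struct swqe {
	uint64_t wr_id;
	uint32_t length;
};

struct wc {
	uint64_t wr_id;
	uint32_t byte_len;
};

struct sq {
	struct swqe wqe[SQ_SIZE];
	_Atomic uint32_t last;  /* oldest uncompleted entry; post send reads it */
};

/* Complete the oldest entry: read its fields, then free the slot. */
static void sq_complete(struct sq *sq, struct wc *wc)
{
	uint32_t last = atomic_load_explicit(&sq->last, memory_order_relaxed);

	/* 1. Read everything the completion needs while the slot is still ours. */
	wc->wr_id = sq->wqe[last].wr_id;
	wc->byte_len = sq->wqe[last].length;

	/* 2. Release store: the reads above cannot be reordered after it, so a
	 *    poster that observes the new index may safely overwrite the slot. */
	atomic_store_explicit(&sq->last, (last + 1) % SQ_SIZE,
			      memory_order_release);
}
```

A poster would pair this with an acquire load of `last` before deciding whether the queue is full and reusing a slot.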
|
Revision tags: v5.1.9, v5.1.8, v5.1.7, v5.1.6, v5.1.5, v5.1.4, v5.1.3, v5.1.2, v5.1.1, v5.0.14, v5.1, v5.0.13, v5.0.12, v5.0.11, v5.0.10, v5.0.9, v5.0.8 |
|
#
d40f69c9 |
| 12-Apr-2019 |
Mike Marciniszyn <mike.marciniszyn@intel.com> |
IB/{rdmavt, qib, hfi1}: Use new routine to release reference counts
The reference count adjustments on reference count completion are open coded throughout.
Add a routine to do all reference count adjustments and use it.
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
715ab1a8 |
| 11-Apr-2019 |
Mike Marciniszyn <mike.marciniszyn@intel.com> |
IB/rdmavt: Fix ab/ba include issues
The current include file ordering for rdmavt headers has an ab/ba include issue that precludes using inlines from rdma_vt.h in rdmavt_qp.h.
At the heart of the issue is that rdma_vt.h includes rdmavt_qp.h.
Fix the ordering issue by adjusting rdma_vt.h to not require rdmavt_qp.h and move qp related inlines to rdmavt_qp.h.
Additionally, promote rvt_mmap_info to rdma_vt.h since it is shared by rdmavt_cq.h and rdmavt_qp.h.
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
#
ea752bc5 |
| 11-Apr-2019 |
Kaike Wan <kaike.wan@intel.com> |
IB/{rdmavt, hfi1): Miscellaneous comment fixes
This patch fixes miscellaneous comment errors.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
Revision tags: v5.0.7, v5.0.6, v5.0.5, v5.0.4, v5.0.3, v4.19.29, v5.0.2, v4.19.28, v5.0.1, v4.19.27, v5.0, v4.19.26, v4.19.25, v4.19.24, v4.19.23, v4.19.22, v4.19.21, v4.19.20, v4.19.19, v4.19.18 |
|
#
3c6cb20a |
| 23-Jan-2019 |
Kaike Wan <kaike.wan@intel.com> |
IB/hfi1: Add TID RDMA WRITE functionality into RDMA verbs
This patch integrates TID RDMA WRITE protocol into normal RDMA verbs framework. The TID RDMA WRITE protocol is an end-to-end protocol between the hfi1 drivers on two OPA nodes that converts a qualified RDMA WRITE request into a TID RDMA WRITE request to avoid data copying on the responder side.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
4f9264d1 |
| 23-Jan-2019 |
Kaike Wan <kaike.wan@intel.com> |
IB/hfi1: Add an s_acked_ack_queue pointer
The s_ack_queue is managed by two pointers into the ring: r_head_ack_queue and s_tail_ack_queue. r_head_ack_queue is the index of where the next received request is going to be placed and s_tail_ack_queue is the entry of the request currently being processed. This works perfectly fine for normal Verbs as the requests are processed one at a time and the s_tail_ack_queue is not moved until the request that it points to is fully completed.
In this fashion, s_tail_ack_queue constantly chases r_head_ack_queue and the two pointers can easily be used to determine "queue full" and "queue empty" conditions.
The detection of these two conditions is important in determining when an old entry can safely be overwritten with a newly received request and the resources associated with the old request can safely be released.
When pipelined TID RDMA WRITE is introduced into this mix, things look very different. r_head_ack_queue is still the point at which a newly received request will be inserted, s_tail_ack_queue is still the currently processed request. However, with pipelined TID RDMA WRITE requests, s_tail_ack_queue moves to the next request once all TID RDMA WRITE responses for that request have been sent. The rest of the protocol for a particular request is managed by other pointers specific to TID RDMA - r_tid_tail and r_tid_ack - which point to the entries for which the next TID RDMA DATA packets are going to arrive and the request for which the next TID RDMA ACK packets are to be generated, respectively.
What this means is that entries in the ring, which are "behind" s_tail_ack_queue (entries which s_tail_ack_queue has gone past) are no longer considered complete. This is where the problem is - a newly received request could potentially overwrite a still active TID RDMA WRITE request.
The reason why the TID RDMA pointers trail s_tail_ack_queue is that the normal Verbs send engine uses s_tail_ack_queue as the pointer for the next response. Since TID RDMA WRITE responses are processed by the normal Verbs send engine, s_tail_ack_queue had to be moved to the next entry once all TID RDMA WRITE response packets were sent to get the desired pipelining between requests. Doing otherwise would mean that the normal Verbs send engine would not be able to send the TID RDMA WRITE responses for the next TID RDMA request until the current one is fully completed.
This patch introduces the s_acked_ack_queue index to point to the next request to complete on the responder side. For requests other than TID RDMA WRITE, s_acked_ack_queue should always be kept in sync with s_tail_ack_queue. For TID RDMA WRITE request, it may fall behind s_tail_ack_queue.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
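The three-index arrangement can be sketched as follows. The struct and function names are hypothetical, and the comments map the fields to the indices named in the message. The point is only that the "queue full" test compares against the acked index rather than the tail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the responder's ack ring. */
struct ack_ring {
	uint32_t size;   /* number of slots; one is kept empty */
	uint32_t head;   /* like r_head_ack_queue: next slot for a new request */
	uint32_t tail;   /* like s_tail_ack_queue: request being responded to */
	uint32_t acked;  /* like s_acked_ack_queue: oldest incomplete request */
};

static uint32_t ring_next(const struct ack_ring *r, uint32_t i)
{
	return (i + 1) % r->size;
}

/* Accept a new request only if it would not overwrite an entry that is
 * still incomplete; the check is against acked, not tail. */
static bool ring_accept(struct ack_ring *r)
{
	uint32_t next = ring_next(r, r->head);

	if (next == r->acked)
		return false;
	r->head = next;
	return true;
}
```

If the check compared against `tail` instead, a ring whose tail had run ahead of `acked` would accept a request that overwrites a still active entry, which is the problem the patch describes.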
|
#
039cd3da |
| 23-Jan-2019 |
Kaike Wan <kaike.wan@intel.com> |
IB/hfi1: Increment the retry timeout value for TID RDMA READ request
The RC retry timeout value is based on the estimated time for the response packet to come back. However, for a TID RDMA READ request, due to the use of header suppression, the driver is normally not notified of each incoming response packet until the last TID RDMA READ response packet. Consequently, the retry timeout value should be extended to cover the transaction time for the entire length of a segment (default 256K) instead of that for a single packet. This patch addresses the issue by introducing new retry timer functions to account for multiple packets and wrapper functions for backward compatibility.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
838b6fd2 |
| 23-Jan-2019 |
Kaike Wan <kaike.wan@intel.com> |
IB/hfi1: TID RDMA RcvArray programming and TID allocation
TID entries are used by hfi1 hardware to receive data payload from incoming packets directly into a user buffer and thus avoid data copying by software. This patch implements the functions for TID allocation, freeing, and programming TID RcvArray entries in hardware for kernel clients. TID entries are managed via lists of TID groups, similar to PSM. Furthermore, to track TID resource allocation for each request, software flows are also allocated and freed as needed. Since software flows consume a large amount of memory for tracking TID allocation and freeing, it is generally desirable to allocate them dynamically in the send queue and only for TID RDMA requests, but to pre-allocate them for the receive queue, because the send queue could have thousands of entries while the receive queue has only a limited number of entries.
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
385156c5 |
| 23-Jan-2019 |
Kaike Wan <kaike.wan@intel.com> |
IB/hfi: Move RC functions into a header file
This patch moves some RC helper functions into a header file so that they can be called from both RC and TID RDMA functions. In addition, a common function for rewinding a request is created in rdmavt so that it can be shared between the qib and hfi1 drivers.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|