#
2dfdcd88 |
| 19-Aug-2019 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Rename CQE field in Receive trace points
Make the field name the same for all trace points that handle pointers to struct rpcrdma_rep. That makes it easy to grep for matching rep points in
xprtrdma: Rename CQE field in Receive trace points
Make the field name the same for all trace points that handle pointers to struct rpcrdma_rep. That makes it easy to grep for matching rep points in trace output.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
show more ...
|
Revision tags: v5.2.9, v5.2.8, v5.2.7, v5.2.6, v5.2.5, v5.2.4, v5.2.3, v5.2.2, v5.2.1, v5.2, v5.1.16, v5.1.15, v5.1.14, v5.1.13 |
|
#
675dd90a |
| 19-Jun-2019 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Modernize ops->connect
Adapt and apply changes that were made to the TCP socket connect code. See the following commits for details on the purpose of these changes:
Commit 7196dbb02ea0 ("
xprtrdma: Modernize ops->connect
Adapt and apply changes that were made to the TCP socket connect code. See the following commits for details on the purpose of these changes:
Commit 7196dbb02ea0 ("SUNRPC: Allow changing of the TCP timeout parameters on the fly") Commit 3851f1cdb2b8 ("SUNRPC: Limit the reconnect backoff timer to the max RPC message timeout") Commit 02910177aede ("SUNRPC: Fix reconnection timeouts")
Some common transport code is moved to xprt.c to satisfy the code duplication police.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
show more ...
|
#
d8099fed |
| 19-Jun-2019 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Reduce context switching due to Local Invalidation
Since commit ba69cd122ece ("xprtrdma: Remove support for FMR memory registration"), FRWR is the only supported memory registration mode.
xprtrdma: Reduce context switching due to Local Invalidation
Since commit ba69cd122ece ("xprtrdma: Remove support for FMR memory registration"), FRWR is the only supported memory registration mode.
We can take advantage of the asynchronous nature of FRWR's LOCAL_INV Work Requests to get rid of the completion wait by having the LOCAL_INV completion handler take care of DMA unmapping MRs and waking the upper layer RPC waiter.
This eliminates two context switches when local invalidation is necessary. As a side benefit, we will no longer need the per-xprt deferred completion work queue.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
show more ...
|
#
84756894 |
| 19-Jun-2019 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Remove fr_state
Now that both the Send and Receive completions are handled in process context, it is safe to DMA unmap and return MRs to the free or recycle lists directly in the completio
xprtrdma: Remove fr_state
Now that both the Send and Receive completions are handled in process context, it is safe to DMA unmap and return MRs to the free or recycle lists directly in the completion handlers.
Doing this means rpcrdma_frwr no longer needs to track the state of each MR, meaning that a VALID or FLUSHED MR can no longer appear on an xprt's MR free list. Thus there is no longer a need to track the MR's registration state in rpcrdma_frwr.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
show more ...
|
#
05eb06d8 |
| 19-Jun-2019 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Fix occasional transport deadlock
Under high I/O workloads, I've noticed that an RPC/RDMA transport occasionally deadlocks (IOPS goes to zero, and doesn't recover). Diagnosis shows that th
xprtrdma: Fix occasional transport deadlock
Under high I/O workloads, I've noticed that an RPC/RDMA transport occasionally deadlocks (IOPS goes to zero, and doesn't recover). Diagnosis shows that the sendctx queue is empty, but when sendctxs are returned to the queue, the xprt_write_space wake-up never occurs. The wake-up logic in rpcrdma_sendctx_put_locked is racy.
I noticed that both EMPTY_SCQ and XPRT_WRITE_SPACE are implemented via an atomic bit. Just one of those is sufficient. Removing EMPTY_SCQ in favor of the generic bit mechanism makes the deadlock un-reproducible.
Without EMPTY_SCQ, rpcrdma_buffer::rb_flags is no longer used and is therefore removed.
Unfortunately this patch does not apply cleanly to stable. If needed, someone will have to port it and test it.
Fixes: 2fad659209d5 ("xprtrdma: Wait on empty sendctx queue") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
show more ...
|
#
1310051c |
| 19-Jun-2019 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Replace use of xdr_stream_pos in rpcrdma_marshal_req
This is a latent bug. xdr_stream_pos works by subtracting xdr_stream::nwords from xdr_buf::len. But xdr_stream::nwords is not initializ
xprtrdma: Replace use of xdr_stream_pos in rpcrdma_marshal_req
This is a latent bug. xdr_stream_pos works by subtracting xdr_stream::nwords from xdr_buf::len. But xdr_stream::nwords is not initialized by xdr_init_encode().
It works today only because all fields in rpcrdma_req::rl_stream are initialized to zero by rpcrdma_req_create, making the subtraction in xdr_stream_pos always a no-op.
I found this issue via code inspection. It was introduced by commit 39f4cd9e9982 ("xprtrdma: Harden chunk list encoding against send buffer overflow"), but the code has changed enough since then that this fix can't be automatically applied to stable.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
show more ...
|
#
10694ac9 |
| 19-Jun-2019 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Fix a BUG when tracing is enabled with NFSv4.1 on RDMA
A backchannel reply does not set task->tk_client.
Fixes: 0c77668ddb4e ("SUNRPC: Introduce trace points in ... ") Signed-off-by: Chuc
xprtrdma: Fix a BUG when tracing is enabled with NFSv4.1 on RDMA
A backchannel reply does not set task->tk_client.
Fixes: 0c77668ddb4e ("SUNRPC: Introduce trace points in ... ") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
show more ...
|
Revision tags: v5.1.12, v5.1.11, v5.1.10, v5.1.9, v5.1.8, v5.1.7, v5.1.6, v5.1.5, v5.1.4, v5.1.3, v5.1.2, v5.1.1, v5.0.14, v5.1, v5.0.13, v5.0.12, v5.0.11, v5.0.10 |
|
#
17e4c443 |
| 24-Apr-2019 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Trace marshaling failures
Record an event when rpcrdma_marshal_req returns a non-zero return value to help track down why an xprt close might have occurred.
Signed-off-by: Chuck Lever <ch
xprtrdma: Trace marshaling failures
Record an event when rpcrdma_marshal_req returns a non-zero return value to help track down why an xprt close might have occurred.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
show more ...
|
Revision tags: v5.0.9, v5.0.8, v5.0.7, v5.0.6, v5.0.5, v5.0.4, v5.0.3, v4.19.29, v5.0.2, v4.19.28, v5.0.1, v4.19.27, v5.0, v4.19.26, v4.19.25, v4.19.24, v4.19.23, v4.19.22, v4.19.21 |
|
#
0c77668d |
| 11-Feb-2019 |
Chuck Lever <chuck.lever@oracle.com> |
SUNRPC: Introduce trace points in rpc_auth_gss.ko
Add infrastructure for trace points in the RPC_AUTH_GSS kernel module, and add a few sample trace points. These report exceptional or unexpected eve
SUNRPC: Introduce trace points in rpc_auth_gss.ko
Add infrastructure for trace points in the RPC_AUTH_GSS kernel module, and add a few sample trace points. These report exceptional or unexpected events, and observe the assignment of GSS sequence numbers.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
show more ...
|
Revision tags: v4.19.20, v4.19.19, v4.19.18, v4.19.17, v4.19.16, v4.19.15, v4.19.14, v4.19.13, v4.19.12, v4.19.11, v4.19.10, v4.19.9, v4.19.8 |
|
#
07e10308 |
| 07-Dec-2018 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Prevent leak of rpcrdma_rep objects
If a reply has been processed but the RPC is later retransmitted anyway, the req->rl_reply field still contains the only pointer to the old rpcrdma rep.
xprtrdma: Prevent leak of rpcrdma_rep objects
If a reply has been processed but the RPC is later retransmitted anyway, the req->rl_reply field still contains the only pointer to the old rpcrdma rep. When the next reply comes in, the reply handler will stomp on the rl_reply field, leaking the old rep.
A trace event is added to capture such leaks.
This problem seems to be worsened by the restructuring of the RPC Call path in v4.20. Fully addressing this issue will require at least a re-architecture of the disconnect logic, which is not appropriate during -rc.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
show more ...
|
#
53b2c1cb |
| 19-Dec-2018 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Trace mapping, alloc, and dereg failures
These are rare, but can be helpful at tracking down DMAR and other problems.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: An
xprtrdma: Trace mapping, alloc, and dereg failures
These are rare, but can be helpful at tracking down DMAR and other problems.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
show more ...
|
#
395069fc |
| 19-Dec-2018 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Add trace points for calls to transport switch methods
Name them "trace_xprtrdma_op_*" so they can be easily enabled as a group. No trace point is added where the generic layer already has
xprtrdma: Add trace points for calls to transport switch methods
Name them "trace_xprtrdma_op_*" so they can be easily enabled as a group. No trace point is added where the generic layer already has observability.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
show more ...
|
#
aba11831 |
| 19-Dec-2018 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Clean up of xprtrdma chunk trace points
The chunk-related trace points capture nearly the same information as the MR-related trace points.
Also, rename them so globbing can be used to ena
xprtrdma: Clean up of xprtrdma chunk trace points
The chunk-related trace points capture nearly the same information as the MR-related trace points.
Also, rename them so globbing can be used to enable or disable these trace points more easily.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
show more ...
|
#
6ceea368 |
| 19-Dec-2018 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Refactor Receive accounting
Clean up: Divide the work cleanly:
- rpcrdma_wc_receive is responsible only for RDMA Receives - rpcrdma_reply_handler is responsible only for RPC Replies - the
xprtrdma: Refactor Receive accounting
Clean up: Divide the work cleanly:
- rpcrdma_wc_receive is responsible only for RDMA Receives - rpcrdma_reply_handler is responsible only for RPC Replies - the posted send and receive counts both belong in rpcrdma_ep
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
show more ...
|
Revision tags: v4.19.7, v4.19.6, v4.19.5, v4.19.4, v4.18.20, v4.19.3, v4.18.19, v4.19.2, v4.18.18, v4.18.17, v4.19.1, v4.19, v4.18.16, v4.18.15, v4.18.14, v4.18.13, v4.18.12 |
|
#
470443e0 |
| 01-Oct-2018 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Squelch a sparse warning
linux/include/trace/events/rpcrdma.h:501:1: warning: expression using sizeof bool linux/include/trace/events/rpcrdma.h:501:1: warning: odd constant _Bool cast (fff
xprtrdma: Squelch a sparse warning
linux/include/trace/events/rpcrdma.h:501:1: warning: expression using sizeof bool linux/include/trace/events/rpcrdma.h:501:1: warning: odd constant _Bool cast (ffffffffffffffff becomes 1)
Fixes: ab03eff58eb5 ("xprtrdma: Add trace points in RPC Call ... ") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
show more ...
|
#
f9521d53 |
| 01-Oct-2018 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Rename rpcrdma_qp_async_error_upcall
Clean up: Use a function name that is consistent with the RDMA core API and with other consumers. Because this is a function that is invoked from outsi
xprtrdma: Rename rpcrdma_qp_async_error_upcall
Clean up: Use a function name that is consistent with the RDMA core API and with other consumers. Because this is a function that is invoked from outside the rpcrdma.ko module, add an appropriate documenting comment.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
show more ...
|
#
ae38288e |
| 01-Oct-2018 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Rename rpcrdma_conn_upcall
Clean up: Use a function name that is consistent with the RDMA core API and with other consumers. Because this is a function that is invoked from outside the rpc
xprtrdma: Rename rpcrdma_conn_upcall
Clean up: Use a function name that is consistent with the RDMA core API and with other consumers. Because this is a function that is invoked from outside the rpcrdma.ko module, add an appropriate documenting comment.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
show more ...
|
#
d379eaa8 |
| 01-Oct-2018 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Name MR trace events consistently
Clean up the names of trace events related to MRs so that it's easy to enable these with a glob.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Sign
xprtrdma: Name MR trace events consistently
Clean up the names of trace events related to MRs so that it's easy to enable these with a glob.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
show more ...
|
#
61da886b |
| 01-Oct-2018 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Explicitly resetting MRs is no longer necessary
When a memory operation fails, the MR's driver state might not match its hardware state. The only reliable recourse is to dereg the MR. This
xprtrdma: Explicitly resetting MRs is no longer necessary
When a memory operation fails, the MR's driver state might not match its hardware state. The only reliable recourse is to dereg the MR. This is done in ->ro_recover_mr, which then attempts to allocate a fresh MR to replace the released MR.
Since commit e2ac236c0b651 ("xprtrdma: Allocate MRs on demand"), xprtrdma dynamically allocates MRs. It can add more MRs whenever they are needed.
That makes it possible to simply release an MR when a memory operation fails, instead of "recovering" it. It will automatically be replaced by the on-demand MR allocator.
This commit is a little larger than I wanted, but it replaces ->ro_recover_mr, rb_recovery_lock, rb_recovery_worker, and the rb_stale_mrs list with a generic work queue.
Since MRs are no longer orphaned, the mrs_orphaned metric is no longer used.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
show more ...
|
Revision tags: v4.18.11, v4.18.10, v4.18.9, v4.18.7, v4.18.6, v4.18.5, v4.17.18, v4.18.4, v4.18.3, v4.17.17, v4.18.2, v4.17.16, v4.17.15, v4.18.1, v4.18, v4.17.14, v4.17.13, v4.17.12, v4.17.11, v4.17.10, v4.17.9, v4.17.8, v4.17.7, v4.17.6, v4.17.5, v4.17.4, v4.17.3, v4.17.2, v4.17.1, v4.17 |
|
#
8335640c |
| 04-May-2018 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Add trace_xprtrdma_dma_map(mr)
Matches trace_xprtrdma_dma_unmap(mr).
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
bd2abef3 |
| 07-May-2018 |
Chuck Lever <chuck.lever@oracle.com> |
svcrdma: Trace key RDMA API events
This includes: * Posting on the Send and Receive queues * Send, Receive, Read, and Write completion * Connect upcalls * QP errors
Signed-off-by: Chuck Lev
svcrdma: Trace key RDMA API events
This includes: * Posting on the Send and Receive queues * Send, Receive, Read, and Write completion * Connect upcalls * QP errors
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
show more ...
|
#
98895edb |
| 07-May-2018 |
Chuck Lever <chuck.lever@oracle.com> |
svcrdma: Trace key RPC/RDMA protocol events
This includes: * Transport accept and tear-down * Decisions about using Write and Reply chunks * Each RDMA segment that is handled * Whenever an R
svcrdma: Trace key RPC/RDMA protocol events
This includes: * Transport accept and tear-down * Decisions about using Write and Reply chunks * Each RDMA segment that is handled * Whenever an RDMA_ERR is sent
As a clean-up, I've standardized the order of the includes, and removed some now redundant dprintk call sites.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
show more ...
|
#
a7986f09 |
| 04-May-2018 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Remove rpcrdma_ep_{post_recv, post_extra_recv}
Clean up: These functions are no longer used.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schuma
xprtrdma: Remove rpcrdma_ep_{post_recv, post_extra_recv}
Clean up: These functions are no longer used.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
show more ...
|
#
7c8d9e7c |
| 04-May-2018 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Move Receive posting to Receive handler
Receive completion and Reply handling are done by a BOUND workqueue, meaning they run on only one CPU.
Posting receives is currently done in the se
xprtrdma: Move Receive posting to Receive handler
Receive completion and Reply handling are done by a BOUND workqueue, meaning they run on only one CPU.
Posting receives is currently done in the send_request path, which on large systems is typically done on a different CPU than the one handling Receive completions. This results in movement of Receive-related cachelines between the sending and receiving CPUs.
More importantly, it means that currently Receives are posted while the transport's write lock is held, which is unnecessary and costly.
Finally, allocation of Receive buffers is performed on-demand in the Receive completion handler. This helps guarantee that they are allocated on the same NUMA node as the CPU that handles Receive completions.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
show more ...
|
#
0e0b854c |
| 04-May-2018 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Clean up Receive trace points
For clarity, report the posting and completion of Receive CQEs.
Also, the wc->byte_len field contains garbage if wc->status is non-zero, and the vendor error
xprtrdma: Clean up Receive trace points
For clarity, report the posting and completion of Receive CQEs.
Also, the wc->byte_len field contains garbage if wc->status is non-zero, and the vendor error field contains garbage if wc->status is zero. For readability, don't save those fields in those cases.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
show more ...
|