History log of /openbmc/linux/net/sunrpc/xprtrdma/svc_rdma_sendto.c (Results 176 – 200 of 246)
Revision    Date    Author    Comments
# 23262790 07-May-2018 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Clean up Send SGE accounting

Clean up: Since there's already a svc_rdma_op_ctxt being passed
around with the running count of mapped SGEs, drop unneeded
parameters to svc_rdma_post_send_wr().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# f016f305 07-May-2018 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Refactor svc_rdma_dma_map_buf

Clean up: svc_rdma_dma_map_buf does mostly the same thing as
svc_rdma_dma_map_page, so let's fold these together.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 3316f063 07-May-2018 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Persistently allocate and DMA-map Receive buffers

The current Receive path uses an array of pages which are allocated
and DMA mapped when each Receive WR is posted, and then handed off
to the upper layer in rqstp::rq_arg. The page flip releases unused
pages in the rq_pages pagelist. This mechanism introduces a
significant amount of overhead.

So instead, kmalloc the Receive buffer, and leave it DMA-mapped
while the transport remains connected. This confers a number of
benefits:

* Each Receive WR requires only one receive SGE, no matter how large
the inline threshold is. This helps the server-side NFS/RDMA
transport operate on less capable RDMA devices.

* The Receive buffer is left allocated and mapped all the time. This
relieves svc_rdma_post_recv from the overhead of allocating and
DMA-mapping a fresh buffer.

* svc_rdma_wc_receive no longer has to DMA unmap the Receive buffer.
It has to DMA sync only the number of bytes that were received.

* svc_rdma_build_arg_xdr no longer has to free a page in rq_pages
for each page in the Receive buffer, making it a constant-time
function.

* The Receive buffer is now plugged directly into the rq_arg's
head[0].iov_base, and can be larger than a page without spilling
over into rq_arg's page list. This enables simplification of
the RDMA Read path in subsequent patches.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
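
A minimal sketch of the scheme, assuming a simplified context layout
(struct recv_ctxt and the helper names here are illustrative, not the
patch's actual code):

#include <linux/slab.h>
#include <rdma/ib_verbs.h>

struct recv_ctxt {
        struct ib_sge sge;      /* one SGE, whatever the inline size */
        void *buf;              /* kmalloc'd, mapped for the connection's life */
};

static int recv_buf_alloc(struct ib_device *dev, struct ib_pd *pd,
                          struct recv_ctxt *ctxt, size_t inline_size)
{
        ctxt->buf = kmalloc(inline_size, GFP_KERNEL);
        if (!ctxt->buf)
                return -ENOMEM;
        /* Map once; unmap only at transport teardown. */
        ctxt->sge.addr = ib_dma_map_single(dev, ctxt->buf, inline_size,
                                           DMA_FROM_DEVICE);
        if (ib_dma_mapping_error(dev, ctxt->sge.addr)) {
                kfree(ctxt->buf);
                return -EIO;
        }
        ctxt->sge.length = inline_size;
        ctxt->sge.lkey = pd->local_dma_lkey;
        return 0;
}

/* Receive completion: no unmap, just sync the bytes the device wrote. */
static void recv_buf_sync(struct ib_device *dev, struct recv_ctxt *ctxt,
                          u32 byte_len)
{
        ib_dma_sync_single_for_cpu(dev, ctxt->sge.addr, byte_len,
                                   DMA_FROM_DEVICE);
}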


# 3a88092e 07-May-2018 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Preserve Receive buffer until svc_rdma_sendto

Rather than releasing the incoming svc_rdma_recv_ctxt at the end of
svc_rdma_recvfrom, hold onto it until svc_rdma_sendto.

This permits the contents of the Receive buffer to be preserved
through svc_process and then referenced directly in sendto as it
constructs Write and Reply chunks to return to the client.

The real changes will come in subsequent patches.

Note: I cannot use ->xpo_release_rqst for this purpose because that
is called _before_ ->xpo_sendto. svc_rdma_sendto uses information in
the received Call transport header to construct the Reply transport
header, which is preserved in the RPC's Receive buffer.

The historical comment in svc_send() isn't helpful: it is already
obvious that ->xpo_release_rqst is being called before ->xpo_sendto,
but there is no explanation for this ordering going back to the
beginning of the git era.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
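
A condensed sketch of the new lifetime, using svc_rqst's generic
rq_xprt_ctxt pointer; the surrounding code is abbreviated and the
variable names are illustrative:

/* svc_rdma_recvfrom(): hand the Receive context to the RPC layer
 * rather than releasing it here. */
rqstp->rq_xprt_ctxt = ctxt;

/* svc_rdma_sendto(): the Call's transport header is still intact in
 * the Receive buffer, so the Reply header can be built from it; the
 * context goes back to the transport only after the Reply exists. */
struct svc_rdma_recv_ctxt *rctxt = rqstp->rq_xprt_ctxt;
/* ... encode the Reply transport header from rctxt's buffer ... */
rqstp->rq_xprt_ctxt = NULL;
/* ... release rctxt back to the transport ... */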


# ecf85b23 07-May-2018 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Introduce svc_rdma_recv_ctxt

svc_rdma_op_ctxt's are pre-allocated and maintained on a per-xprt
free list. This eliminates the overhead of calling kmalloc / kfree,
both of which grab a globally shared lock that disables interrupts.
To reduce contention further, separate the use of these objects in
the Receive and Send paths in svcrdma.

Subsequent patches will take advantage of this separation by
allocating real resources which are then cached in these objects.
The allocations are freed when the transport is torn down.

I've renamed the structure so that static type checking can be used
to ensure that uses of op_ctxt and recv_ctxt are not confused. As an
additional clean up, structure fields are renamed to conform with
kernel coding conventions.

As a final clean up, helpers related to recv_ctxt are moved closer
to the functions that use them.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
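
A sketch of the free-list pattern described above; field and function
names are illustrative, not the patch's:

#include <linux/list.h>
#include <linux/slab.h>
#include <linux/spinlock.h>

struct recv_ctxt {
        struct list_head free;
        /* ... resources cached across uses ... */
};

struct xprt_ctxts {
        spinlock_t lock;
        struct list_head free_ctxts;
};

static struct recv_ctxt *recv_ctxt_get(struct xprt_ctxts *x)
{
        struct recv_ctxt *ctxt = NULL;

        spin_lock(&x->lock);
        if (!list_empty(&x->free_ctxts)) {
                ctxt = list_first_entry(&x->free_ctxts,
                                        struct recv_ctxt, free);
                list_del(&ctxt->free);
        }
        spin_unlock(&x->lock);
        if (!ctxt)      /* slow path only; the list is normally primed */
                ctxt = kzalloc(sizeof(*ctxt), GFP_KERNEL);
        return ctxt;
}

static void recv_ctxt_put(struct xprt_ctxts *x, struct recv_ctxt *ctxt)
{
        spin_lock(&x->lock);
        list_add(&ctxt->free, &x->free_ctxts);
        spin_unlock(&x->lock);
}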


# bd2abef3 07-May-2018 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Trace key RDMA API events

This includes:
* Posting on the Send and Receive queues
* Send, Receive, Read, and Write completion
* Connect upcalls
* QP errors

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 98895edb 07-May-2018 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Trace key RPC/RDMA protocol events

This includes:
* Transport accept and tear-down
* Decisions about using Write and Reply chunks
* Each RDMA segment that is handled
* Whenever an RDMA_ERR is sent

As a clean-up, I've standardized the order of the includes, and
removed some now redundant dprintk call sites.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# bcf3ffd4 07-May-2018 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Add proper SPDX tags for NetApp-contributed source

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


Revision tags: v4.16, v4.15
# 48272502 03-Jan-2018 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Post Receives in the Receive completion handler

This change improves Receive efficiency by posting Receives only
on the same CPU that handles Receive completion. Improved latency
and throughput has been noted with this change.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
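
In outline, the completion handler now replenishes the Receive queue
itself; the handler body is a sketch, and the posting helper's name
and GFP flag are assumptions:

static void wc_receive(struct ib_cq *cq, struct ib_wc *wc)
{
        struct svcxprt_rdma *rdma = cq->cq_context;

        if (wc->status != IB_WC_SUCCESS)
                goto flushed;

        /* Post the replacement Receive on this CPU, while its caches
         * are hot, instead of from whichever CPU sends the next reply. */
        if (svc_rdma_post_recv(rdma, GFP_ATOMIC))
                goto flushed;

        /* ... queue the completed Receive for svc_rdma_recvfrom() ... */
        return;

flushed:
        set_bit(XPT_CLOSE, &rdma->sc_xprt.xpt_flags);
}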


Revision tags: v4.13.16, v4.14, v4.13.5, v4.13
# b20dae70 13-Jul-2017 Colin Ian King <colin.king@canonical.com>

svcrdma: fix an incorrect check on -E2BIG and -EINVAL

The current check will always be true and will always jump to
err1; this looks dubious to me. I believe && should be used
instead of ||.

Detected by CoverityScan, CID#1450120 ("Logically Dead Code")

Fixes: 107c1d0a991a ("svcrdma: Avoid Send Queue overflow")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
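
The flaw is visible in isolation: with two distinct constants,
ret != A || ret != B is always true. Condensed before and after:

/* Before: tautology, so every error value jumped to err1. */
if (ret != -E2BIG || ret != -EINVAL)
        goto err1;

/* After: take the generic error path only when ret is neither value. */
if (ret != -E2BIG && ret != -EINVAL)
        goto err1;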


Revision tags: v4.12
# 107c1d0a 23-Jun-2017 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Avoid Send Queue overflow

Sanity check: Catch the case where more Work Requests are being
posted to the Send Queue than there are Send Queue Entries.

This might happen if a client sends a chunk with more segments than
there are SQEs for the transport. The server can't send that reply,
so the transport will deadlock unless the client drops the RPC.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
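
The check amounts to counting the WR chain before posting; a sketch,
with an illustrative function name:

static int check_sq_fits(struct ib_send_wr *first, u32 sq_depth)
{
        struct ib_send_wr *wr;
        u32 count = 0;

        for (wr = first; wr; wr = wr->next)
                count++;
        /* A chain longer than the whole SQ can never be posted, so
         * waiting for free SQEs would block forever: fail instead. */
        return count <= sq_depth ? 0 : -EINVAL;
}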


# 91a08eae 23-Jun-2017 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Squelch disconnection messages

The server displays "svcrdma: failed to post Send WR (-107)" in the
kernel log when the client disconnects. This could flood the server's
log, so remove the message.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


Revision tags: v4.10.17, v4.10.16, v4.10.15, v4.10.14, v4.10.13, v4.10.12, v4.10.11, v4.10.10
# 2cf32924 09-Apr-2017 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Remove the req_map cache

req_maps are no longer used by the send path and can thus be removed.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 4757d90b 09-Apr-2017 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Report Write/Reply chunk overruns

Observed at Connectathon 2017.

If a client has underestimated the size of a Write or Reply chunk,
the Linux server writes as much payload data as it can, then it
recognizes there was a problem and closes the connection without
sending the transport header.

This creates a couple of problems:

<> The client never receives indication of the server-side failure,
so it continues to retransmit the bad RPC. Forward progress on
the transport is blocked.

<> The reply payload pages are not moved out of the svc_rqst, thus
they can be released by the RPC server before the RDMA Writes
have completed.

The new rdma_rw-ized helpers return a distinct error code when a
Write/Reply chunk overrun occurs, so it's now easy for the caller
(svc_rdma_sendto) to recognize this case.

Instead of dropping the connection, post an RDMA_ERROR message. The
client now sees an RDMA_ERROR and can properly terminate the RPC
transaction.

As part of the new logic, set up the same delayed release for these
payload pages as would have occurred in the normal case.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
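
Condensed, the new caller-side handling reads roughly as below; error
labels are trimmed, and the error-reply helper name is hypothetical,
not the patch's:

ret = svc_rdma_send_write_chunk(rdma, wr_ch, &rqstp->rq_res);
if (ret < 0) {
        if (ret != -E2BIG)
                goto err_close;                 /* real transport failure */
        /* Chunk overrun: post an RDMA_ERROR so the client can end
         * the RPC, and give the payload pages the same delayed
         * release as the success path so they outlive the Writes. */
        svc_rdma_post_error(rdma, rdma_argp);   /* hypothetical helper */
}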


# 6b19cc5c 09-Apr-2017 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Clean up RDMA_ERROR path

Now that svc_rdma_sendto has been renovated, svc_rdma_send_error can
be refactored to reduce code duplication and remove C structure-
based XDR encoding. It is also relocated to the source file that
contains its only caller.

This is a refactoring change only.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 9a6a180b 09-Apr-2017 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Use rdma_rw API in RPC reply path

The current svcrdma sendto code path posts one RDMA Write WR at a
time. Each of these Writes typically carries a small number of pages
(for instance, up to 30 pages for mlx4 devices). That means a 1MB
NFS READ reply requires 9 ib_post_send() calls for the Write WRs,
and one for the Send WR carrying the actual RPC Reply message.

Instead, use the new rdma_rw API. The details of Write WR chain
construction and memory registration are taken care of in the RDMA
core. svcrdma can focus on the details of the RPC-over-RDMA
protocol. This gives three main benefits:

1. All Write WRs for one RDMA segment are posted in a single chain,
so as few as one ib_post_send() is needed for each Write chunk.

2. The Write path can now use FRWR to register the Write buffers.
If the device's maximum page list depth is large, this means a
single Write WR is needed for each RPC's Write chunk data.

3. The new code introduces support for RPCs that carry both a Write
list and a Reply chunk. This combination can be used for an NFSv4
READ where the data payload is large, and thus is removed from the
Payload Stream, but the Payload Stream is still larger than the
inline threshold.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
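
The shape of the new path, using the rdma_rw core's entry points; the
wrapper below is a sketch, not the patch's code:

#include <rdma/rw.h>

static int write_one_chunk_segment(struct ib_qp *qp, u8 port_num,
                                   struct rdma_rw_ctx *ctx,
                                   struct scatterlist *sgl, u32 sg_cnt,
                                   u64 remote_addr, u32 rkey,
                                   struct ib_cqe *cqe)
{
        int ret;

        /* The core builds the Write WR chain, using FRWR when the
         * device prefers registration over long page lists. */
        ret = rdma_rw_ctx_init(ctx, qp, port_num, sgl, sg_cnt, 0,
                               remote_addr, rkey, DMA_TO_DEVICE);
        if (ret < 0)
                return ret;

        /* One ib_post_send() for the whole chain. */
        return rdma_rw_ctx_post(ctx, qp, port_num, cqe, NULL);
}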


# c238c4c0 09-Apr-2017 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Clean up svc_rdma_get_inv_rkey()

Replace C structure-based XDR decoding with more portable code that
instead uses pointer arithmetic.

This is a refactoring change only.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
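
The pattern in question, sketched against the RPC-over-RDMA header's
word layout (an illustration, not the function's actual body):

#include <linux/sunrpc/xdr.h>

static u32 first_read_seg_handle(__be32 *p)
{
        p += 4;                 /* skip xid, vers, credits, proc */
        if (*p++ == xdr_zero)   /* Read list empty? */
                return 0;
        p++;                    /* skip the entry's position field */
        return be32_to_cpup(p); /* the segment's handle (rkey) */
}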


# c55ab070 09-Apr-2017 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Add helper to save pages under I/O

Clean up: extract the logic to save pages under I/O into a helper to
add a big documenting comment without adding clutter in the send
path.

This is a refactoring change only.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 6e6092ca 09-Apr-2017 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Add svc_rdma_map_reply_hdr()

Introduce a helper to DMA-map a reply's transport header before
sending it. This will in part replace the map vector cache.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 17f5f7f5 09-Apr-2017 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Move send_wr to svc_rdma_op_ctxt

Clean up: Move the ib_send_wr off the stack, and move common code
to post a Send Work Request into a helper.

This is a refactoring change only.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


Revision tags: v4.10.9, v4.10.8, v4.10.7, v4.10.6, v4.10.5, v4.10.4, v4.10.3, v4.10.2, v4.10.1, v4.10
# 98fc21d3 07-Feb-2017 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Clean up RPC-over-RDMA Reply header encoder

Replace C structure-based XDR encoding with pointer arithmetic.
Pointer arithmetic is considered more portable, and is used
throughout the kernel's existing XDR encoders. The gcc optimizer
generates similar assembler code either way.

Byte-swapping before a memory store on x86 typically results in an
instruction pipeline stall. Avoid byte-swapping when encoding a new
header.

svcrdma currently doesn't alter a connection's credit grant value
after the connection has been accepted, so it is effectively a
constant. Cache the byte-swapped value in a separate field.

Christoph suggested pulling the header encoding logic into the only
function that uses it.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
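
In outline (sc_fc_credits is the field this patch describes; the
surrounding variables are illustrative and the code is condensed):

/* At accept time: the grant never changes afterwards, so byte-swap
 * once and cache the wire-format value. */
rdma->sc_fc_credits = cpu_to_be32(credit_grant);

/* Per-reply encoder hot path: word copies and pre-swapped values,
 * no byte swap (and no x86 pipeline stall) per reply. */
*p++ = *rdma_argp;              /* XID, copied in wire order from the Call */
*p++ = *(rdma_argp + 1);        /* version, also already big-endian */
*p++ = rdma->sc_fc_credits;     /* cached big-endian credit grant */
*p++ = rdma_msg;                /* pre-swapped message-type constant */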


# cbaf5803 07-Feb-2017 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Another sendto chunk list parsing update

Commit 5fdca6531434 ("svcrdma: Renovate sendto chunk list parsing")
missed a spot. svc_rdma_xdr_get_reply_hdr_len() also assumes the
Write list has only one Write chunk. There's no harm in making this
code more general.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


Revision tags: v4.9
# fafedf81 29-Nov-2016 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Further clean-up of svc_rdma_get_inv_rkey()

No longer any need for the dprintk().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# e4eb42ce 29-Nov-2016 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Remove BH-disabled spin locking in svc_rdma_send()

svcrdma's current SQ accounting algorithm takes sc_lock and disables
bottom-halves while posting all RDMA Read, Write, and Send WRs.

This is relatively heavyweight serialization. And note that Write and
Send are already fully serialized by the xpt_mutex.

Using a single atomic_t should be all that is necessary to guarantee
that ib_post_send() is called only when there is enough space on the
send queue. This is what the other RDMA-enabled storage targets do.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
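
A sketch of single-atomic SQE accounting; the wrapper functions are
illustrative:

#include <linux/atomic.h>

/* Reserve SQEs with one atomic op; no sc_lock, no BH disabling. */
static int reserve_sqes(atomic_t *sq_avail, int needed)
{
        if (atomic_sub_return(needed, sq_avail) < 0) {
                atomic_add(needed, sq_avail);   /* undo; caller waits */
                return -EAGAIN;
        }
        return 0;       /* now safe to call ib_post_send() */
}

/* The Send completion handler gives the entries back. */
static void release_sqes(atomic_t *sq_avail, int count)
{
        atomic_add(count, sq_avail);
}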


# 5fdca653 29-Nov-2016 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Renovate sendto chunk list parsing

The current sendto code appears to support clients that provide only
one of a Read list, a Write list, or a Reply chunk. My reading of
that code is that it doesn't support the following cases:

- Read list + Write list
- Read list + Reply chunk
- Write list + Reply chunk
- Read list + Write list + Reply chunk

The protocol allows more than one Read or Write chunk in those
lists. Some clients do send a Read list and Reply chunk
simultaneously. NFSv4 WRITE uses a Read list for the data payload,
and a Reply chunk because the GETATTR result in the reply can
contain a large object like an ACL.

Generalize one of the sendto code paths needed to support all of
the above cases, and attempt to ensure that only one pass is done
through the RPC Call's transport header to gather chunk list
information for building the reply.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
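
One pass means walking the header's three lists in wire order; a
sketch following the RPC-over-RDMA XDR, where a plain segment is four
XDR words (handle, length, and an 8-byte offset) and a Read list
entry adds a position word:

#include <linux/sunrpc/xdr.h>

static void find_chunk_lists(__be32 *p, __be32 **write_list,
                             __be32 **reply_chunk)
{
        p += 4;                         /* xid, vers, credits, proc */

        while (*p++ != xdr_zero)        /* Read list entries */
                p += 1 + 4;             /* position + one segment */

        *write_list = (*p != xdr_zero) ? p : NULL;
        while (*p++ != xdr_zero) {      /* Write list: counted arrays */
                u32 segs = be32_to_cpup(p++);

                p += segs * 4;
        }

        *reply_chunk = (*p != xdr_zero) ? p : NULL;
}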

