History log of /openbmc/linux/net/sunrpc/xprtrdma/svc_rdma_sendto.c (Results 1 – 25 of 246)
Revision Date Author Comments
Revision tags: v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45, v6.1.44, v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35, v6.1.34
# 91f8ce28 12-Jun-2023 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Convert "might sleep" comment into a code annotation

Try to catch incorrect calling contexts mechanically rather than by
code review.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Acked-by: Tom Talpey <tom@talpey.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
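
A minimal sketch of the annotation this refers to; the function name
and body are illustrative, not the patch itself. With
CONFIG_DEBUG_ATOMIC_SLEEP enabled, might_sleep() splats whenever the
function is entered from atomic context:

#include <linux/kernel.h>

/* Illustrative only: a function that can block, annotated so that
 * incorrect (atomic-context) callers are caught mechanically at
 * run time rather than by code review.
 */
static void *example_send_ctxt_get(void)
{
        might_sleep();

        /* ... allocation that may block under GFP_KERNEL ... */
        return NULL;    /* placeholder */
}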


# 5581cf8e 12-Jun-2023 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Optimize page release in svc_rdma_sendto()

Now that we have bulk page allocation and release APIs, it's more
efficient to use those than it is for nfsd threads to wait for send
completions. Previous patches have eliminated the calls to
wait_for_completion() and complete(), in order to avoid scheduler
overhead.

Now release pages-under-I/O in the send completion handler using
the efficient bulk release API.

I've measured a 7% reduction in cumulative CPU utilization in
svc_rdma_sendto(), svc_rdma_wc_send(), and svc_xprt_release(). In
particular, using release_pages() instead of complete() cuts the
time per svc_rdma_wc_send() call by two-thirds. This helps improve
scalability because svc_rdma_wc_send() is single-threaded per
connection.

Reviewed-by: Tom Talpey <tom@talpey.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
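
A minimal sketch of the bulk-release pattern described above; the
struct layout and names are illustrative assumptions, not the patch's
actual types. release_pages() frees an array of pages in one call,
avoiding a loop of individual put_page() calls:

#include <linux/mm.h>   /* release_pages() */

struct example_send_ctxt {
        int             sc_page_count;
        struct page     *sc_pages[17];
};

/* Send completion handler path: bulk-release the pages that backed
 * the just-transmitted RPC Reply.
 */
static void example_wc_send_release(struct example_send_ctxt *ctxt)
{
        if (ctxt->sc_page_count) {
                release_pages(ctxt->sc_pages, ctxt->sc_page_count);
                ctxt->sc_page_count = 0;
        }
}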


# c4b50cdf 12-Jun-2023 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Revert 2a1e4f21d841 ("svcrdma: Normalize Send page handling")

Get rid of the completion wait in svc_rdma_sendto(), and release
pages in the send completion handler again. A subsequent patch will
handle releasing those pages more efficiently.

Reverted by hand: patch -R would not apply cleanly.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# a944209c 12-Jun-2023 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Revert 579900670ac7 ("svcrdma: Remove unused sc_pages field")

Pre-requisite for releasing pages in the send completion handler.
Reverted by hand: patch -R would not apply cleanly.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 6be7afcd 12-Jun-2023 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Revert cc93ce9529a6 ("svcrdma: Retain the page backing rq_res.head[0].iov_base")

Pre-requisite for releasing pages in the send completion handler.
Reverted by hand: patch -R would not apply cleanly.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


Revision tags: v6.1.33
# ed51b426 05-Jun-2023 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Clean up allocation of svc_rdma_send_ctxt

The physical device's favored NUMA node ID is available when
allocating a send_ctxt. Use that value instead of relying on the
assumption that the memory allocation happens to be running on a
node close to the device.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
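
A sketch of the allocation pattern this describes, assuming it uses
ibdev_to_node() from <rdma/ib_verbs.h>; the function name and size
parameter are illustrative:

#include <linux/slab.h>
#include <rdma/ib_verbs.h>

/* Allocate a send_ctxt on the NUMA node the RDMA device favors,
 * instead of whichever node the allocating thread happens to be
 * running on.
 */
static void *example_alloc_send_ctxt(struct ib_device *device, size_t size)
{
        return kzalloc_node(size, GFP_KERNEL, ibdev_to_node(device));
}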


Revision tags: v6.1.32, v6.1.31, v6.1.30, v6.1.29, v6.1.28, v6.1.27, v6.1.26, v6.3, v6.1.25, v6.1.24, v6.1.23, v6.1.22, v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9, v6.1.8, v6.1.7, v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12, v6.0.11, v6.0.10, v5.15.80, v6.0.9, v5.15.79, v6.0.8, v5.15.78, v6.0.7, v5.15.77, v5.15.76, v6.0.6, v6.0.5, v5.15.75, v6.0.4, v6.0.3, v6.0.2, v5.15.74, v5.15.73, v6.0.1, v5.15.72, v6.0, v5.15.71, v5.15.70, v5.15.69, v5.15.68, v5.15.67, v5.15.66, v5.15.65, v5.15.64, v5.15.63, v5.15.62, v5.15.61, v5.15.60, v5.15.59, v5.19, v5.15.58, v5.15.57, v5.15.56, v5.15.55, v5.15.54, v5.15.53, v5.15.52, v5.15.51, v5.15.50, v5.15.49, v5.15.48, v5.15.47, v5.15.46, v5.15.45, v5.15.44, v5.15.43, v5.15.42, v5.18, v5.15.41, v5.15.40, v5.15.39, v5.15.38, v5.15.37, v5.15.36, v5.15.35, v5.15.34, v5.15.33, v5.15.32, v5.15.31, v5.17, v5.15.30, v5.15.29, v5.15.28, v5.15.27, v5.15.26, v5.15.25, v5.15.24, v5.15.23, v5.15.22, v5.15.21, v5.15.20, v5.15.19, v5.15.18, v5.15.17, v5.4.173, v5.15.16, v5.15.15, v5.16, v5.15.10, v5.15.9, v5.15.8, v5.15.7, v5.15.6, v5.15.5, v5.15.4, v5.15.3, v5.15.2, v5.15.1, v5.15, v5.14.14, v5.14.13, v5.14.12, v5.14.11, v5.14.10
# eef2d8d4 04-Oct-2021 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Split the svcrdma_wc_send() tracepoint

There are currently three separate purposes being served by a single
tracepoint here. They need to be split up.

svcrdma_wc_send:
- status is always zero, so there's no value in recording it.
- vendor_err is meaningless unless status is not zero, so
there's no value in recording it.
- This tracepoint is needed only when developing modifications,
so it should be left disabled most of the time.

svcrdma_wc_send_flush:
- As above, needed only rarely, and not an error.

svcrdma_wc_send_err:
- This tracepoint can be left persistently enabled because
completion errors are run-time problems (except for FLUSHED_ERR).
- Tracepoint name now ends in _err to reflect its purpose.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
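
A sketch of how a Send completion handler might dispatch the three
tracepoints after the split; the tracepoint calls are replaced by
local stand-ins with their argument lists elided, and the surrounding
code is an illustrative assumption:

#include <rdma/ib_verbs.h>

/* Stand-ins for the three tracepoints (arguments elided here). */
static void example_trace_wc_send(void) { }
static void example_trace_wc_send_flush(void) { }
static void example_trace_wc_send_err(void) { }

static void example_wc_send(struct ib_wc *wc)
{
        if (likely(wc->status == IB_WC_SUCCESS)) {
                example_trace_wc_send();        /* dev-time only */
                return;
        }
        if (wc->status == IB_WC_WR_FLUSH_ERR)
                example_trace_wc_send_flush();  /* flush: not an error */
        else
                example_trace_wc_send_err();    /* worth leaving enabled */
}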


Revision tags: v5.14.9, v5.14.8, v5.14.7, v5.14.6, v5.10.67, v5.10.66, v5.14.5, v5.14.4, v5.10.65, v5.14.3, v5.10.64, v5.14.2, v5.10.63, v5.14.1, v5.10.62, v5.14, v5.10.61, v5.10.60, v5.10.53, v5.10.52, v5.10.51, v5.10.50, v5.10.49, v5.13, v5.10.46, v5.10.43, v5.10.42, v5.10.41, v5.10.40, v5.10.39, v5.4.119, v5.10.36, v5.10.35, v5.10.34, v5.4.116, v5.10.33, v5.12, v5.10.32, v5.10.31, v5.10.30, v5.10.27, v5.10.26, v5.10.25, v5.10.24, v5.10.23, v5.10.22, v5.10.21, v5.10.20, v5.10.19, v5.4.101, v5.10.18, v5.10.17, v5.11, v5.10.16, v5.10.15
# b6c2bfea 09-Feb-2021 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Relieve contention on sc_send_lock.

/proc/lock_stat indicates that the sc_send_lock is heavily
contended when the server is under load from a single client.

To address this, convert the send_ctxt free list to an llist.
Returning an item to the send_ctxt cache is now waitless, which
reduces the instruction path length in the single-threaded Send
handler (svc_rdma_wc_send).

The goal is to enable the ib_comp_wq worker to handle a higher
RPC/RDMA Send completion rate given the same CPU resources. This
change reduces CPU utilization of Send completion by 2-3% on my
server.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-By: Tom Talpey <tom@talpey.com>
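
A sketch of the waitless free list using the kernel's lock-less llist
API, which the message names; struct and function names are
illustrative:

#include <linux/llist.h>

struct example_send_ctxt {
        struct llist_node sc_node;
        /* ... */
};

static LLIST_HEAD(example_free_ctxts);

/* Send completion handler: returning a ctxt is a single cmpxchg
 * loop, with no spinlock taken.
 */
static void example_ctxt_put(struct example_send_ctxt *ctxt)
{
        llist_add(&ctxt->sc_node, &example_free_ctxts);
}

/* Consumers of llist_del_first() must serialize among themselves;
 * producers (llist_add) need no lock at all.
 */
static struct example_send_ctxt *example_ctxt_get(void)
{
        struct llist_node *node = llist_del_first(&example_free_ctxts);

        return node ? llist_entry(node, struct example_send_ctxt, sc_node)
                    : NULL;
}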


# 6c8c84f5 07-Jul-2021 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Fewer calls to wake_up() in Send completion handler

Because wake_up() takes an IRQ-safe lock, it can be expensive,
especially to call inside of a single-threaded completion handler.
What's more, the Send wait queue almost never has waiters, so
most of the time, this is an expensive no-op.

As always, the goal is to reduce the average overhead of each
completion, because a transport's completion handlers are single-
threaded on one CPU core. This change reduces CPU utilization of
the Send completion thread by 2-3% on my server.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-By: Tom Talpey <tom@talpey.com>
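
One standard idiom for this, shown as a hedged sketch (the patch's
exact mechanism may differ): test wq_has_sleeper() before calling
wake_up(), so the IRQ-safe waitqueue lock is taken only when a waiter
actually exists:

#include <linux/atomic.h>
#include <linux/wait.h>

/* Return a Send credit; in the common case no thread is waiting,
 * so skip the wake_up() and its lock entirely.
 */
static void example_put_send_credit(atomic_t *avail, wait_queue_head_t *wq)
{
        atomic_inc(avail);

        /* wq_has_sleeper() includes the memory barrier that pairs
         * with prepare_to_wait().
         */
        if (wq_has_sleeper(wq))
                wake_up(wq);
}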


# 8727f788 11-Apr-2021 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Pass a useful error code to the send_err tracepoint

Capture error codes in @ret, which is passed to the send_err
tracepoint, so that they can be logged when something goes awry.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# c7731d5e 13-Apr-2021 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Rename goto labels in svc_rdma_sendto()

Clean up: Make the goto labels consistent with other similar
functions.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 351461f3 13-Apr-2021 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Don't leak send_ctxt on Send errors

Address a rare send_ctxt leak in the svc_rdma_sendto() error paths.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


Revision tags: v5.10.14
# cc93ce95 01-Feb-2021 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Retain the page backing rq_res.head[0].iov_base

svc_rdma_sendto() now waits for the NIC hardware to finish with
the pages backing rq_res. We still have to release the page array
in some cases, but now it's always safe to immediately re-use the
page backing rq_res's head buffer.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 57990067 28-Jan-2021 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Remove unused sc_pages field

Clean up. This significantly reduces the size of struct
svc_rdma_send_ctxt.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 2a1e4f21 13-Jan-2021 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Normalize Send page handling

Currently svc_rdma_sendto() migrates xdr_buf pages into a separate
page list and NULLs out a bunch of entries in rq_pages while the
pages are under I/O. The Send completion handler then frees those
pages later.

Instead, let's wait for the Send completion, then handle page
releasing in the nfsd thread. I'd like to avoid the cost of 250+
put_page() calls in the Send completion handler, which is single-
threaded.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# e844d307 20-Feb-2021 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Add a "deferred close" helper

Refactor a bit of commonly used logic so that every site that wants
a close deferred to an nfsd thread does all the right things
(set_bit(XPT_CLOSE) then enqueue).

Also, once XPT_CLOSE is set on a transport, it is never cleared. If
XPT_CLOSE is already set, then the close is already being handled
and the enqueue can be skipped.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
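
Based on the recipe in this message, the helper plausibly reduces to
the following sketch (XPT_CLOSE and svc_xprt_enqueue() are real
SUNRPC symbols; the function name here is illustrative):

#include <linux/sunrpc/svc_xprt.h>

/* Mark the transport for closing and let an nfsd thread do the
 * teardown. XPT_CLOSE is never cleared once set, so a second
 * caller can skip the redundant enqueue.
 */
static void example_xprt_deferred_close(struct svc_xprt *xprt)
{
        if (!test_and_set_bit(XPT_CLOSE, &xprt->xpt_flags))
                svc_xprt_enqueue(xprt);
}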


# 072db263 20-Feb-2021 Chuck Lever <chuck.lever@oracle.com>

svcrdma: RPCDBG_FACILITY is no longer used

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 22df5a22 29-Dec-2020 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Convert rdma_stat_sq_starve to a per-CPU counter

Avoid the overhead of a memory bus lock cycle for counting a value
that is hardly ever used.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
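
A minimal sketch of the per-CPU counter pattern, assuming the
kernel's percpu_counter API; the counter and function names are
illustrative:

#include <linux/percpu_counter.h>

/* Initialized elsewhere with percpu_counter_init(..., 0, GFP_KERNEL). */
static struct percpu_counter example_sq_starve;

/* Hot path: a per-CPU increment, no locked bus cycle on a shared
 * cacheline.
 */
static void example_count_sq_starve(void)
{
        percpu_counter_inc(&example_sq_starve);
}

/* Slow path (e.g. a /proc read): fold the per-CPU deltas together. */
static s64 example_read_sq_starve(void)
{
        return percpu_counter_sum_positive(&example_sq_starve);
}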


Revision tags: v5.10, v5.8.17, v5.8.16, v5.8.15, v5.9, v5.8.14, v5.8.13, v5.8.12, v5.8.11, v5.8.10, v5.8.9, v5.8.8, v5.8.7, v5.8.6, v5.4.62, v5.8.5, v5.8.4, v5.4.61, v5.8.3, v5.4.60, v5.8.2, v5.4.59, v5.8.1, v5.4.58, v5.4.57, v5.4.56, v5.8, v5.7.12, v5.4.55, v5.7.11, v5.4.54, v5.7.10, v5.4.53, v5.4.52, v5.7.9, v5.7.8, v5.4.51, v5.4.50, v5.7.7, v5.4.49, v5.7.6, v5.7.5, v5.4.48, v5.7.4, v5.7.3, v5.4.47
# b704be09 11-Jun-2020 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Clean up chunk tracepoints

We already have trace_svcrdma_decode_rseg(), which records each
ingress Read segment. Instead of reporting those again when they
are about to be posted as RDMA Reads, let's fire one tracepoint
before posting each type of chunk.

So we'll get:

nfsd-1998 [002] 321.666615: svcrdma_decode_rseg: cq.id=4 cid=42 segno=0 position=0 192@0x013ca9ebfae14000:0xb0010b05
nfsd-1998 [002] 321.666615: svcrdma_decode_rseg: cq.id=4 cid=42 segno=1 position=0 7688@0x013ca9ebf914e000:0xb0010a05
nfsd-1998 [002] 321.666615: svcrdma_decode_rseg: cq.id=4 cid=42 segno=2 position=0 28@0x013ca9ebfae15000:0xb0010905
nfsd-1998 [002] 321.666622: svcrdma_decode_rqst: cq.id=4 cid=42 xid=0x013ca9eb vers=1 credits=128 proc=RDMA_NOMSG hdrlen=100

nfsd-1998 [002] 321.666642: svcrdma_post_read_chunk: cq.id=3 cid=112 sqecount=3

kworker/2:1H-221 [002] 321.673949: svcrdma_wc_read: cq.id=3 cid=112 status=SUCCESS (0/0x0)

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


Revision tags: v5.4.46, v5.7.2, v5.4.45, v5.7.1, v5.4.44, v5.7, v5.4.43, v5.4.42, v5.4.41, v5.4.40, v5.4.39, v5.4.38, v5.4.37, v5.4.36, v5.4.35, v5.4.34, v5.4.33, v5.4.32, v5.4.31, v5.4.30, v5.4.29, v5.6, v5.4.28, v5.4.27, v5.4.26, v5.4.25
# 2371bcc0 09-Mar-2020 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Support multiple Write chunks in svc_rdma_map_reply_msg()

Refactor: svc_rdma_map_reply_msg() is restructured to DMA map only
the parts of rq_res that do not contain a result payload.

This change has been tested to confirm that it does not cause a
regression in the no Write chunk and single Write chunk cases.
Multiple Write chunks have not been tested.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 9d0b09d5 13-Mar-2020 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Support multiple write chunks when pulling up

When counting the number of SGEs needed to construct a Send request,
do not count result payloads. And, when copying the Reply message
into the pull-up buffer, result payloads are not to be copied to the
Send buffer.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 6911f3e1 17-Jun-2020 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Use parsed chunk lists to encode Reply transport headers

Refactor: Instead of re-parsing the ingress RPC Call transport
header when constructing the egress RPC Reply transport header, use
the new parsed Write list and Reply chunk, which are version-
agnostic and already XDR decoded.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 7a1cbfa1 17-Jun-2020 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Use parsed chunk lists to construct RDMA Writes

Refactor: Instead of re-parsing the ingress RPC Call transport
header when constructing RDMA Writes, use the new parsed chunk lists
for the Write list and Reply chunk, which are version-agnostic and
already XDR-decoded.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# ded380f1 13-Mar-2020 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Clean up svc_rdma_encode_reply_chunk()

Refactor: Match the control flow of svc_rdma_encode_write_list().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# f6ad7759 13-Mar-2020 Chuck Lever <chuck.lever@oracle.com>

svcrdma: Post RDMA Writes while XDR encoding replies

The only RPC/RDMA ordering requirement between RDMA Writes and RDMA
Sends is that the responder must post the Writes on the Send queue
before posting the Send that conveys the RPC Reply for that Write
payload.

The Linux NFS server implementation now has a transport method that
can post result Payload Writes earlier than svc_rdma_sendto:

->xpo_result_payload()

This gets RDMA Writes going earlier so they are more likely to be
complete at the remote end before the Send completes.

Some care must be taken with pulled-up Replies. We don't want to
push the Write chunk and then send the same payload data via Send.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
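
A hedged sketch of the encode-time hook: svc_encode_result_payload()
is the generic SUNRPC entry point that invokes the transport's
->xpo_result_payload() method; the wrapper and its parameters here
are illustrative:

#include <linux/sunrpc/svc.h>

/* While XDR-encoding a READ-style reply, hand the result payload
 * to the transport so that RDMA Writes can be posted immediately,
 * before the Send carrying the RPC Reply.
 */
static int example_encode_read_payload(struct svc_rqst *rqstp,
                                       unsigned int offset,
                                       unsigned int len)
{
        return svc_encode_result_payload(rqstp, offset, len);
}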

