Revision tags: v5.4.22, v5.4.21, v5.4.20, v5.4.19, v5.4.18, v5.4.17, v5.4.16, v5.5, v5.4.15, v5.4.14, v5.4.13 |
|
#
9912ade3 |
| 15-Jan-2020 |
Wunderlich, Mark <mark.wunderlich@intel.com> |
nvme-tcp: Set SO_PRIORITY for all host sockets
Enable ability to associate all sockets related to NVMf TCP traffic to a priority group that will perform optimized network processing for this traffic
nvme-tcp: Set SO_PRIORITY for all host sockets
Enable ability to associate all sockets related to NVMf TCP traffic to a priority group that will perform optimized network processing for this traffic class. Maintain initial default behavior of using priority of zero.
Signed-off-by: Kiran Patil <kiran.patil@intel.com> Signed-off-by: Mark Wunderlich <mark.wunderlich@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
show more ...
|
#
97b2512a |
| 10-Feb-2020 |
Nigel Kirkland <nigel.kirkland@broadcom.com> |
nvme: prevent warning triggered by nvme_stop_keep_alive
Delayed keep alive work is queued on system workqueue and may be cancelled via nvme_stop_keep_alive from nvme_reset_wq, nvme_fc_wq or nvme_wq.
nvme: prevent warning triggered by nvme_stop_keep_alive
Delayed keep alive work is queued on system workqueue and may be cancelled via nvme_stop_keep_alive from nvme_reset_wq, nvme_fc_wq or nvme_wq.
Check_flush_dependency detects mismatched attributes between the work-queue context used to cancel the keep alive work and system-wq. Specifically system-wq does not have the WQ_MEM_RECLAIM flag, whereas the contexts used to cancel keep alive work have WQ_MEM_RECLAIM flag.
Example warning:
workqueue: WQ_MEM_RECLAIM nvme-reset-wq:nvme_fc_reset_ctrl_work [nvme_fc] is flushing !WQ_MEM_RECLAIM events:nvme_keep_alive_work [nvme_core]
To avoid the flags mismatch, delayed keep alive work is queued on nvme_wq.
However this creates a secondary concern where work and a request to cancel that work may be in the same work queue - namely err_work in the rdma and tcp transports, which will want to flush/cancel the keep alive work which will now be on nvme_wq.
After reviewing the transports, it looks like err_work can be moved to nvme_reset_wq. In fact that aligns them better with transition into RESETTING and performing related reset work in nvme_reset_wq.
Change nvme-rdma and nvme-tcp to perform err_work in nvme_reset_wq.
Signed-off-by: Nigel Kirkland <nigel.kirkland@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
2d570a7c |
| 10-Feb-2020 |
Anton Eidelman <anton@lightbitslabs.com> |
nvme/tcp: fix bug on double requeue when send fails
When nvme_tcp_io_work() fails to send to socket due to connection close/reset, error_recovery work is triggered from nvme_tcp_state_change() socke
nvme/tcp: fix bug on double requeue when send fails
When nvme_tcp_io_work() fails to send to socket due to connection close/reset, error_recovery work is triggered from nvme_tcp_state_change() socket callback. This cancels all the active requests in the tagset, which requeues them.
The failed request, however, was ended and thus requeued individually as well unless send returned -EPIPE. Another return code to be treated the same way is -ECONNRESET.
Double requeue caused BUG_ON(blk_queued_rq(rq)) in blk_mq_requeue_request() from either the individual requeue of the failed request or the bulk requeue from blk_mq_tagset_busy_iter(, nvme_cancel_request, );
Signed-off-by: Anton Eidelman <anton@lightbitslabs.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
Revision tags: v5.4.12, v5.4.11, v5.4.10, v5.4.9, v5.4.8, v5.4.7, v5.4.6, v5.4.5, v5.4.4, v5.4.3, v5.3.15, v5.4.2, v5.4.1, v5.3.14, v5.4, v5.3.13, v5.3.12, v5.3.11, v5.3.10, v5.3.9, v5.3.8, v5.3.7 |
|
#
58a8df67 |
| 13-Oct-2019 |
Israel Rukshin <israelr@mellanox.com> |
nvme: introduce nvme_is_aen_req function
This function improves code readability and reduces code duplication.
Signed-off-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Max Gurtovoy <maxg
nvme: introduce nvme_is_aen_req function
This function improves code readability and reduces code duplication.
Signed-off-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
3f926af3 |
| 24-Oct-2019 |
Eric Dumazet <edumazet@google.com> |
net: use skb_queue_empty_lockless() in busy poll contexts
Busy polling usually runs without locks. Let's use skb_queue_empty_lockless() instead of skb_queue_empty()
Also uses READ_ONCE() in __skb_t
net: use skb_queue_empty_lockless() in busy poll contexts
Busy polling usually runs without locks. Let's use skb_queue_empty_lockless() instead of skb_queue_empty()
Also uses READ_ONCE() in __skb_try_recv_datagram() to address a similar potential problem.
Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
28a4cac4 |
| 13-Oct-2019 |
Max Gurtovoy <maxg@mellanox.com> |
nvme-tcp: fix possible leakage during error flow
During nvme_tcp_setup_cmd_pdu error flow, one must call nvme_cleanup_cmd since it's symmetric to nvme_setup_cmd.
Signed-off-by: Max Gurtovoy <maxg@m
nvme-tcp: fix possible leakage during error flow
During nvme_tcp_setup_cmd_pdu error flow, one must call nvme_cleanup_cmd since it's symmetric to nvme_setup_cmd.
Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
show more ...
|
Revision tags: v5.3.6 |
|
#
ac1c4e18 |
| 10-Oct-2019 |
Sebastian Andrzej Siewior <bigeasy@linutronix.de> |
nvme-tcp: Initialize sk->sk_ll_usec only with NET_RX_BUSY_POLL
The access to sk->sk_ll_usec should be hidden behind CONFIG_NET_RX_BUSY_POLL like the definition of sk_ll_usec.
Put access to ->sk_ll_
nvme-tcp: Initialize sk->sk_ll_usec only with NET_RX_BUSY_POLL
The access to sk->sk_ll_usec should be hidden behind CONFIG_NET_RX_BUSY_POLL like the definition of sk_ll_usec.
Put access to ->sk_ll_usec behind CONFIG_NET_RX_BUSY_POLL.
Fixes: 1a9460cef5711 ("nvme-tcp: support simple polling") Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
show more ...
|
Revision tags: v5.3.5, v5.3.4, v5.3.3, v5.3.2, v5.3.1, v5.3, v5.2.14, v5.3-rc8, v5.2.13, v5.2.12 |
|
#
92b98e88 |
| 05-Sep-2019 |
Keith Busch <kbusch@kernel.org> |
nvme: Restart request timers in resetting state
A controller in the resetting state has not yet completed its recovery actions. The pci and fc transports were already handling this, so update the re
nvme: Restart request timers in resetting state
A controller in the resetting state has not yet completed its recovery actions. The pci and fc transports were already handling this, so update the remaining transports to not attempt additional recovery in this state. Instead, just restart the request timer.
Tested-by: Edmund Nadolski <edmund.nadolski@intel.com> Reviewed-by: James Smart <james.smart@broadcom.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
show more ...
|
#
ddef2957 |
| 18-Sep-2019 |
Wunderlich, Mark <mark.wunderlich@intel.com> |
nvme-tcp: fix wrong stop condition in io_work
Allow the do/while statement to continue if current time is not after the proposed time 'deadline'. Intent is to allow loop to proceed for a specific ti
nvme-tcp: fix wrong stop condition in io_work
Allow the do/while statement to continue if current time is not after the proposed time 'deadline'. Intent is to allow loop to proceed for a specific time period. Currently the loop, as coded, will exit after first pass.
Signed-off-by: Mark Wunderlich <mark.wunderlich@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
show more ...
|
#
312910f4 |
| 05-Sep-2019 |
Colin Ian King <colin.king@canonical.com> |
nvme: tcp: remove redundant assignment to variable ret
The variable ret is being initialized with a value that is never read and is being re-assigned immediately afterwards. The assignment is redund
nvme: tcp: remove redundant assignment to variable ret
The variable ret is being initialized with a value that is never read and is being re-assigned immediately afterwards. The assignment is redundant and hence can be removed.
Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
show more ...
|
Revision tags: v5.2.11, v5.2.10, v5.2.9, v5.2.8, v5.2.7, v5.2.6 |
|
#
16686010 |
| 02-Aug-2019 |
Sagi Grimberg <sagi@grimberg.me> |
nvme-tcp: fail command with NVME_SC_HOST_PATH_ERROR send failed
This is a more appropriate error status for a transport error detected by us (the host).
Reviewed-by: Hannes Reinecke <hare@suse.com>
nvme-tcp: fail command with NVME_SC_HOST_PATH_ERROR send failed
This is a more appropriate error status for a transport error detected by us (the host).
Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: James Smart <james.smart@broadcom.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
show more ...
|
#
e7832cb4 |
| 02-Aug-2019 |
Sagi Grimberg <sagi@grimberg.me> |
nvme: make fabrics command run on a separate request queue
We have a fundamental issue that fabric commands use the admin_q. The reason is, that admin-connect, register reads and writes and admin co
nvme: make fabrics command run on a separate request queue
We have a fundamental issue that fabric commands use the admin_q. The reason is, that admin-connect, register reads and writes and admin commands cannot be guaranteed ordering while we are running controller resets.
For example, when we reset a controller we perform: 1. disable the controller 2. teardown the admin queue 3. re-establish the admin queue 4. enable the controller
In order to perform (3), we need to unquiesce the admin queue, however we may have some admin commands that are already pending on the quiesced admin_q and will immediate execute when we unquiesce it before we execute (4). The host must not send admin commands to the controller before enabling the controller.
To fix this, we have the fabric commands (admin connect and property get/set, but not I/O queue connect) use a separate fabrics_q and make sure to quiesce the admin_q before we disable the controller, and unquiesce it only after we enable the controller.
This fixes the error prints from nvmet in a controller reset storm test: kernel: nvmet: got cmd 6 while CC.EN == 0 on qid = 0 Which indicate that the host is sending an admin command when the controller is not enabled.
Reviewed-by: James Smart <james.smart@broadcom.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
show more ...
|
#
bb13985d |
| 18-Aug-2019 |
Israel Rukshin <israelr@mellanox.com> |
nvme-tcp: Add TOS for tcp transport
TOS provide clients the ability to segregate traffic flows for different type of data. One of the TOS usage is bandwidth management which allows setting bandwidth
nvme-tcp: Add TOS for tcp transport
TOS provide clients the ability to segregate traffic flows for different type of data. One of the TOS usage is bandwidth management which allows setting bandwidth limits for QoS classes, e.g. 80% bandwidth to controllers at QoS class A and 20% to controllers at QoS class B.
usage examples: nvme connect --tos=0 --transport=tcp --traddr=10.0.1.1 --nqn=test-nvme
Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
show more ...
|
#
9924b030 |
| 18-Aug-2019 |
Israel Rukshin <israelr@mellanox.com> |
nvme-tcp: Use struct nvme_ctrl directly
This patch doesn't change any functionality.
Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by:
nvme-tcp: Use struct nvme_ctrl directly
This patch doesn't change any functionality.
Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
show more ...
|
Revision tags: v5.2.5, v5.2.4, v5.2.3, v5.2.2, v5.2.1, v5.2 |
|
#
1a9460ce |
| 03-Jul-2019 |
Sagi Grimberg <sagi@grimberg.me> |
nvme-tcp: support simple polling
Simple polling support via socket busy_poll interface. Although we do not shutdown interrupts but simply hammer the socket poll, we can sometimes find completions fa
nvme-tcp: support simple polling
Simple polling support via socket busy_poll interface. Although we do not shutdown interrupts but simply hammer the socket poll, we can sometimes find completions faster than the normal interrupt driven RX path.
We add per queue nr_cqe counter that resets every time RX path is invoked such that .poll callback can return it to stay consistent with the semantics.
Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
show more ...
|
#
b5b05048 |
| 22-Jul-2019 |
Sagi Grimberg <sagi@grimberg.me> |
nvme: don't pass cap to nvme_disable_ctrl
All seem to call it with ctrl->cap so no need to pass it at all.
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.d
nvme: don't pass cap to nvme_disable_ctrl
All seem to call it with ctrl->cap so no need to pass it at all.
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
show more ...
|
#
c0f2f45b |
| 22-Jul-2019 |
Sagi Grimberg <sagi@grimberg.me> |
nvme: move sqsize setting to the core
nvme_enable_ctrl reads the cap register right after, so no need to do that locally in the transport driver. Have sqsize setting in nvme_init_identify.
Reviewed
nvme: move sqsize setting to the core
nvme_enable_ctrl reads the cap register right after, so no need to do that locally in the transport driver. Have sqsize setting in nvme_init_identify.
Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
show more ...
|
#
10407ec9 |
| 08-Jul-2019 |
Potnuri Bharat Teja <bharat@chelsio.com> |
nvme-tcp: Use protocol specific operations while reading socket
Using socket specific read_sock() calls instead of directly calling tcp_read_sock() helps lld module registered handlers if any, to be
nvme-tcp: Use protocol specific operations while reading socket
Using socket specific read_sock() calls instead of directly calling tcp_read_sock() helps lld module registered handlers if any, to be called from nvme-tcp host. This patch therefore replaces the tcp_read_sock() with socket specific prot_ops.
Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Acked-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
show more ...
|
#
6be18260 |
| 19-Jul-2019 |
Sagi Grimberg <sagi@grimberg.me> |
nvme-tcp: cleanup nvme_tcp_recv_pdu
Can return directly in the switch statement
Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
|
#
622b8b68 |
| 23-Jul-2019 |
Ming Lei <ming.lei@redhat.com> |
nvme: wait until all completed request's complete fn is called
When aborting in-flight request for recovering controller, we have to make sure that queue's complete function is called on completed r
nvme: wait until all completed request's complete fn is called
When aborting in-flight request for recovering controller, we have to make sure that queue's complete function is called on completed request before moving on. Otherwise, for example, the warning of WARN_ON_ONCE(qp->mrs_used > 0) in ib_destroy_qp_user() may be triggered on nvme-rdma.
Fix this issue by using blk_mq_tagset_wait_completed_request.
Cc: Max Gurtovoy <maxg@mellanox.com> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Keith Busch <keith.busch@intel.com> Cc: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
37c15219 |
| 08-Jul-2019 |
Mikhail Skorzhinskii <mskorzhinskiy@solarflare.com> |
nvme-tcp: don't use sendpage for SLAB pages
According to commit a10674bf2406 ("tcp: detecting the misuse of .sendpage for Slab objects") and previous discussion, tcp_sendpage should not be used for
nvme-tcp: don't use sendpage for SLAB pages
According to commit a10674bf2406 ("tcp: detecting the misuse of .sendpage for Slab objects") and previous discussion, tcp_sendpage should not be used for pages that is managed by SLAB, as SLAB is not taking page reference counters into consideration.
Signed-off-by: Mikhail Skorzhinskii <mskorzhinskiy@solarflare.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
show more ...
|
Revision tags: v5.1.16, v5.1.15, v5.1.14, v5.1.13, v5.1.12, v5.1.11, v5.1.10, v5.1.9, v5.1.8, v5.1.7, v5.1.6 |
|
#
64861993 |
| 29-May-2019 |
Sagi Grimberg <sagi@grimberg.me> |
nvme-tcp: fix queue mapping when queue count is limited
When the controller supports less queues than requested, we should make sure that queue mapping does the right thing and not assume that all q
nvme-tcp: fix queue mapping when queue count is limited
When the controller supports less queues than requested, we should make sure that queue mapping does the right thing and not assume that all queues are available. This fixes a crash when the controller supports less queues than requested.
The rules are: 1. if no write queues are requested, we assign the available queues to the default queue map. The default and read queue maps share the existing queues. 2. if write queues are requested: - first make sure that read queue map gets the requested nr_io_queues count - then grant the default queue map the minimum between the requested nr_write_queues and the remaining queues. If there are no available queues to dedicate to the default queue map, fallback to (1) and share all the queues in the existing queue map.
Also, provide a log indication on how we constructed the different queue maps.
Reported-by: Harris, James R <james.r.harris@intel.com> Tested-by: Jim Harris <james.r.harris@intel.com> Cc: <stable@vger.kernel.org> # v5.0+ Suggested-by: Roy Shterman <roys@lightbitslabs.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
show more ...
|
Revision tags: v5.1.5, v5.1.4, v5.1.3, v5.1.2, v5.1.1, v5.0.14, v5.1, v5.0.13, v5.0.12, v5.0.11 |
|
#
f34e2589 |
| 29-Apr-2019 |
Sagi Grimberg <sagi@grimberg.me> |
nvme-tcp: fix possible null deref on a timed out io queue connect
If I/O queue connect times out, we might have freed the queue socket already, so check for that on the error path in nvme_tcp_start_
nvme-tcp: fix possible null deref on a timed out io queue connect
If I/O queue connect times out, we might have freed the queue socket already, so check for that on the error path in nvme_tcp_start_queue.
Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
show more ...
|
Revision tags: v5.0.10 |
|
#
efb973b1 |
| 24-Apr-2019 |
Sagi Grimberg <sagi@grimberg.me> |
nvme-tcp: rename function to have nvme_tcp prefix
usually nvme_ prefix is for core functions. While we're cleaning up, remove redundant empty lines
Signed-off-by: Sagi Grimberg <sagi@grimberg.me> R
nvme-tcp: rename function to have nvme_tcp prefix
usually nvme_ prefix is for core functions. While we're cleaning up, remove redundant empty lines
Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Minwoo Im <minwoo.im@samsung.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
show more ...
|
#
7a425896 |
| 24-Apr-2019 |
Sagi Grimberg <sagi@grimberg.me> |
nvme-tcp: fix a NULL deref when an admin connect times out
If we timeout the admin startup sequence we might not yet have an I/O tagset allocated which causes the teardown sequence to crash. Make nv
nvme-tcp: fix a NULL deref when an admin connect times out
If we timeout the admin startup sequence we might not yet have an I/O tagset allocated which causes the teardown sequence to crash. Make nvme_tcp_teardown_io_queues safe by not iterating inflight tags if the tagset wasn't allocated.
Fixes: 39d57757467b ("nvme-tcp: fix timeout handler") Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
show more ...
|