#
06427ca0 |
| 20-Sep-2022 |
Christoph Hellwig <hch@lst.de> |
nvme-tcp: store the generic nvme_ctrl in set->driver_data
Point the private data to the generic controller structure in preparation of using the common tagset init/exit code.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
|
#
fb8745d0 |
| 20-Sep-2022 |
Christoph Hellwig <hch@lst.de> |
nvme-tcp: remove the unused queue_size member in nvme_tcp_queue
The queue_size member of nvme_tcp_queue is not used anywhere, so remove it.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
|
Revision tags: v5.15.68, v5.15.67, v5.15.66 |
|
#
02c57a82 |
| 07-Sep-2022 |
Martin Belanger <martin.belanger@dell.com> |
nvme-tcp: print actual source IP address through sysfs "address" attr
TCP transport relies on the routing table to determine which source address and interface to use when making a connection. Currently, there is no way to tell from userspace where a connection was made. This patch exposes the actual source address using a new field named "src_addr=" in the "address" attribute.
This is needed to diagnose and identify connectivity issues. With the source address we can infer the interface associated with each connection.
This was tested with nvme-cli 2.0 to verify it does not have any adverse effect. The new "src_addr=" field will simply be displayed in the output of the "list-subsys" or "list -v" commands as shown here.
$ nvme list-subsys
nvme-subsys0 - NQN=nqn.2014-08.org.nvmexpress.discovery
\
 +- nvme0 tcp traddr=192.168.56.1,trsvcid=8009,src_addr=192.168.56.101 live
Signed-off-by: Martin Belanger <martin.belanger@dell.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
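A minimal user-space sketch of how a tool might pull the new "src_addr=" field out of the attribute text. It assumes the comma-separated key=value layout shown in the list-subsys output above; the helper name addr_field is hypothetical and is not part of nvme-cli or the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy the value of `key` from a comma-separated
 * "key=value" string into `out`. A trailing newline, as sysfs attributes
 * usually carry, ends the last field. Returns 0 on success, -1 if the key
 * is absent or `out` is too small.
 */
static int addr_field(const char *attr, const char *key,
		      char *out, size_t outlen)
{
	size_t klen;
	const char *p = attr;

	assert(attr != NULL && key != NULL && out != NULL && outlen > 0);
	klen = strlen(key);

	while (*p) {
		/* Length of the current field, up to ',' or newline. */
		size_t flen = strcspn(p, ",\n");

		if (flen > klen && strncmp(p, key, klen) == 0 &&
		    p[klen] == '=') {
			size_t vlen = flen - klen - 1;

			if (vlen >= outlen)
				return -1;
			memcpy(out, p + klen + 1, vlen);
			out[vlen] = '\0';
			return 0;
		}
		p += flen;
		if (*p)
			p++;	/* skip the separator */
	}
	return -1;
}
```

Matching only at the start of a field keeps a lookup for "addr" from accidentally matching inside "traddr" or "src_addr".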
|
Revision tags: v5.15.65, v5.15.64 |
|
#
09035f86 |
| 29-Aug-2022 |
Daniel Wagner <dwagner@suse.de> |
nvme-tcp: handle number of queue changes
On reconnect, the number of queues might have changed.
In the case where we have more queues available than previously we try to access queues which are not initialized yet.
In the other case, where we have fewer queues than previously, the connection attempt fails because the target doesn't support the old number of queues, and we end up in a reconnect loop.
Thus, only start queues which are currently present in the tagset limited by the number of available queues. Then we update the tagset and we can start any new queue.
Signed-off-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
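The ordering described above can be modelled in a few lines of user-space C. This is only a sketch under stated assumptions: the struct and function names are hypothetical, and plain counters stand in for the driver's queues and tagset.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: counters stand in for the driver's I/O queues. */
struct ctrl_model {
	unsigned int tagset_queues;	/* queues the tagset currently describes */
	unsigned int avail_queues;	/* queues available after this reconnect */
	unsigned int started;		/* queues started so far */
};

/* Start queues in the order the commit message describes. */
static unsigned int reconnect_start_queues(struct ctrl_model *c)
{
	unsigned int first;

	assert(c != NULL);

	/* 1. Only start queues present in the old tagset AND still available. */
	first = c->tagset_queues < c->avail_queues ?
		c->tagset_queues : c->avail_queues;
	c->started = first;

	/* 2. Update the tagset to the new queue count. */
	c->tagset_queues = c->avail_queues;

	/* 3. Start any queues that did not exist before the reconnect. */
	if (c->avail_queues > first)
		c->started = c->avail_queues;

	return c->started;
}
```

With 4 old queues and 8 available, step 1 starts 4 and step 3 starts the remaining 4 after the tagset has grown; with 8 old queues and 4 available, only 4 are ever touched.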
|
#
3770a42b |
| 05-Sep-2022 |
Sagi Grimberg <sagi@grimberg.me> |
nvme-tcp: fix regression that causes sporadic requests to time out
When we queue requests, we strive to batch as much as possible and also signal the network stack that more data is about to be sent over a socket with MSG_SENDPAGE_NOTLAST. This flag looks at the pending requests queued as well as queue->more_requests that is derived from the block layer last-in-batch indication.
We set more_requests=true when we flush the request directly from the .queue_rq submission context (in nvme_tcp_send_all). However, this wrongly assumes that no other requests can be queued while nvme_tcp_send_all executes.
Due to this, a race condition may happen where:
1. request X is queued as !last-in-batch
2. request X submission context calls nvme_tcp_send_all directly
3. nvme_tcp_send_all is preempted and schedules to a different cpu
4. request Y is queued as last-in-batch
5. nvme_tcp_send_all context sends request X+Y, however signals for both MSG_SENDPAGE_NOTLAST because queue->more_requests=true.
==> none of the requests is pushed down to the wire because the network stack is waiting for more data, and both requests time out.
To fix this, we eliminate queue->more_requests and only rely on the queue req_list and send_list to be not-empty.
Fixes: 122e5b9f3d37 ("nvme-tcp: optimize network stack with setting msg flags according to batch size") Reported-by: Jonathan Nicklin <jnicklin@blockbridge.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Tested-by: Jonathan Nicklin <jnicklin@blockbridge.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
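A small user-space model of the fixed decision, assuming the "more data follows" hint is derived only from whether the two lists are non-empty. The struct, the counters and the function names are hypothetical stand-ins, not the driver's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: counters stand in for req_list and send_list. */
struct queue_model {
	int req_list_len;	/* requests queued but not yet moved to send_list */
	int send_list_len;	/* requests waiting to be written to the socket */
};

/* After the fix: "more" depends only on what is actually still queued. */
static bool queue_more(const struct queue_model *q)
{
	assert(q != NULL);
	return q->req_list_len > 0 || q->send_list_len > 0;
}

/*
 * Take one request off send_list and report whether the "not last" hint
 * would accompany it.
 */
static bool send_one(struct queue_model *q)
{
	assert(q != NULL && q->send_list_len > 0);
	q->send_list_len--;
	return queue_more(q);
}
```

Replaying the race above with X and Y both on the send list, X goes out with the hint set and Y goes out without it, so the network stack is no longer left waiting for data that never comes.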
|
#
160f3549 |
| 05-Sep-2022 |
Sagi Grimberg <sagi@grimberg.me> |
nvme-tcp: fix UAF when detecting digest errors
We should also bail from the io_work loop when we set rd_enabled to false, so we don't attempt to read data from the socket when the TCP stream is already out-of-sync or corrupted.
Fixes: 3f2304f8c6d6 ("nvme-tcp: add NVMe over TCP host driver") Reported-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
Revision tags: v5.15.63, v5.15.62, v5.15.61 |
|
#
a4e1d0b7 |
| 15-Aug-2022 |
Bart Van Assche <bvanassche@acm.org> |
block: Change the return type of blk_mq_map_queues() into void
Since blk_mq_map_queues() and the .map_queues() callbacks always return 0, change their return type into void. Most callers ignore the returned value anyway.
Cc: Christoph Hellwig <hch@lst.de> Cc: Jason Wang <jasowang@redhat.com> Cc: Keith Busch <kbusch@kernel.org> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Doug Gilbert <dgilbert@interlog.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: John Garry <john.garry@huawei.com> Acked-by: Md Haris Iqbal <haris.iqbal@ionos.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Link: https://lore.kernel.org/r/20220815170043.19489-3-bvanassche@acm.org [axboe: fold in fix from Bart] Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
Revision tags: v5.15.60, v5.15.59 |
|
#
2bff487f |
| 01-Aug-2022 |
Maurizio Lombardi <mlombard@redhat.com> |
nvme-tcp: check if the queue is allocated before stopping it
When an error is detected and the host reconnects, nvme_tcp_error_recovery_work() is called and starts tearing down the I/O queues and de-allocating them. If at the same time the "nvme" process deletes the controller via sysfs, nvme_tcp_delete_ctrl() is called and waits until nvme_tcp_error_recovery_work() finishes its job; it then starts tearing down the I/O queues, but at this point they have already been freed and the mutexes destroyed.
Calling mutex_lock() against a destroyed mutex triggers a warning:
[ 1299.025575] nvme nvme1: Reconnecting in 10 seconds...
[ 1299.636449] nvme nvme1: Removing ctrl: NQN "blktests-subsystem-1"
[ 1299.645262] ------------[ cut here ]------------
[ 1299.649949] DEBUG_LOCKS_WARN_ON(lock->magic != lock)
[ 1299.649971] WARNING: CPU: 4 PID: 104150 at kernel/locking/mutex.c:579 __mutex_lock+0x2d0/0x7dc
[ 1299.717934] CPU: 4 PID: 104150 Comm: nvme
[ 1299.828075] Call trace:
[ 1299.830526] __mutex_lock+0x2d0/0x7dc
[ 1299.834203] mutex_lock_nested+0x64/0xd4
[ 1299.838139] nvme_tcp_stop_queue+0x54/0xe0 [nvme_tcp]
[ 1299.843211] nvme_tcp_teardown_io_queues.part.0+0x90/0x280 [nvme_tcp]
[ 1299.849672] nvme_tcp_delete_ctrl+0x6c/0xf0 [nvme_tcp]
[ 1299.854831] nvme_do_delete_ctrl+0x108/0x120 [nvme_core]
[ 1299.860181] nvme_sysfs_delete+0xec/0xf0 [nvme_core]
[ 1299.865179] dev_attr_store+0x40/0x70
Fix the warning by checking if the queues are allocated in the nvme_tcp_stop_queue(). If they are not, it makes no sense to try to stop them.
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
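The guard can be illustrated with a user-space model. This is a sketch only: the struct, its boolean fields and the function names are hypothetical, and an assertion stands in for the warning that locking a destroyed mutex triggers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one queue. */
struct queue_model {
	bool allocated;		/* stands in for an "allocated" flag on the queue */
	bool lock_valid;	/* false once the queue's mutex has been destroyed */
	bool live;		/* queue is currently running */
};

/* Free the queue: its lock is destroyed and it is no longer allocated. */
static void free_queue(struct queue_model *q)
{
	assert(q != NULL);
	if (!q->allocated)
		return;
	q->allocated = false;
	q->lock_valid = false;
}

/* Returns true if the queue was actually stopped. */
static bool stop_queue(struct queue_model *q)
{
	assert(q != NULL);

	/* The fix: nothing to stop, and no lock to take, if already freed. */
	if (!q->allocated)
		return false;

	assert(q->lock_valid);	/* models taking the queue's mutex */
	q->live = false;
	return true;
}
```

In this model the error-recovery path stops and frees the queue first; the later stop from the delete path then returns early instead of touching the destroyed lock.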
|
Revision tags: v5.19, v5.15.58, v5.15.57, v5.15.56 |
|
#
2f7a7e5d |
| 21-Jul-2022 |
Christoph Hellwig <hch@lst.de> |
nvme-tcp: split nvme_tcp_alloc_tagset
Split nvme_tcp_alloc_tagset into one helper for the admin tag_set and one for the I/O tag set.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
Revision tags: v5.15.55, v5.15.54 |
|
#
53ee9e29 |
| 07-Jul-2022 |
Caleb Sander <csander@purestorage.com> |
nvme-tcp: use in-capsule data for I/O connect
Currently, command data is only sent in-capsule on the admin queue or on I/O queues that indicate support for it. Send fabrics command data in-capsule for I/O queues as well to avoid needing a separate H2CData PDU for the connect command.
This is an optimization. Without this change, we send the connect command capsule and data in separate PDUs (CapsuleCmd and H2CData), and must wait for the controller to respond with an R2T PDU before sending the H2CData.
With the change, we send a single CapsuleCmd PDU that includes the data. This reduces the number of bytes (and likely packets) sent across the network, and simplifies the send state machine handling in the driver.
Signed-off-by: Caleb Sander <csander@purestorage.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
Revision tags: v5.15.53, v5.15.52, v5.15.51 |
|
#
f50fff73 |
| 27-Jun-2022 |
Hannes Reinecke <hare@suse.de> |
nvme: implement In-Band authentication
Implement NVMe-oF In-Band authentication according to NVMe TPAR 8006. This patch adds two new fabric options: 'dhchap_secret' to specify the pre-shared key (in ASCII representation according to NVMe 2.0 section 8.13.5.8 'Secret representation') and 'dhchap_ctrl_secret' to specify the pre-shared controller key for bi-directional authentication of both the host and the controller. Re-authentication can be triggered by writing the PSK into the new controller sysfs attribute 'dhchap_secret' or 'dhchap_ctrl_secret'.
Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> [axboe: fold in clang build fix] Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
9bdb4833 |
| 06-Jul-2022 |
John Garry <john.garry@huawei.com> |
blk-mq: Drop blk_mq_ops.timeout 'reserved' arg
With new API blk_mq_is_reserved_rq() we can tell if a request is from the reserved pool, so stop passing 'reserved' arg. There is actually only a single user of that arg for all the callback implementations, which can use blk_mq_is_reserved_rq() instead.
This will also allow us to stop passing the same 'reserved' around the blk-mq iter functions next.
Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # For MMC Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/1657109034-206040-4-git-send-email-john.garry@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
Revision tags: v5.15.50 |
|
#
f7f70f4a |
| 23-Jun-2022 |
Ruozhu Li <liruozhu@huawei.com> |
nvme: fix regression when disconnect a recovering ctrl
We encountered a problem where the disconnect command hangs. After analyzing the log and stack, we found that the triggering process is as follows:
CPU0                                  CPU1
                                      nvme_rdma_error_recovery_work
                                        nvme_rdma_teardown_io_queues
nvme_do_delete_ctrl                       nvme_stop_queues
  nvme_remove_namespaces
  --clear ctrl->namespaces
                                          nvme_start_queues
                                          --no ns in ctrl->namespaces
    nvme_ns_remove                      return(because ctrl is deleting)
      blk_freeze_queue
        blk_mq_freeze_queue_wait
        --wait for ns to unquiesce to clean inflight IO, hang forever
This problem was not found in older kernels because we flushed the error recovery work in nvme_stop_ctrl before nvme_remove_namespaces. The change does not seem to have been made for functional reasons, so the patch can be reverted to solve the problem.
Revert commit 794a4cb3d2f7 ("nvme: remove the .stop_ctrl callout")
Signed-off-by: Ruozhu Li <liruozhu@huawei.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
41d07df7 |
| 26-Jun-2022 |
Sagi Grimberg <sagi@grimberg.me> |
nvme-tcp: always fail a request when sending it failed
Queue stoppage and inflight request cancellation are fully fenced from io_work, and thus from failing a request in this context. Hence we don't need to try to guess from the socket retcode whether this failure is because the queue is about to be torn down.
We are perfectly safe to just fail it; the request will not be cancelled later on.
This solves possibly very long shutdown delays when the user issues 'nvme disconnect-all'.
Reported-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
Revision tags: v5.15.49 |
|
#
6f8191fd |
| 19-Jun-2022 |
Christoph Hellwig <hch@lst.de> |
block: simplify disk shutdown
Set the queue dying flag and call blk_mq_exit_queue from del_gendisk for all disks that do not have separately allocated queues, and thus remove the need to call blk_cleanup_queue for them.
Rename blk_cleanup_queue to blk_mq_destroy_queue to make it clear that this function is intended only for separately allocated blk-mq queues.
This saves an extra queue freeze for devices without a separately allocated queue.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20220619060552.1850436-6-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
Revision tags: v5.15.48, v5.15.47, v5.15.46, v5.15.45, v5.15.44, v5.15.43, v5.15.42, v5.18, v5.15.41, v5.15.40, v5.15.39, v5.15.38, v5.15.37, v5.15.36, v5.15.35, v5.15.34, v5.15.33 |
|
#
93ba75c9 |
| 30-Mar-2022 |
Chaitanya Kulkarni <kch@nvidia.com> |
nvme-fabrics: add a request timeout helper
The RDMA and TCP transports both complete a timed-out request in the same manner, so the code is duplicated. Add and use the helper nvmf_complete_timed_out_request() to remove the duplicate code.
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
Revision tags: v5.15.32, v5.15.31, v5.17, v5.15.30, v5.15.29, v5.15.28, v5.15.27, v5.15.26, v5.15.25, v5.15.24 |
|
#
841aee4d |
| 15-Feb-2022 |
Chris Leech <cleech@redhat.com> |
nvme-tcp: lockdep: annotate in-kernel sockets
Put NVMe/TCP sockets in their own class to avoid some lockdep warnings. Sockets created by nvme-tcp are not exposed to user-space, and will not trigger certain code paths that the general socket API exposes.
Lockdep complains about a circular dependency between the socket and filesystem locks, because setsockopt can trigger a page fault with a socket lock held, but nvme-tcp sends requests on the socket while file system locks are held.
======================================================
WARNING: possible circular locking dependency detected
5.15.0-rc3 #1 Not tainted
------------------------------------------------------
fio/1496 is trying to acquire lock:
(sk_lock-AF_INET){+.+.}-{0:0}, at: tcp_sendpage+0x23/0x80
but task is already holding lock:
(&xfs_dir_ilock_class/5){+.+.}-{3:3}, at: xfs_ilock+0xcf/0x290 [xfs]
which lock already depends on the new lock.
other info that might help us debug this:
chain exists of:
  sk_lock-AF_INET --> sb_internal --> &xfs_dir_ilock_class/5
Possible unsafe locking scenario:
      CPU0                    CPU1
      ----                    ----
 lock(&xfs_dir_ilock_class/5);
                              lock(sb_internal);
                              lock(&xfs_dir_ilock_class/5);
 lock(sk_lock-AF_INET);
*** DEADLOCK ***
6 locks held by fio/1496:
#0: (sb_writers#13){.+.+}-{0:0}, at: path_openat+0x9fc/0xa20
#1: (&inode->i_sb->s_type->i_mutex_dir_key){++++}-{3:3}, at: path_openat+0x296/0xa20
#2: (sb_internal){.+.+}-{0:0}, at: xfs_trans_alloc_icreate+0x41/0xd0 [xfs]
#3: (&xfs_dir_ilock_class/5){+.+.}-{3:3}, at: xfs_ilock+0xcf/0x290 [xfs]
#4: (hctx->srcu){....}-{0:0}, at: hctx_lock+0x51/0xd0
#5: (&queue->send_mutex){+.+.}-{3:3}, at: nvme_tcp_queue_rq+0x33e/0x380 [nvme_tcp]
This annotation lets lockdep analyze nvme-tcp controlled sockets independently of what the user-space sockets API does.
Link: https://lore.kernel.org/linux-nvme/CAHj4cs9MDYLJ+q+2_GXUK9HxFizv2pxUryUR0toX974M040z7g@mail.gmail.com/
Signed-off-by: Chris Leech <cleech@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
a387935c |
| 22-Feb-2022 |
Chaitanya Kulkarni <kch@nvidia.com> |
nvme-tcp: don't fold the line
The call to nvme_tcp_alloc_queue() fits perfectly in one line without exceeding 80 char limit for the line.
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
462b8b2d |
| 22-Feb-2022 |
Chaitanya Kulkarni <kch@nvidia.com> |
nvme-tcp: don't initialize ret variable
No point in initializing ret variable to 0 in nvme_tcp_start_io_queue() since it gets overwritten by a call to nvme_tcp_start_queue().
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
Revision tags: v5.15.23 |
|
#
72e8b5cd |
| 10-Feb-2022 |
Chaitanya Kulkarni <kch@nvidia.com> |
nvme: add a helper to initialize connect_q
Add and use helper to remove duplicate code for fabrics connect_q initialization and error handling for all the transports.
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
Revision tags: v5.15.22, v5.15.21, v5.15.20, v5.15.19, v5.15.18, v5.15.17 |
|
#
c2700d28 |
| 22-Jan-2022 |
Varun Prakash <varun@chelsio.com> |
nvme-tcp: send H2CData PDUs based on MAXH2CDATA
As per NVMe/TCP specification (revision 1.0a, section 3.6.2.3) Maximum Host to Controller Data length (MAXH2CDATA): Specifies the maximum number of PDU-Data bytes per H2CData PDU in bytes. This value is a multiple of dwords and should be no less than 4,096.
The current code sets the H2CData PDU data_length to r2t_length without checking the MAXH2CDATA value. Fix this by setting the H2CData PDU data_length to min(req->h2cdata_left, queue->maxh2cdata).
Also validate the MAXH2CDATA value returned by the target in the ICResp PDU: if it is not a multiple of a dword or is less than 4096, return -EINVAL from nvme_tcp_init_connection().
Signed-off-by: Varun Prakash <varun@chelsio.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
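A user-space sketch of the two rules described above: the validity check on MAXH2CDATA and the min()-based split of an R2T length into H2CData PDU lengths. The function names are hypothetical, and the code models only the arithmetic, not the driver's PDU handling.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Reject values that are not a multiple of a dword or are below 4096. */
static int validate_maxh2cdata(uint32_t maxh2cdata)
{
	if (maxh2cdata % 4 != 0 || maxh2cdata < 4096)
		return -EINVAL;
	return 0;
}

/*
 * Split r2t_length bytes into per-PDU data lengths of at most maxh2cdata.
 * Writes up to max_pdus entries into lens and returns how many were written;
 * the caller must size lens for the whole transfer.
 */
static size_t split_r2t(uint32_t r2t_length, uint32_t maxh2cdata,
			uint32_t *lens, size_t max_pdus)
{
	uint32_t left = r2t_length;	/* plays the role of h2cdata_left */
	size_t n = 0;

	assert(lens != NULL);
	assert(validate_maxh2cdata(maxh2cdata) == 0);

	while (left > 0 && n < max_pdus) {
		/* data_length = min(h2cdata_left, maxh2cdata) */
		uint32_t len = left < maxh2cdata ? left : maxh2cdata;

		lens[n++] = len;
		left -= len;
	}
	return n;
}
```

For example, an R2T of 10000 bytes with MAXH2CDATA of 4096 yields three PDUs of 4096, 4096 and 1808 bytes.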
|
#
63573807 |
| 06-Feb-2022 |
Sagi Grimberg <sagi@grimberg.me> |
nvme-tcp: fix bogus request completion when failing to send AER
AER is not backed by a real request, hence we should not incorrectly assume that when failing to send a nvme command, it is a normal request but rather check if this is an aer and if so complete the aer (similar to the normal completion path).
Cc: stable@vger.kernel.org Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
ff9fc7eb |
| 01-Feb-2022 |
Sagi Grimberg <sagi@grimberg.me> |
nvme-tcp: fix possible use-after-free in transport error_recovery work
nvme_tcp_submit_async_event_work checks the ctrl and queue state before preparing the AER command and scheduling io_work, but that check alone is not reliable. To fully prevent the race, the error recovery work must flush async_event_work after setting the ctrl state to RESETTING and before continuing to destroy the admin queue, so that there is no race between .submit_async_event and the error recovery handler itself changing the ctrl state.
Tested-by: Chris Leech <cleech@redhat.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
|
Revision tags: v5.4.173, v5.15.16, v5.15.15, v5.16, v5.15.10, v5.15.9, v5.15.8, v5.15.7, v5.15.6, v5.15.5, v5.15.4, v5.15.3, v5.15.2, v5.15.1 |
|
#
a5053c92 |
| 03-Nov-2021 |
Maurizio Lombardi <mlombard@redhat.com> |
nvme-tcp: fix memory leak when freeing a queue
Release the page frag cache when tearing down the io queues
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: John Meneghini <jmeneghi@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
1d3ef9c3 |
| 23-Nov-2021 |
Varun Prakash <varun@chelsio.com> |
nvme-tcp: validate R2T PDU in nvme_tcp_handle_r2t()
If maxh2cdata < r2t_length, the driver will form multiple H2CData PDUs. Validate the R2T PDU in nvme_tcp_handle_r2t() to reuse nvme_tcp_setup_h2c_data_pdu().
Also set req->state to NVME_TCP_SEND_H2C_PDU in nvme_tcp_setup_h2c_data_pdu().
Signed-off-by: Varun Prakash <varun@chelsio.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
|