#
7a2294c5 |
| 23-Jun-2022 |
Ruozhu Li <liruozhu@huawei.com> |
nvme: fix regression when disconnect a recovering ctrl
[ Upstream commit f7f70f4aa09dc43d7455c060143e86a017c30548 ]
We encountered a problem that the disconnect command hangs. After analyzing the l
nvme: fix regression when disconnect a recovering ctrl
[ Upstream commit f7f70f4aa09dc43d7455c060143e86a017c30548 ]
We encountered a problem that the disconnect command hangs. After analyzing the log and stack, we found that the triggering process is as follows: CPU0 CPU1 nvme_rdma_error_recovery_work nvme_rdma_teardown_io_queues nvme_do_delete_ctrl nvme_stop_queues nvme_remove_namespaces --clear ctrl->namespaces nvme_start_queues --no ns in ctrl->namespaces nvme_ns_remove return(because ctrl is deleting) blk_freeze_queue blk_mq_freeze_queue_wait --wait for ns to unquiesce to clean infligt IO, hang forever
This problem was not found in older kernels because we will flush err work in nvme_stop_ctrl before nvme_remove_namespaces.It does not seem to be modified for functional reasons, the patch can be revert to solve the problem.
Revert commit 794a4cb3d2f7 ("nvme: remove the .stop_ctrl callout")
Signed-off-by: Ruozhu Li <liruozhu@huawei.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
1e4427aa |
| 26-Jun-2022 |
Sagi Grimberg <sagi@grimberg.me> |
nvme-tcp: always fail a request when sending it failed
[ Upstream commit 41d07df7de841bfbc32725ce21d933ad358f2844 ]
queue stoppage and inflight requests cancellation is fully fenced from io_work an
nvme-tcp: always fail a request when sending it failed
[ Upstream commit 41d07df7de841bfbc32725ce21d933ad358f2844 ]
queue stoppage and inflight requests cancellation is fully fenced from io_work and thus failing a request from this context. Hence we don't need to try to guess from the socket retcode if this failure is because the queue is about to be torn down or not.
We are perfectly safe to just fail it, the request will not be cancelled later on.
This solves possible very long shutdown delays when the users issues a 'nvme disconnect-all'
Reported-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
7a2294c5 |
| 23-Jun-2022 |
Ruozhu Li <liruozhu@huawei.com> |
nvme: fix regression when disconnect a recovering ctrl
[ Upstream commit f7f70f4aa09dc43d7455c060143e86a017c30548 ]
We encountered a problem that the disconnect command hangs. After analyzing the l
nvme: fix regression when disconnect a recovering ctrl
[ Upstream commit f7f70f4aa09dc43d7455c060143e86a017c30548 ]
We encountered a problem that the disconnect command hangs. After analyzing the log and stack, we found that the triggering process is as follows: CPU0 CPU1 nvme_rdma_error_recovery_work nvme_rdma_teardown_io_queues nvme_do_delete_ctrl nvme_stop_queues nvme_remove_namespaces --clear ctrl->namespaces nvme_start_queues --no ns in ctrl->namespaces nvme_ns_remove return(because ctrl is deleting) blk_freeze_queue blk_mq_freeze_queue_wait --wait for ns to unquiesce to clean infligt IO, hang forever
This problem was not found in older kernels because we will flush err work in nvme_stop_ctrl before nvme_remove_namespaces.It does not seem to be modified for functional reasons, the patch can be revert to solve the problem.
Revert commit 794a4cb3d2f7 ("nvme: remove the .stop_ctrl callout")
Signed-off-by: Ruozhu Li <liruozhu@huawei.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
1e4427aa |
| 26-Jun-2022 |
Sagi Grimberg <sagi@grimberg.me> |
nvme-tcp: always fail a request when sending it failed
[ Upstream commit 41d07df7de841bfbc32725ce21d933ad358f2844 ]
queue stoppage and inflight requests cancellation is fully fenced from io_work an
nvme-tcp: always fail a request when sending it failed
[ Upstream commit 41d07df7de841bfbc32725ce21d933ad358f2844 ]
queue stoppage and inflight requests cancellation is fully fenced from io_work and thus failing a request from this context. Hence we don't need to try to guess from the socket retcode if this failure is because the queue is about to be torn down or not.
We are perfectly safe to just fail it, the request will not be cancelled later on.
This solves possible very long shutdown delays when the users issues a 'nvme disconnect-all'
Reported-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
7a2294c5 |
| 23-Jun-2022 |
Ruozhu Li <liruozhu@huawei.com> |
nvme: fix regression when disconnect a recovering ctrl
[ Upstream commit f7f70f4aa09dc43d7455c060143e86a017c30548 ]
We encountered a problem that the disconnect command hangs. After analyzing the l
nvme: fix regression when disconnect a recovering ctrl
[ Upstream commit f7f70f4aa09dc43d7455c060143e86a017c30548 ]
We encountered a problem that the disconnect command hangs. After analyzing the log and stack, we found that the triggering process is as follows: CPU0 CPU1 nvme_rdma_error_recovery_work nvme_rdma_teardown_io_queues nvme_do_delete_ctrl nvme_stop_queues nvme_remove_namespaces --clear ctrl->namespaces nvme_start_queues --no ns in ctrl->namespaces nvme_ns_remove return(because ctrl is deleting) blk_freeze_queue blk_mq_freeze_queue_wait --wait for ns to unquiesce to clean infligt IO, hang forever
This problem was not found in older kernels because we will flush err work in nvme_stop_ctrl before nvme_remove_namespaces.It does not seem to be modified for functional reasons, the patch can be revert to solve the problem.
Revert commit 794a4cb3d2f7 ("nvme: remove the .stop_ctrl callout")
Signed-off-by: Ruozhu Li <liruozhu@huawei.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
1e4427aa |
| 26-Jun-2022 |
Sagi Grimberg <sagi@grimberg.me> |
nvme-tcp: always fail a request when sending it failed
[ Upstream commit 41d07df7de841bfbc32725ce21d933ad358f2844 ]
queue stoppage and inflight requests cancellation is fully fenced from io_work an
nvme-tcp: always fail a request when sending it failed
[ Upstream commit 41d07df7de841bfbc32725ce21d933ad358f2844 ]
queue stoppage and inflight requests cancellation is fully fenced from io_work and thus failing a request from this context. Hence we don't need to try to guess from the socket retcode if this failure is because the queue is about to be torn down or not.
We are perfectly safe to just fail it, the request will not be cancelled later on.
This solves possible very long shutdown delays when the users issues a 'nvme disconnect-all'
Reported-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
7a2294c5 |
| 23-Jun-2022 |
Ruozhu Li <liruozhu@huawei.com> |
nvme: fix regression when disconnect a recovering ctrl
[ Upstream commit f7f70f4aa09dc43d7455c060143e86a017c30548 ]
We encountered a problem that the disconnect command hangs. After analyzing the l
nvme: fix regression when disconnect a recovering ctrl
[ Upstream commit f7f70f4aa09dc43d7455c060143e86a017c30548 ]
We encountered a problem that the disconnect command hangs. After analyzing the log and stack, we found that the triggering process is as follows: CPU0 CPU1 nvme_rdma_error_recovery_work nvme_rdma_teardown_io_queues nvme_do_delete_ctrl nvme_stop_queues nvme_remove_namespaces --clear ctrl->namespaces nvme_start_queues --no ns in ctrl->namespaces nvme_ns_remove return(because ctrl is deleting) blk_freeze_queue blk_mq_freeze_queue_wait --wait for ns to unquiesce to clean infligt IO, hang forever
This problem was not found in older kernels because we will flush err work in nvme_stop_ctrl before nvme_remove_namespaces.It does not seem to be modified for functional reasons, the patch can be revert to solve the problem.
Revert commit 794a4cb3d2f7 ("nvme: remove the .stop_ctrl callout")
Signed-off-by: Ruozhu Li <liruozhu@huawei.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
1e4427aa |
| 26-Jun-2022 |
Sagi Grimberg <sagi@grimberg.me> |
nvme-tcp: always fail a request when sending it failed
[ Upstream commit 41d07df7de841bfbc32725ce21d933ad358f2844 ]
queue stoppage and inflight requests cancellation is fully fenced from io_work an
nvme-tcp: always fail a request when sending it failed
[ Upstream commit 41d07df7de841bfbc32725ce21d933ad358f2844 ]
queue stoppage and inflight requests cancellation is fully fenced from io_work and thus failing a request from this context. Hence we don't need to try to guess from the socket retcode if this failure is because the queue is about to be torn down or not.
We are perfectly safe to just fail it, the request will not be cancelled later on.
This solves possible very long shutdown delays when the users issues a 'nvme disconnect-all'
Reported-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
7a2294c5 |
| 23-Jun-2022 |
Ruozhu Li <liruozhu@huawei.com> |
nvme: fix regression when disconnect a recovering ctrl
[ Upstream commit f7f70f4aa09dc43d7455c060143e86a017c30548 ]
We encountered a problem that the disconnect command hangs. After analyzing the l
nvme: fix regression when disconnect a recovering ctrl
[ Upstream commit f7f70f4aa09dc43d7455c060143e86a017c30548 ]
We encountered a problem that the disconnect command hangs. After analyzing the log and stack, we found that the triggering process is as follows: CPU0 CPU1 nvme_rdma_error_recovery_work nvme_rdma_teardown_io_queues nvme_do_delete_ctrl nvme_stop_queues nvme_remove_namespaces --clear ctrl->namespaces nvme_start_queues --no ns in ctrl->namespaces nvme_ns_remove return(because ctrl is deleting) blk_freeze_queue blk_mq_freeze_queue_wait --wait for ns to unquiesce to clean infligt IO, hang forever
This problem was not found in older kernels because we will flush err work in nvme_stop_ctrl before nvme_remove_namespaces.It does not seem to be modified for functional reasons, the patch can be revert to solve the problem.
Revert commit 794a4cb3d2f7 ("nvme: remove the .stop_ctrl callout")
Signed-off-by: Ruozhu Li <liruozhu@huawei.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
1e4427aa |
| 26-Jun-2022 |
Sagi Grimberg <sagi@grimberg.me> |
nvme-tcp: always fail a request when sending it failed
[ Upstream commit 41d07df7de841bfbc32725ce21d933ad358f2844 ]
queue stoppage and inflight requests cancellation is fully fenced from io_work an
nvme-tcp: always fail a request when sending it failed
[ Upstream commit 41d07df7de841bfbc32725ce21d933ad358f2844 ]
queue stoppage and inflight requests cancellation is fully fenced from io_work and thus failing a request from this context. Hence we don't need to try to guess from the socket retcode if this failure is because the queue is about to be torn down or not.
We are perfectly safe to just fail it, the request will not be cancelled later on.
This solves possible very long shutdown delays when the users issues a 'nvme disconnect-all'
Reported-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
ffe0c491 |
| 15-Feb-2022 |
Chris Leech <cleech@redhat.com> |
nvme-tcp: lockdep: annotate in-kernel sockets
[ Upstream commit 841aee4d75f18fdfb53935080b03de0c65e9b92c ]
Put NVMe/TCP sockets in their own class to avoid some lockdep warnings. Sockets created by
nvme-tcp: lockdep: annotate in-kernel sockets
[ Upstream commit 841aee4d75f18fdfb53935080b03de0c65e9b92c ]
Put NVMe/TCP sockets in their own class to avoid some lockdep warnings. Sockets created by nvme-tcp are not exposed to user-space, and will not trigger certain code paths that the general socket API exposes.
Lockdep complains about a circular dependency between the socket and filesystem locks, because setsockopt can trigger a page fault with a socket lock held, but nvme-tcp sends requests on the socket while file system locks are held.
====================================================== WARNING: possible circular locking dependency detected 5.15.0-rc3 #1 Not tainted ------------------------------------------------------ fio/1496 is trying to acquire lock: (sk_lock-AF_INET){+.+.}-{0:0}, at: tcp_sendpage+0x23/0x80
but task is already holding lock: (&xfs_dir_ilock_class/5){+.+.}-{3:3}, at: xfs_ilock+0xcf/0x290 [xfs]
which lock already depends on the new lock.
other info that might help us debug this:
chain exists of: sk_lock-AF_INET --> sb_internal --> &xfs_dir_ilock_class/5
Possible unsafe locking scenario:
CPU0 CPU1 ---- ---- lock(&xfs_dir_ilock_class/5); lock(sb_internal); lock(&xfs_dir_ilock_class/5); lock(sk_lock-AF_INET);
*** DEADLOCK ***
6 locks held by fio/1496: #0: (sb_writers#13){.+.+}-{0:0}, at: path_openat+0x9fc/0xa20 #1: (&inode->i_sb->s_type->i_mutex_dir_key){++++}-{3:3}, at: path_openat+0x296/0xa20 #2: (sb_internal){.+.+}-{0:0}, at: xfs_trans_alloc_icreate+0x41/0xd0 [xfs] #3: (&xfs_dir_ilock_class/5){+.+.}-{3:3}, at: xfs_ilock+0xcf/0x290 [xfs] #4: (hctx->srcu){....}-{0:0}, at: hctx_lock+0x51/0xd0 #5: (&queue->send_mutex){+.+.}-{3:3}, at: nvme_tcp_queue_rq+0x33e/0x380 [nvme_tcp]
This annotation lets lockdep analyze nvme-tcp controlled sockets independently of what the user-space sockets API does.
Link: https://lore.kernel.org/linux-nvme/CAHj4cs9MDYLJ+q+2_GXUK9HxFizv2pxUryUR0toX974M040z7g@mail.gmail.com/
Signed-off-by: Chris Leech <cleech@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
5e42fca3 |
| 01-Feb-2022 |
Sagi Grimberg <sagi@grimberg.me> |
nvme-tcp: fix possible use-after-free in transport error_recovery work
[ Upstream commit ff9fc7ebf5c06de1ef72a69f9b1ab40af8b07f9e ]
While nvme_tcp_submit_async_event_work is checking the ctrl and q
nvme-tcp: fix possible use-after-free in transport error_recovery work
[ Upstream commit ff9fc7ebf5c06de1ef72a69f9b1ab40af8b07f9e ]
While nvme_tcp_submit_async_event_work is checking the ctrl and queue state before preparing the AER command and scheduling io_work, in order to fully prevent a race where this check is not reliable the error recovery work must flush async_event_work before continuing to destroy the admin queue after setting the ctrl state to RESETTING such that there is no race .submit_async_event and the error recovery handler itself changing the ctrl state.
Tested-by: Chris Leech <cleech@redhat.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
64c37c05 |
| 06-Feb-2022 |
Sagi Grimberg <sagi@grimberg.me> |
nvme-tcp: fix bogus request completion when failing to send AER
commit 63573807b27e0faf8065a28b1bbe1cbfb23c0130 upstream.
AER is not backed by a real request, hence we should not incorrectly assume
nvme-tcp: fix bogus request completion when failing to send AER
commit 63573807b27e0faf8065a28b1bbe1cbfb23c0130 upstream.
AER is not backed by a real request, hence we should not incorrectly assume that when failing to send a nvme command, it is a normal request but rather check if this is an aer and if so complete the aer (similar to the normal completion path).
Cc: stable@vger.kernel.org Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
d89b9f3b |
| 25-Oct-2021 |
Varun Prakash <varun@chelsio.com> |
nvme-tcp: fix data digest pointer calculation
ddgst is of type __le32, &req->ddgst + req->offset increases &req->ddgst by 4 * req->offset, fix this by type casting &req->ddgst to u8 *.
Fixes: 3f230
nvme-tcp: fix data digest pointer calculation
ddgst is of type __le32, &req->ddgst + req->offset increases &req->ddgst by 4 * req->offset, fix this by type casting &req->ddgst to u8 *.
Fixes: 3f2304f8c6d6 ("nvme-tcp: add NVMe over TCP host driver") Signed-off-by: Varun Prakash <varun@chelsio.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
show more ...
|
#
ce7723e9 |
| 26-Oct-2021 |
Varun Prakash <varun@chelsio.com> |
nvme-tcp: fix possible req->offset corruption
With commit db5ad6b7f8cd ("nvme-tcp: try to send request in queue_rq context") r2t and response PDU can get processed while send function is executing.
nvme-tcp: fix possible req->offset corruption
With commit db5ad6b7f8cd ("nvme-tcp: try to send request in queue_rq context") r2t and response PDU can get processed while send function is executing.
Current data digest send code uses req->offset after kernel_sendmsg(), this creates a race condition where req->offset gets reset before it is used in send function.
This can happen in two cases - 1. Target sends r2t PDU which resets req->offset. 2. Target send response PDU which completes the req and then req is used for a new command, nvme_tcp_setup_cmd_pdu() resets req->offset.
Fix this by storing req->offset in a local variable and using this local variable after kernel_sendmsg().
Fixes: db5ad6b7f8cd ("nvme-tcp: try to send request in queue_rq context") Signed-off-by: Varun Prakash <varun@chelsio.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
show more ...
|
#
25e1f67e |
| 24-Oct-2021 |
Sagi Grimberg <sagi@grimberg.me> |
nvme-tcp: fix H2CData PDU send accounting (again)
We should not access request members after the last send, even to determine if indeed it was the last data payload send. The reason is that a comple
nvme-tcp: fix H2CData PDU send accounting (again)
We should not access request members after the last send, even to determine if indeed it was the last data payload send. The reason is that a completion could have arrived and trigger a new execution of the request which overridden these members. This was fixed by commit 825619b09ad3 ("nvme-tcp: fix possible use-after-completion").
Commit e371af033c56 broke that assumption again to address cases where multiple r2t pdus are sent per request. To fix it, we need to record the request data_sent and data_len and after the payload network send we reference these counters to determine weather we should advance the request iterator.
Fixes: e371af033c56 ("nvme-tcp: fix incorrect h2cdata pdu offset accounting") Reported-by: Keith Busch <kbusch@kernel.org> Cc: stable@vger.kernel.org # 5.10+ Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
show more ...
|
Revision tags: v5.14.6, v5.10.67, v5.10.66, v5.14.5, v5.14.4, v5.10.65 |
|
#
e371af03 |
| 14-Sep-2021 |
Sagi Grimberg <sagi@grimberg.me> |
nvme-tcp: fix incorrect h2cdata pdu offset accounting
When the controller sends us multiple r2t PDUs in a single request we need to account for it correctly as our send/recv context run concurrently
nvme-tcp: fix incorrect h2cdata pdu offset accounting
When the controller sends us multiple r2t PDUs in a single request we need to account for it correctly as our send/recv context run concurrently (i.e. we get a new r2t with r2t_offset before we updated our iterator and req->data_sent marker). This can cause wrong offsets to be sent to the controller.
To fix that, we will first know that this may happen only in the send sequence of the last page, hence we will take the r2t_offset to the h2c PDU data_offset, and in nvme_tcp_try_send_data loop, we make sure to increment the request markers also when we completed a PDU but we are expecting more r2t PDUs as we still did not send the entire data of the request.
Fixes: 825619b09ad3 ("nvme-tcp: fix possible use-after-completion") Reported-by: Nowak, Lukasz <Lukasz.Nowak@Dell.com> Tested-by: Nowak, Lukasz <Lukasz.Nowak@Dell.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
show more ...
|
Revision tags: v5.14.3, v5.10.64 |
|
#
70f437fb |
| 09-Sep-2021 |
Keith Busch <kbusch@kernel.org> |
nvme-tcp: fix io_work priority inversion
Dispatching requests inline with the .queue_rq() call may block while holding the send_mutex. If the tcp io_work also happens to schedule, it may see the req
nvme-tcp: fix io_work priority inversion
Dispatching requests inline with the .queue_rq() call may block while holding the send_mutex. If the tcp io_work also happens to schedule, it may see the req_list is non-empty, leaving "pending" true and remaining in TASK_RUNNING. Since io_work is of higher scheduling priority, the .queue_rq task may not get a chance to run, blocking forward progress and leading to io timeouts.
Instead of checking for pending requests within io_work, let the queueing restart io_work outside the send_mutex lock if there is more work to be done.
Fixes: a0fdd1418007f ("nvme-tcp: rerun io_work if req_list is not empty") Reported-by: Samuel Jones <sjones@kalrayinc.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
show more ...
|
Revision tags: v5.14.2, v5.10.63, v5.14.1, v5.10.62 |
|
#
1ba2e507 |
| 30-Aug-2021 |
Daniel Wagner <dwagner@suse.de> |
nvme-tcp: Do not reset transport on data digest errors
The spec says
7.4.6.1 Digest Error handling
When a host detects a data digest error in a C2HData PDU, that host shall continue processi
nvme-tcp: Do not reset transport on data digest errors
The spec says
7.4.6.1 Digest Error handling
When a host detects a data digest error in a C2HData PDU, that host shall continue processing C2HData PDUs associated with the command and when the command processing has completed, if a successful status was returned by the controller, the host shall fail the command with a non-fatal transport error.
Currently the transport is reseted when a data digest error is detected. Instead, when a digest error is detected, mark the final status as NVME_SC_DATA_XFER_ERROR and let the upper layer handle the error.
In order to keep track of the final result maintain a status field in nvme_tcp_request object and use it to overwrite the completion queue status (which might be successful even though a digest error has been detected) when completing the request.
Signed-off-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
show more ...
|
Revision tags: v5.14, v5.10.61, v5.10.60 |
|
#
664227fd |
| 06-Aug-2021 |
Ruozhu Li <liruozhu@huawei.com> |
nvme-tcp: don't update queue count when failing to set io queues
We update ctrl->queue_count and schedule another reconnect when io queue count is zero.But we will never try to create any io queue i
nvme-tcp: don't update queue count when failing to set io queues
We update ctrl->queue_count and schedule another reconnect when io queue count is zero.But we will never try to create any io queue in next reco- nnection, because ctrl->queue_count already set to zero.We will end up having an admin-only session in Live state, which is exactly what we try to avoid in the original patch. Update ctrl->queue_count after queue_count zero checking to fix it.
Signed-off-by: Ruozhu Li <liruozhu@huawei.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
show more ...
|
#
d48f92cd |
| 06-Aug-2021 |
Keith Busch <kbusch@kernel.org> |
nvme-tcp: pair send_mutex init with destroy
Each mutex_init() should have a corresponding mutex_destroy().
Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.m
nvme-tcp: pair send_mutex init with destroy
Each mutex_init() should have a corresponding mutex_destroy().
Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
show more ...
|
Revision tags: v5.10.53, v5.10.52, v5.10.51, v5.10.50, v5.10.49, v5.13, v5.10.46 |
|
#
e7006de6 |
| 16-Jun-2021 |
Sagi Grimberg <sagi@grimberg.me> |
nvme: code command_id with a genctr for use-after-free validation
We cannot detect a (perhaps buggy) controller that is sending us a completion for a request that was already completed (for example
nvme: code command_id with a genctr for use-after-free validation
We cannot detect a (perhaps buggy) controller that is sending us a completion for a request that was already completed (for example sending a completion twice), this phenomenon was seen in the wild a few times.
So to protect against this, we use the upper 4 msbits of the nvme sqe command_id to use as a 4-bit generation counter and verify it matches the existing request generation that is incrementing on every execution.
The 16-bit command_id structure now is constructed by: | xxxx | xxxxxxxxxxxx | gen request tag
This means that we are giving up some possible queue depth as 12 bits allow for a maximum queue depth of 4095 instead of 65536, however we never create such long queues anyways so no real harm done.
Suggested-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Acked-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Daniel Wagner <dwagner@suse.de> Tested-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
show more ...
|
#
3b01a9d0 |
| 16-Jun-2021 |
Sagi Grimberg <sagi@grimberg.me> |
nvme-tcp: don't check blk_mq_tag_to_rq when receiving pdu data
We already validate it when receiving the c2hdata pdu header and this is not changing so this is a redundant check.
Reviewed-by: Hanne
nvme-tcp: don't check blk_mq_tag_to_rq when receiving pdu data
We already validate it when receiving the c2hdata pdu header and this is not changing so this is a redundant check.
Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
show more ...
|
#
8b43ced6 |
| 13-Jul-2021 |
Prabhakar Kushwaha <pkushwaha@marvell.com> |
nvme-tcp: use __dev_get_by_name instead dev_get_by_name for OPT_HOST_IFACE
dev_get_by_name() finds network device by name but it also increases the reference count.
If a nvme-tcp queue is present a
nvme-tcp: use __dev_get_by_name instead dev_get_by_name for OPT_HOST_IFACE
dev_get_by_name() finds network device by name but it also increases the reference count.
If a nvme-tcp queue is present and the network device driver is removed before nvme_tcp, we will face the following continuous log:
"kernel:unregister_netdevice: waiting for <eth> to become free. Usage count = 2"
And rmmod further halts. Similar case arises during reboot/shutdown with nvme-tcp queue present and both never completes.
To fix this, use __dev_get_by_name() which finds network device by name without increasing any reference counter.
Fixes: 3ede8f72a9a2 ("nvme-tcp: allow selecting the network interface for connections") Signed-off-by: Omkar Kulkarni <okulkarni@marvell.com> Signed-off-by: Shai Malin <smalin@marvell.com> Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> [hch: remove the ->ndev member entirely] Signed-off-by: Christoph Hellwig <hch@lst.de>
show more ...
|
#
be42a33b |
| 10-Jun-2021 |
Keith Busch <kbusch@kernel.org> |
nvme: use blk_execute_rq() for passthrough commands
The generic blk_execute_rq() knows how to handle polled completions. Use that instead of implementing an nvme specific handler.
Signed-off-by: Ke
nvme: use blk_execute_rq() for passthrough commands
The generic blk_execute_rq() knows how to handle polled completions. Use that instead of implementing an nvme specific handler.
Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Link: https://lore.kernel.org/r/20210610214437.641245-3-kbusch@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|