History log of /openbmc/linux/drivers/nvme/target/rdma.c (Results 1 – 25 of 164)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45, v6.1.44, v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35, v6.1.34, v6.1.33, v6.1.32, v6.1.31, v6.1.30, v6.1.29, v6.1.28, v6.1.27, v6.1.26, v6.3, v6.1.25, v6.1.24, v6.1.23, v6.1.22, v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9, v6.1.8, v6.1.7, v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12, v6.0.11, v6.0.10, v5.15.80, v6.0.9, v5.15.79, v6.0.8, v5.15.78, v6.0.7, v5.15.77, v5.15.76, v6.0.6, v6.0.5, v5.15.75, v6.0.4, v6.0.3, v6.0.2, v5.15.74, v5.15.73, v6.0.1, v5.15.72, v6.0, v5.15.71, v5.15.70, v5.15.69, v5.15.68, v5.15.67, v5.15.66, v5.15.65, v5.15.64, v5.15.63, v5.15.62, v5.15.61, v5.15.60, v5.15.59, v5.19, v5.15.58, v5.15.57, v5.15.56, v5.15.55, v5.15.54
# 495758bb 08-Jul-2022 Logan Gunthorpe <logang@deltatee.com>

RDMA/core: introduce ib_dma_pci_p2p_dma_supported()

Introduce the helper function ib_dma_pci_p2p_dma_supported() to check
if a given ib_device can be used in P2PDMA transfers. This ensures
the ib_de

RDMA/core: introduce ib_dma_pci_p2p_dma_supported()

Introduce the helper function ib_dma_pci_p2p_dma_supported() to check
if a given ib_device can be used in P2PDMA transfers. This ensures
the ib_device is not using virt_dma and also that the underlying
dma_device supports P2PDMA.

Use the new helper in nvme-rdma to replace the existing check for
ib_uses_virt_dma(). Adding the dma_pci_p2pdma_supported() check allows
switching away from pci_p2pdma_[un]map_sg().

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>

show more ...


Revision tags: v5.15.53, v5.15.52, v5.15.51, v5.15.50, v5.15.49, v5.15.48, v5.15.47, v5.15.46, v5.15.45, v5.15.44, v5.15.43, v5.15.42, v5.18, v5.15.41, v5.15.40, v5.15.39, v5.15.38, v5.15.37, v5.15.36, v5.15.35, v5.15.34, v5.15.33
# e945c653 04-Apr-2022 Jason Gunthorpe <jgg@nvidia.com>

RDMA: Split kernel-only global device caps from uverbs device caps

Split out flags from ib_device::device_cap_flags that are only used
internally to the kernel into kernel_cap_flags that is not part

RDMA: Split kernel-only global device caps from uverbs device caps

Split out flags from ib_device::device_cap_flags that are only used
internally to the kernel into kernel_cap_flags that is not part of the
uapi. This limits the device_cap_flags to being the same bitmap that will
be copied to userspace.

This cleanly splits out the uverbs flags from the kernel flags to avoid
confusion in the flags bitmap.

Add some short comments describing which each of the kernel flags is
connected to. Remove unused kernel flags.

Link: https://lore.kernel.org/r/0-v2-22c19e565eef+139a-kern_caps_jgg@nvidia.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

show more ...


Revision tags: v5.15.32, v5.15.31
# 8832cf92 21-Mar-2022 Sagi Grimberg <sagi@grimberg.me>

nvmet: use a private workqueue instead of the system workqueue

Any attempt to flush kernel-global WQs has possibility of deadlock
so we should simply stop using them, instead introduce nvmet_wq
whic

nvmet: use a private workqueue instead of the system workqueue

Any attempt to flush kernel-global WQs has possibility of deadlock
so we should simply stop using them, instead introduce nvmet_wq
which is the generic nvmet workqueue for work elements that
don't explicitly require a dedicated workqueue (by the mere fact
that they are using the system_wq).

Changes were done using the following replaces:

- s/schedule_work(/queue_work(nvmet_wq, /g
- s/schedule_delayed_work(/queue_delayed_work(nvmet_wq, /g
- s/flush_scheduled_work()/flush_workqueue(nvmet_wq)/g

Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

show more ...


Revision tags: v5.17, v5.15.30, v5.15.29, v5.15.28
# a8adf0cd 08-Mar-2022 Chaitanya Kulkarni <kch@nvidia.com>

nvmet-rdma: fix kernel-doc warning for nvmet_rdma_device_removal

This fixes following kernel-doc warning:-

drivers/nvme/target/rdma.c:1722: warning: expecting prototype for nvme_rdma_device_removal

nvmet-rdma: fix kernel-doc warning for nvmet_rdma_device_removal

This fixes following kernel-doc warning:-

drivers/nvme/target/rdma.c:1722: warning: expecting prototype for nvme_rdma_device_removal(). Prototype was for nvmet_rdma_device_removal() instead

Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

show more ...


Revision tags: v5.15.27, v5.15.26, v5.15.25, v5.15.24
# 7c256639 14-Feb-2022 Sagi Grimberg <sagi@grimberg.me>

nvmet-rdma: replace ida_simple[get|remove] with the simler ida_[alloc|free]

ida_simple_[get|remove] are wrappers anyways.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Keith Busch <k

nvmet-rdma: replace ida_simple[get|remove] with the simler ida_[alloc|free]

ida_simple_[get|remove] are wrappers anyways.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

show more ...


Revision tags: v5.15.23, v5.15.22, v5.15.21, v5.15.20, v5.15.19, v5.15.18, v5.15.17, v5.4.173, v5.15.16, v5.15.15, v5.16, v5.15.10, v5.15.9, v5.15.8, v5.15.7, v5.15.6, v5.15.5, v5.15.4, v5.15.3, v5.15.2, v5.15.1, v5.15, v5.14.14, v5.14.13, v5.14.12, v5.14.11, v5.14.10, v5.14.9, v5.14.8
# c7d792f9 22-Sep-2021 Max Gurtovoy <mgurtovoy@nvidia.com>

nvmet-rdma: implement get_max_queue_size controller op

Limit the maximal queue size for RDMA controllers. Today, the target
reports a limit of 1024 and this limit isn't valid for some of the RDMA
ba

nvmet-rdma: implement get_max_queue_size controller op

Limit the maximal queue size for RDMA controllers. Today, the target
reports a limit of 1024 and this limit isn't valid for some of the RDMA
based controllers. For now, limit RDMA transport to 128 entries (the
max queue depth configured for Linux NVMe/RDMA host).

Future general solution should use RDMA/core API to calculate this size
according to device capabilities and number of WRs needed per NVMe IO
request.

Reported-by: Mark Ruijter <mruijter@primelogic.nl>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

show more ...


# fcf73a80 06-Oct-2021 Israel Rukshin <israelr@nvidia.com>

nvmet-rdma: fix use-after-free when a port is removed

When removing a port, all its controllers are being removed, but there
are queues on the port that doesn't belong to any controller (during
conn

nvmet-rdma: fix use-after-free when a port is removed

When removing a port, all its controllers are being removed, but there
are queues on the port that doesn't belong to any controller (during
connection time). This causes a use-after-free bug for any command
that dereferences req->port (like in nvmet_alloc_ctrl). Those queues
should be destroyed before freeing the port via configfs. Destroy the
remaining queues after the RDMA-CM was destroyed guarantees that no
new queue will be created.

Signed-off-by: Israel Rukshin <israelr@nvidia.com>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

show more ...


Revision tags: v5.14.7
# fe45e630 20-Sep-2021 Christoph Hellwig <hch@lst.de>

block: move integrity handling out of <linux/blkdev.h>

Split the integrity/metadata handling definitions out into a new header.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes T

block: move integrity handling out of <linux/blkdev.h>

Split the integrity/metadata handling definitions out into a new header.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20210920123328.1399408-17-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

show more ...


# d44ff3b1 21-Mar-2022 Sagi Grimberg <sagi@grimberg.me>

nvmet: use a private workqueue instead of the system workqueue

[ Upstream commit 8832cf922151e9dfa2821736beb0ae2dd3968b6e ]

Any attempt to flush kernel-global WQs has possibility of deadlock
so we

nvmet: use a private workqueue instead of the system workqueue

[ Upstream commit 8832cf922151e9dfa2821736beb0ae2dd3968b6e ]

Any attempt to flush kernel-global WQs has possibility of deadlock
so we should simply stop using them, instead introduce nvmet_wq
which is the generic nvmet workqueue for work elements that
don't explicitly require a dedicated workqueue (by the mere fact
that they are using the system_wq).

Changes were done using the following replaces:

- s/schedule_work(/queue_work(nvmet_wq, /g
- s/schedule_delayed_work(/queue_delayed_work(nvmet_wq, /g
- s/flush_scheduled_work()/flush_workqueue(nvmet_wq)/g

Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>

show more ...


# 3c82292e 06-Oct-2021 Israel Rukshin <israelr@nvidia.com>

nvmet-rdma: fix use-after-free when a port is removed

[ Upstream commit fcf73a804c7d6bbf0ea63531c6122aa363852e04 ]

When removing a port, all its controllers are being removed, but there
are queues

nvmet-rdma: fix use-after-free when a port is removed

[ Upstream commit fcf73a804c7d6bbf0ea63531c6122aa363852e04 ]

When removing a port, all its controllers are being removed, but there
are queues on the port that doesn't belong to any controller (during
connection time). This causes a use-after-free bug for any command
that dereferences req->port (like in nvmet_alloc_ctrl). Those queues
should be destroyed before freeing the port via configfs. Destroy the
remaining queues after the RDMA-CM was destroyed guarantees that no
new queue will be created.

Signed-off-by: Israel Rukshin <israelr@nvidia.com>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>

show more ...


Revision tags: v5.14.6, v5.10.67, v5.10.66, v5.14.5, v5.14.4, v5.10.65, v5.14.3, v5.10.64, v5.14.2, v5.10.63, v5.14.1, v5.10.62, v5.14, v5.10.61, v5.10.60, v5.10.53, v5.10.52, v5.10.51, v5.10.50, v5.10.49, v5.13, v5.10.46
# 8abd7e2a 16-Jun-2021 Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>

nvmet: remove zeroout memset call for struct

Declare and initialize structure variables to zero values so that we can
remove zeroout memset calls in the target/rdma.c.

Signed-off-by: Chaitanya Kulk

nvmet: remove zeroout memset call for struct

Declare and initialize structure variables to zero values so that we can
remove zeroout memset calls in the target/rdma.c.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

show more ...


Revision tags: v5.10.43, v5.10.42, v5.10.41, v5.10.40, v5.10.39, v5.4.119, v5.10.36, v5.10.35
# 8cc365f9 06-May-2021 Michal Kalderon <michal.kalderon@marvell.com>

nvmet-rdma: Fix NULL deref when SEND is completed with error

When running some traffic and taking down the link on peer, a
retry counter exceeded error is received. This leads to
nvmet_rdma_error_co

nvmet-rdma: Fix NULL deref when SEND is completed with error

When running some traffic and taking down the link on peer, a
retry counter exceeded error is received. This leads to
nvmet_rdma_error_comp which tried accessing the cq_context to
obtain the queue. The cq_context is no longer valid after the
fix to use shared CQ mechanism and should be obtained similar
to how it is obtained in other functions from the wc->qp.

[ 905.786331] nvmet_rdma: SEND for CQE 0x00000000e3337f90 failed with status transport retry counter exceeded (12).
[ 905.832048] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
[ 905.839919] PGD 0 P4D 0
[ 905.842464] Oops: 0000 1 SMP NOPTI
[ 905.846144] CPU: 13 PID: 1557 Comm: kworker/13:1H Kdump: loaded Tainted: G OE --------- - - 4.18.0-304.el8.x86_64 #1
[ 905.872135] RIP: 0010:nvmet_rdma_error_comp+0x5/0x1b [nvmet_rdma]
[ 905.878259] Code: 19 4f c0 e8 89 b3 a5 f6 e9 5b e0 ff ff 0f b7 75 14 4c 89 ea 48 c7 c7 08 1a 4f c0 e8 71 b3 a5 f6 e9 4b e0 ff ff 0f 1f 44 00 00 <48> 8b 47 48 48 85 c0 74 08 48 89 c7 e9 98 bf 49 00 e9 c3 e3 ff ff
[ 905.897135] RSP: 0018:ffffab601c45fe28 EFLAGS: 00010246
[ 905.902387] RAX: 0000000000000065 RBX: ffff9e729ea2f800 RCX: 0000000000000000
[ 905.909558] RDX: 0000000000000000 RSI: ffff9e72df9567c8 RDI: 0000000000000000
[ 905.916731] RBP: ffff9e729ea2b400 R08: 000000000000074d R09: 0000000000000074
[ 905.923903] R10: 0000000000000000 R11: ffffab601c45fcc0 R12: 0000000000000010
[ 905.931074] R13: 0000000000000000 R14: 0000000000000010 R15: ffff9e729ea2f400
[ 905.938247] FS: 0000000000000000(0000) GS:ffff9e72df940000(0000) knlGS:0000000000000000
[ 905.938249] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 905.950067] nvmet_rdma: SEND for CQE 0x00000000c7356cca failed with status transport retry counter exceeded (12).
[ 905.961855] CR2: 0000000000000048 CR3: 000000678d010004 CR4: 00000000007706e0
[ 905.961855] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 905.961856] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 905.961857] PKRU: 55555554
[ 906.010315] Call Trace:
[ 906.012778] __ib_process_cq+0x89/0x170 [ib_core]
[ 906.017509] ib_cq_poll_work+0x26/0x80 [ib_core]
[ 906.022152] process_one_work+0x1a7/0x360
[ 906.026182] ? create_worker+0x1a0/0x1a0
[ 906.030123] worker_thread+0x30/0x390
[ 906.033802] ? create_worker+0x1a0/0x1a0
[ 906.037744] kthread+0x116/0x130
[ 906.040988] ? kthread_flush_work_fn+0x10/0x10
[ 906.045456] ret_from_fork+0x1f/0x40

Fixes: ca0f1a8055be2 ("nvmet-rdma: use new shared CQ mechanism")
Signed-off-by: Shai Malin <smalin@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

show more ...


Revision tags: v5.10.34, v5.4.116, v5.10.33, v5.12, v5.10.32, v5.10.31, v5.10.30, v5.10.27, v5.10.26, v5.10.25, v5.10.24, v5.10.23
# abec6561 10-Mar-2021 Lv Yunlong <lyl2019@mail.ustc.edu.cn>

nvme-rdma: Fix a use after free in nvmet_rdma_write_data_done

In nvmet_rdma_write_data_done, rsp is recoverd by wc->wr_cqe and freed by
nvmet_rdma_release_rsp(). But after that, pr_info() used the f

nvme-rdma: Fix a use after free in nvmet_rdma_write_data_done

In nvmet_rdma_write_data_done, rsp is recoverd by wc->wr_cqe and freed by
nvmet_rdma_release_rsp(). But after that, pr_info() used the freed
chunk's member object and could leak the freed chunk address with
wc->wr_cqe by computing the offset.

Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn>
Signed-off-by: Christoph Hellwig <hch@lst.de>

show more ...


Revision tags: v5.10.22, v5.10.21, v5.10.20, v5.10.19, v5.4.101, v5.10.18, v5.10.17, v5.11, v5.10.16, v5.10.15, v5.10.14
# 7a846656 10-Jan-2021 Israel Rukshin <israelr@nvidia.com>

nvmet-rdma: Fix NULL deref when setting pi_enable and traddr INADDR_ANY

When setting port traddr to INADDR_ANY, the listening cm_id->device
is NULL. The associate IB device is known only when a conn

nvmet-rdma: Fix NULL deref when setting pi_enable and traddr INADDR_ANY

When setting port traddr to INADDR_ANY, the listening cm_id->device
is NULL. The associate IB device is known only when a connect request
event arrives, so checking T10-PI device capability should be done
at this stage.

Fixes: b09160c3996c ("nvmet-rdma: add metadata/T10-PI support")
Signed-off-by: Israel Rukshin <israelr@nvidia.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

show more ...


# 9ceb7863 05-Jan-2021 Israel Rukshin <israelr@nvidia.com>

nvmet-rdma: Fix list_del corruption on queue establishment failure

When a queue is in NVMET_RDMA_Q_CONNECTING state, it may has some
requests at rsp_wait_list. In case a disconnect occurs at this
st

nvmet-rdma: Fix list_del corruption on queue establishment failure

When a queue is in NVMET_RDMA_Q_CONNECTING state, it may has some
requests at rsp_wait_list. In case a disconnect occurs at this
state, no one will empty this list and will return the requests to
free_rsps list. Normally nvmet_rdma_queue_established() free those
requests after moving the queue to NVMET_RDMA_Q_LIVE state, but in
this case __nvmet_rdma_queue_disconnect() is called before. The
crash happens at nvmet_rdma_free_rsps() when calling
list_del(&rsp->free_list), because the request exists only at
the wait list. To fix the issue, simply clear rsp_wait_list when
destroying the queue.

Signed-off-by: Israel Rukshin <israelr@nvidia.com>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

show more ...


Revision tags: v5.10
# 5a7a9e03 06-Nov-2020 Christoph Hellwig <hch@lst.de>

RDMA/core: remove use of dma_virt_ops

Use the ib_dma_* helpers to skip the DMA translation instead. This
removes the last user if dma_virt_ops and keeps the weird layering
violation inside the RDMA

RDMA/core: remove use of dma_virt_ops

Use the ib_dma_* helpers to skip the DMA translation instead. This
removes the last user if dma_virt_ops and keeps the weird layering
violation inside the RDMA core instead of burderning the DMA mapping
subsystems with it. This also means the software RDMA drivers now don't
have to mess with DMA parameters that are not relevant to them at all, and
that in the future we can use PCI P2P transfers even for software RDMA, as
there is no first fake layer of DMA mapping that the P2P DMA support.

Link: https://lore.kernel.org/r/20201106181941.1878556-8-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

show more ...


# 64f3410c 06-May-2021 Michal Kalderon <michal.kalderon@marvell.com>

nvmet-rdma: Fix NULL deref when SEND is completed with error

[ Upstream commit 8cc365f9559b86802afc0208389f5c8d46b4ad61 ]

When running some traffic and taking down the link on peer, a
retry counter

nvmet-rdma: Fix NULL deref when SEND is completed with error

[ Upstream commit 8cc365f9559b86802afc0208389f5c8d46b4ad61 ]

When running some traffic and taking down the link on peer, a
retry counter exceeded error is received. This leads to
nvmet_rdma_error_comp which tried accessing the cq_context to
obtain the queue. The cq_context is no longer valid after the
fix to use shared CQ mechanism and should be obtained similar
to how it is obtained in other functions from the wc->qp.

[ 905.786331] nvmet_rdma: SEND for CQE 0x00000000e3337f90 failed with status transport retry counter exceeded (12).
[ 905.832048] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
[ 905.839919] PGD 0 P4D 0
[ 905.842464] Oops: 0000 1 SMP NOPTI
[ 905.846144] CPU: 13 PID: 1557 Comm: kworker/13:1H Kdump: loaded Tainted: G OE --------- - - 4.18.0-304.el8.x86_64 #1
[ 905.872135] RIP: 0010:nvmet_rdma_error_comp+0x5/0x1b [nvmet_rdma]
[ 905.878259] Code: 19 4f c0 e8 89 b3 a5 f6 e9 5b e0 ff ff 0f b7 75 14 4c 89 ea 48 c7 c7 08 1a 4f c0 e8 71 b3 a5 f6 e9 4b e0 ff ff 0f 1f 44 00 00 <48> 8b 47 48 48 85 c0 74 08 48 89 c7 e9 98 bf 49 00 e9 c3 e3 ff ff
[ 905.897135] RSP: 0018:ffffab601c45fe28 EFLAGS: 00010246
[ 905.902387] RAX: 0000000000000065 RBX: ffff9e729ea2f800 RCX: 0000000000000000
[ 905.909558] RDX: 0000000000000000 RSI: ffff9e72df9567c8 RDI: 0000000000000000
[ 905.916731] RBP: ffff9e729ea2b400 R08: 000000000000074d R09: 0000000000000074
[ 905.923903] R10: 0000000000000000 R11: ffffab601c45fcc0 R12: 0000000000000010
[ 905.931074] R13: 0000000000000000 R14: 0000000000000010 R15: ffff9e729ea2f400
[ 905.938247] FS: 0000000000000000(0000) GS:ffff9e72df940000(0000) knlGS:0000000000000000
[ 905.938249] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 905.950067] nvmet_rdma: SEND for CQE 0x00000000c7356cca failed with status transport retry counter exceeded (12).
[ 905.961855] CR2: 0000000000000048 CR3: 000000678d010004 CR4: 00000000007706e0
[ 905.961855] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 905.961856] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 905.961857] PKRU: 55555554
[ 906.010315] Call Trace:
[ 906.012778] __ib_process_cq+0x89/0x170 [ib_core]
[ 906.017509] ib_cq_poll_work+0x26/0x80 [ib_core]
[ 906.022152] process_one_work+0x1a7/0x360
[ 906.026182] ? create_worker+0x1a0/0x1a0
[ 906.030123] worker_thread+0x30/0x390
[ 906.033802] ? create_worker+0x1a0/0x1a0
[ 906.037744] kthread+0x116/0x130
[ 906.040988] ? kthread_flush_work_fn+0x10/0x10
[ 906.045456] ret_from_fork+0x1f/0x40

Fixes: ca0f1a8055be2 ("nvmet-rdma: use new shared CQ mechanism")
Signed-off-by: Shai Malin <smalin@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>

show more ...


# 8f0534c9 10-Mar-2021 Lv Yunlong <lyl2019@mail.ustc.edu.cn>

nvme-rdma: Fix a use after free in nvmet_rdma_write_data_done

[ Upstream commit abec6561fc4e0fbb19591a0b35676d8c783b5493 ]

In nvmet_rdma_write_data_done, rsp is recoverd by wc->wr_cqe and freed by

nvme-rdma: Fix a use after free in nvmet_rdma_write_data_done

[ Upstream commit abec6561fc4e0fbb19591a0b35676d8c783b5493 ]

In nvmet_rdma_write_data_done, rsp is recoverd by wc->wr_cqe and freed by
nvmet_rdma_release_rsp(). But after that, pr_info() used the freed
chunk's member object and could leak the freed chunk address with
wc->wr_cqe by computing the offset.

Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>

show more ...


# a60c7aaa 10-Jan-2021 Israel Rukshin <israelr@nvidia.com>

nvmet-rdma: Fix NULL deref when setting pi_enable and traddr INADDR_ANY

commit 7a84665619bb5da8c8b6517157875a1fd7632014 upstream.

When setting port traddr to INADDR_ANY, the listening cm_id->device

nvmet-rdma: Fix NULL deref when setting pi_enable and traddr INADDR_ANY

commit 7a84665619bb5da8c8b6517157875a1fd7632014 upstream.

When setting port traddr to INADDR_ANY, the listening cm_id->device
is NULL. The associate IB device is known only when a connect request
event arrives, so checking T10-PI device capability should be done
at this stage.

Fixes: b09160c3996c ("nvmet-rdma: add metadata/T10-PI support")
Signed-off-by: Israel Rukshin <israelr@nvidia.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

show more ...


# 8e57baf3 05-Jan-2021 Israel Rukshin <israelr@nvidia.com>

nvmet-rdma: Fix list_del corruption on queue establishment failure

[ Upstream commit 9ceb7863537748c67fa43ac4f2f565819bbd36e4 ]

When a queue is in NVMET_RDMA_Q_CONNECTING state, it may has some
req

nvmet-rdma: Fix list_del corruption on queue establishment failure

[ Upstream commit 9ceb7863537748c67fa43ac4f2f565819bbd36e4 ]

When a queue is in NVMET_RDMA_Q_CONNECTING state, it may has some
requests at rsp_wait_list. In case a disconnect occurs at this
state, no one will empty this list and will return the requests to
free_rsps list. Normally nvmet_rdma_queue_established() free those
requests after moving the queue to NVMET_RDMA_Q_LIVE state, but in
this case __nvmet_rdma_queue_disconnect() is called before. The
crash happens at nvmet_rdma_free_rsps() when calling
list_del(&rsp->free_list), because the request exists only at
the wait list. To fix the issue, simply clear rsp_wait_list when
destroying the queue.

Signed-off-by: Israel Rukshin <israelr@nvidia.com>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>

show more ...


# 404fa093 06-Nov-2020 Christoph Hellwig <hch@lst.de>

RDMA/core: remove use of dma_virt_ops

[ Upstream commit 5a7a9e038b032137ae9c45d5429f18a2ffdf7d42 ]

Use the ib_dma_* helpers to skip the DMA translation instead. This
removes the last user if dma_v

RDMA/core: remove use of dma_virt_ops

[ Upstream commit 5a7a9e038b032137ae9c45d5429f18a2ffdf7d42 ]

Use the ib_dma_* helpers to skip the DMA translation instead. This
removes the last user if dma_virt_ops and keeps the weird layering
violation inside the RDMA core instead of burderning the DMA mapping
subsystems with it. This also means the software RDMA drivers now don't
have to mess with DMA parameters that are not relevant to them at all, and
that in the future we can use PCI P2P transfers even for software RDMA, as
there is no first fake layer of DMA mapping that the P2P DMA support.

Link: https://lore.kernel.org/r/20201106181941.1878556-8-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

show more ...


Revision tags: v5.8.17, v5.8.16, v5.8.15, v5.9, v5.8.14, v5.8.13, v5.8.12, v5.8.11, v5.8.10, v5.8.9, v5.8.8, v5.8.7, v5.8.6, v5.4.62, v5.8.5, v5.8.4, v5.4.61
# df561f66 23-Aug-2020 Gustavo A. R. Silva <gustavoars@kernel.org>

treewide: Use fallthrough pseudo-keyword

Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through mar

treewide: Use fallthrough pseudo-keyword

Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>

show more ...


Revision tags: v5.8.3, v5.4.60, v5.8.2, v5.4.59, v5.8.1, v5.4.58, v5.4.57, v5.4.56, v5.8, v5.7.12, v5.4.55, v5.7.11, v5.4.54, v5.7.10, v5.4.53, v5.4.52, v5.7.9
# ca0f1a80 13-Jul-2020 Yamin Friedman <yaminf@mellanox.com>

nvmet-rdma: use new shared CQ mechanism

Has the driver use shared CQs providing ~10%-20% improvement when
multiple disks are used. Instead of opening a CQ for each QP per
controller, a CQ for each c

nvmet-rdma: use new shared CQ mechanism

Has the driver use shared CQs providing ~10%-20% improvement when
multiple disks are used. Instead of opening a CQ for each QP per
controller, a CQ for each core will be provided by the RDMA core driver
that will be shared between the QPs on that core reducing interrupt
overhead.

Signed-off-by: Yamin Friedman <yaminf@mellanox.com>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

show more ...


Revision tags: v5.7.8, v5.4.51, v5.4.50, v5.7.7, v5.4.49, v5.7.6, v5.7.5, v5.4.48, v5.7.4, v5.7.3, v5.4.47, v5.4.46, v5.7.2, v5.4.45, v5.7.1, v5.4.44
# 6fa350f7 02-Jun-2020 Max Gurtovoy <maxg@mellanox.com>

nvmet: introduce flags member in nvmet_fabrics_ops

Replace has_keyed_sgls and metadata_support booleans with a flags member
that will be used for adding more features in the future.

Signed-off-by:

nvmet: introduce flags member in nvmet_fabrics_ops

Replace has_keyed_sgls and metadata_support booleans with a flags member
that will be used for adding more features in the future.

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Israel Rukshin <israelr@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

show more ...


Revision tags: v5.7, v5.4.43
# 8094ba0a 26-May-2020 Leon Romanovsky <leonro@mellanox.com>

RDMA/cma: Provide ECE reject reason

IBTA declares "vendor option not supported" reject reason in REJ messages
if passive side doesn't want to accept proposed ECE options.

Due to the fact that ECE i

RDMA/cma: Provide ECE reject reason

IBTA declares "vendor option not supported" reject reason in REJ messages
if passive side doesn't want to accept proposed ECE options.

Due to the fact that ECE is managed by userspace, there is a need to let
users to provide such rejected reason.

Link: https://lore.kernel.org/r/20200526103304.196371-7-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

show more ...


1234567