#
a4f40484 |
| 17-Oct-2019 |
Kevin Hao <haokexin@gmail.com> |
nvme-pci: Set the prp2 correctly when using more than 4k page
In the current code, nvme is using a fixed 4k PRP entry size, but if the kernel uses a page size larger than 4k, we should consider the situation that the bv_offset may be larger than the dev->ctrl.page_size. Otherwise we may miss setting the prp2, and the command then can't be executed correctly.
Fixes: dff824b2aadb ("nvme-pci: optimize mapping of small single segment requests") Cc: stable@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
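The arithmetic behind this fix can be sketched in user-space C. This is a minimal illustration, not the driver's code: segment_needs_prp2() and its parameters are hypothetical names, and the only idea taken from the commit message is that bv_offset may exceed the controller page size, so it has to be reduced to an offset within the controller page before the length covered by the first PRP entry is computed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (illustration only): decide whether a single
 * segment spills past the first controller page and so needs prp2.
 */
static bool segment_needs_prp2(uint32_t bv_offset, uint32_t bv_len,
			       uint32_t ctrl_page_size)
{
	/* Assumption: the controller page size is a power of two. */
	assert(ctrl_page_size != 0 &&
	       (ctrl_page_size & (ctrl_page_size - 1)) == 0);

	/* Offset within the controller page, not within the kernel page. */
	uint32_t offset = bv_offset & (ctrl_page_size - 1);

	/* Bytes that fit between that offset and the end of the page. */
	uint32_t first_prp_len = ctrl_page_size - offset;

	return bv_len > first_prp_len;
}
```

For example, with a 64k kernel page and a 4k controller page, a 4096-byte segment at bv_offset 6144 starts 2048 bytes into a controller page, so it crosses into a second controller page and needs prp2. Subtracting the raw bv_offset from the controller page size would instead give a negative length.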
|
#
c1ac9a4b |
| 04-Sep-2019 |
Keith Busch <kbusch@kernel.org> |
nvme: Wait for reset state when required
Prevent simultaneous controller disabling/enabling tasks from interfering with each other through a function to wait until the task successfully transitioned the controller to the RESETTING state. This ensures disabling the controller will not be interrupted by another reset path, otherwise a concurrent reset may leave the controller in the wrong state.
Tested-by: Edmund Nadolski <edmund.nadolski@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
|
#
5d02a5c1 |
| 03-Sep-2019 |
Keith Busch <kbusch@kernel.org> |
nvme: Remove ADMIN_ONLY state
The admin only state was intended to fence off actions that don't apply to a non-IO capable controller. The only actual user of this is the scan_work, and pci was the only transport to ever set this state. The consequence of having this state is placing an additional burden on every other action that applies to both live and admin only controllers.
Remove the admin only state and place the admin only burden on the only place that actually cares: scan_work.
This also prepares to make it easier to temporarily pause a LIVE state so that we don't need to remember which state the controller had been in prior to the pause.
Tested-by: Edmund Nadolski <edmund.nadolski@intel.com> Reviewed-by: James Smart <james.smart@broadcom.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
|
#
770597ec |
| 05-Sep-2019 |
Keith Busch <kbusch@kernel.org> |
nvme-pci: Free tagset if no IO queues
If a controller becomes degraded after a reset, we will not be able to perform any IO. We currently teardown previously created request queues and namespaces, but we had kept the unusable tagset. Free it after all queues using it have been released.
Tested-by: Edmund Nadolski <edmund.nadolski@intel.com> Reviewed-by: James Smart <james.smart@broadcom.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
|
#
3a8ecc93 |
| 03-Oct-2019 |
Ard Biesheuvel <ard.biesheuvel@linaro.org> |
nvme: retain split access workaround for capability reads
Commit 7fd8930f26be4
"nvme: add a common helper to read Identify Controller data"
has re-introduced an issue that we have attempted to work around in the past, in commit a310acd7a7ea ("NVMe: use split lo_hi_{read,write}q").
The problem is that some PCIe NVMe controllers do not implement 64-bit outbound accesses correctly, which is why the commit above switched to using lo_hi_[read|write]q for all 64-bit BAR accesses occurring in the code.
In the meantime, the NVMe subsystem has been refactored, and now calls into the PCIe support layer for NVMe via a .reg_read64() method, which fails to use lo_hi_readq(), and thus reintroduces the problem that the workaround above aimed to address.
Given that, at the moment, .reg_read64() is only used to read the capability register [which is known to tolerate split reads], let's switch .reg_read64() to lo_hi_readq() as well.
This fixes a boot issue on some ARM boxes with NVMe behind a Synopsys DesignWare PCIe host controller.
Fixes: 7fd8930f26be4 ("nvme: add a common helper to read Identify Controller data") Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
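The split-access idea can be illustrated with a small user-space sketch. It is not the kernel's lo_hi_readq() implementation: mock_readl() and split_lo_hi_read64() are hypothetical names, and plain memory stands in for the memory-mapped register. It only shows a 64-bit value being assembled from two 32-bit reads, low half first, on the assumption that the low dword sits at the lower address.

```c
#include <stdint.h>

/*
 * Stand-in for a 32-bit MMIO read (illustration only). Kernel code
 * would use readl() on an __iomem pointer instead.
 */
static uint32_t mock_readl(const volatile uint32_t *addr)
{
	return *addr;
}

/* Read a 64-bit register as two 32-bit accesses, low half first. */
static uint64_t split_lo_hi_read64(const volatile uint32_t *reg)
{
	uint32_t low = mock_readl(reg);
	uint32_t high = mock_readl(reg + 1);

	return ((uint64_t)high << 32) | low;
}
```

A split read is not atomic, so it is only safe for registers that tolerate it, which the commit message says is the case for the capability register.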
|
#
f03e42c6 |
| 23-Sep-2019 |
Gabriel Craciunescu <nix.or.die@gmail.com> |
Added QUIRKs for ADATA XPG SX8200 Pro 512GB
Booting with default_ps_max_latency_us >6000 makes the device fail. Also SUBNQN is NULL and gives a warning on each boot/resume.
$ nvme id-ctrl /dev/nvme0 | grep ^subnqn
subnqn : (null)
I use this device with an Acer Nitro 5 (AN515-43-R8BF) laptop. To be sure it is not only a laptop issue, I tested the device on my server board with the same results (with 2x and 4x links on the board and a 4x link on a PCI-E card).
Signed-off-by: Gabriel Craciunescu <nix.or.die@gmail.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
|
#
7cbb5c6f |
| 18-Sep-2019 |
Mario Limonciello <mario.limonciello@dell.com> |
nvme-pci: Save PCI state before putting drive into deepest state
The action of saving the PCI state will cause numerous PCI configuration space reads which depending upon the vendor implementation may cause the drive to exit the deepest NVMe state.
In these cases ASPM will typically resolve the PCIe link state and APST may resolve the NVMe power state. However, it has also been observed that this register access after the drive has been quiesced will cause PC10 failure on some device combinations.
To resolve this, move the PCI state saving to before SetFeatures has been called. This has been proven to resolve the issue across a 5000 sample test on previously failing disk/system combinations.
Signed-off-by: Mario Limonciello <mario.limonciello@dell.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
|
#
97b3807e |
| 05-Sep-2019 |
Israel Rukshin <israelr@mellanox.com> |
nvme: Remove redundant assignment of cq vector
The cq vector is already assigned with the correct value.
Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
|
#
f2fa006f |
| 28-Aug-2019 |
Israel Rukshin <israelr@mellanox.com> |
nvme-pci: Tidy up nvme_unmap_data
Remove pointless local variable and use rq_dma_dir macro.
Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
|
#
d38e9f04 |
| 07-Aug-2019 |
Benjamin Herrenschmidt <benh@kernel.crashing.org> |
nvme-pci: Support shared tags across queues for Apple 2018 controllers
Another issue with the Apple T2 based 2018 controllers seems to be that they blow up (and shut the machine down) if there's a tag collision between the IO queue and the Admin queue.
My suspicion is that they use our tags for their internal tracking and don't mix them with the queue id. They also seem to not like when tags go beyond the IO queue depth, i.e. 128 tags.
This adds a quirk that marks tags 0..31 of the IO queue reserved.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: Ming Lei <ming.lei@redhat.com> Acked-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
|
#
66341331 |
| 07-Aug-2019 |
Benjamin Herrenschmidt <benh@kernel.crashing.org> |
nvme-pci: Add support for Apple 2018+ models
Based on reverse engineering and original patch by
Paul Pawlowski <paul@mrarm.io>
This adds support for Apple's weird implementation of NVMe in their 2018 or later machines. It accounts for the twice-as-big SQ entries for the IO queues, and the fact that only interrupt vector 0 appears to function properly.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
|
#
c1e0cc7e |
| 07-Aug-2019 |
Benjamin Herrenschmidt <benh@kernel.crashing.org> |
nvme-pci: Add support for variable IO SQ element size
The size of a submission queue element should always be 6 (64 bytes) by spec.
However some controllers such as Apple's are not properly implementing the standard and require a different size.
This provides the groundwork for the subsequent quirks for these controllers.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
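The "6 (64 bytes)" above is a log2 encoding: the entry size in bytes is 1 << 6. A minimal sketch of how a per-queue entry size could feed into the queue's memory footprint follows. The struct and function names are hypothetical and are not taken from the driver. The value 7 (128 bytes) is inferred from the "twice-as-big SQ entries" wording in the Apple 2018+ commit in this log.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical queue description, for illustration only. */
struct sketch_queue {
	uint16_t q_depth;	/* number of entries in the queue */
	uint8_t sqes;		/* log2 of the submission entry size */
};

/* Bytes needed for the submission queue: depth * (1 << sqes). */
static size_t sketch_sq_size(const struct sketch_queue *q)
{
	return (size_t)q->q_depth << q->sqes;
}
```

With a depth of 1024, the spec-conforming sqes of 6 gives 65536 bytes, while an sqes of 7 doubles that to 131072 bytes.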
|
#
8a1d09a6 |
| 07-Aug-2019 |
Benjamin Herrenschmidt <benh@kernel.crashing.org> |
nvme-pci: Pass the queue to SQ_SIZE/CQ_SIZE macros
This will make it easier to handle variable queue entry sizes later. No functional change.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
|
#
b5b05048 |
| 22-Jul-2019 |
Sagi Grimberg <sagi@grimberg.me> |
nvme: don't pass cap to nvme_disable_ctrl
All seem to call it with ctrl->cap so no need to pass it at all.
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
|
#
c0f2f45b |
| 22-Jul-2019 |
Sagi Grimberg <sagi@grimberg.me> |
nvme: move sqsize setting to the core
nvme_enable_ctrl reads the cap register right after, so no need to do that locally in the transport driver. Have sqsize setting in nvme_init_identify.
Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
|
#
aa22c8e6 |
| 22-Aug-2019 |
Sagi Grimberg <sagi@grimberg.me> |
nvme-pci: set ctrl sqsize to the device q_depth
Align with what the rest of the transports are doing.
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
|
#
cb32de1b |
| 16-Aug-2019 |
Mario Limonciello <mario.limonciello@dell.com> |
nvme: Add quirk for LiteON CL1 devices running FW 22301111
One of the components in LiteON CL1 device has limitations that can be encountered based upon boundary race conditions using the nvme bus specific suspend to idle flow.
When this situation occurs the drive doesn't resume properly from suspend-to-idle.
LiteON has confirmed this problem and fixed it in the next firmware version. As this firmware is already in the field, avoid running the nvme specific suspend to idle flow.
Fixes: d916b1be94b6 ("nvme-pci: use host managed power state for suspend") Link: http://lists.infradead.org/pipermail/linux-nvme/2019-July/thread.html Signed-off-by: Mario Limonciello <mario.limonciello@dell.com> Signed-off-by: Charles Hyde <charles.hyde@dellteam.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
7f73eac3 |
| 12-Aug-2019 |
Logan Gunthorpe <logang@deltatee.com> |
PCI/P2PDMA: Introduce pci_p2pdma_unmap_sg()
Add pci_p2pdma_unmap_sg() to the two places that call pci_p2pdma_map_sg().
This is a prep patch to introduce correct mappings for p2pdma transactions that go through the root complex.
Link: https://lore.kernel.org/r/20190730163545.4915-10-logang@deltatee.com Link: https://lore.kernel.org/r/20190812173048.9186-10-logang@deltatee.com Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
2b9f4bb2 |
| 12-Aug-2019 |
Logan Gunthorpe <logang@deltatee.com> |
PCI/P2PDMA: Add attrs argument to pci_p2pdma_map_sg()
This is to match the dma_map_sg() API which this function will have to call in a future patch.
Add a pci_p2pdma_map_sg_attrs() function and helper to call it with no attributes just like the dma_map_sg() function.
Link: https://lore.kernel.org/r/20190730163545.4915-9-logang@deltatee.com Link: https://lore.kernel.org/r/20190812173048.9186-9-logang@deltatee.com Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
4eaefe8c |
| 08-Aug-2019 |
Rafael J. Wysocki <rafael.j.wysocki@intel.com> |
nvme-pci: Allow PCI bus-level PM to be used if ASPM is disabled
One of the modifications made by commit d916b1be94b6 ("nvme-pci: use host managed power state for suspend") was adding a pci_save_state() call to nvme_suspend() so as to instruct the PCI bus type to leave devices handled by the nvme driver in D0 during suspend-to-idle. That was done with the assumption that ASPM would transition the device's PCIe link into a low-power state when the device became inactive. However, if ASPM is disabled for the device, its PCIe link will stay in L0 and in that case commit d916b1be94b6 is likely to cause the energy used by the system while suspended to increase.
Namely, if the device in question works in accordance with the PCIe specification, putting it into D3hot causes its PCIe link to go to L1 or L2/L3 Ready, which is lower-power than L0. Since the energy used by the system while suspended depends on the state of its PCIe link (as a general rule, the lower-power the state of the link, the less energy the system will use), putting the device into D3hot during suspend-to-idle should be more energy-efficient than leaving it in D0 with disabled ASPM.
For this reason, avoid leaving NVMe devices with disabled ASPM in D0 during suspend-to-idle. Instead, shut them down entirely and let the PCI bus type put them into D3.
Fixes: d916b1be94b6 ("nvme-pci: use host managed power state for suspend") Link: https://lore.kernel.org/linux-pm/2763495.NmdaWeg79L@kreacher/T/#t Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Keith Busch <keith.busch@intel.com>
|
#
622b8b68 |
| 23-Jul-2019 |
Ming Lei <ming.lei@redhat.com> |
nvme: wait until all completed request's complete fn is called
When aborting in-flight requests to recover the controller, we have to make sure that the queue's complete function is called on each completed request before moving on. Otherwise, for example, the warning of WARN_ON_ONCE(qp->mrs_used > 0) in ib_destroy_qp_user() may be triggered on nvme-rdma.
Fix this issue by using blk_mq_tagset_wait_completed_request.
Cc: Max Gurtovoy <maxg@mellanox.com> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Keith Busch <keith.busch@intel.com> Cc: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
bd46a906 |
| 29-Jul-2019 |
Keith Busch <kbusch@kernel.org> |
nvme-pci: Fix async probe remove race
Ensure the controller is not in the NEW state when nvme_probe() exits. This will always allow a subsequent nvme_remove() to set the state to DELETING, fixing a potential race between the initial asynchronous probe and device removal.
Reported-by: Li Zhong <lizhongfs@gmail.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
|
#
8fe34be1 |
| 22-Jul-2019 |
yangerkun <yangerkun@huawei.com> |
Revert "nvme-pci: don't create a read hctx mapping without read queues"
This reverts commit 0298d5435276e7795b0b939d74827f6e775e7009.
With this patch, setting 'poll_queues > hard queues' will lead to 'nr_read_queues = 0' in nvme_calc_irq_sets. The poll_queues setting can then fail, since dev->tagset.nr_maps equals 2 and nvme_pci_map_queues will not map the poll queues.
Signed-off-by: yangerkun <yangerkun@huawei.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
08b903b5 |
| 15-Jul-2019 |
Misha Nasledov <misha@nasledov.com> |
nvme: ignore subnqn for ADATA SX6000LNP
The ADATA SX6000LNP NVMe SSDs have the same subnqn and, due to this, a system with more than one of these SSDs will only have one usable.
[ 0.942706] nvme nvme1: ignoring ctrl due to duplicate subnqn (nqn.2018-05.com.example:nvme:nvm-subsystem-OUI00E04C).
[ 0.943017] nvme nvme1: Removing after probe failure status: -22
02:00.0 Non-Volatile memory controller [0108]: Realtek Semiconductor Co., Ltd. Device [10ec:5762] (rev 01)
71:00.0 Non-Volatile memory controller [0108]: Realtek Semiconductor Co., Ltd. Device [10ec:5762] (rev 01)
There are no firmware updates available from the vendor, unfortunately. Applying the NVME_QUIRK_IGNORE_DEV_SUBNQN quirk for these SSDs resolves the issue, and they all work after this patch:
/dev/nvme0n1 2J1120050420 ADATA SX6000LNP [...]
/dev/nvme1n1 2J1120050540 ADATA SX6000LNP [...]
Signed-off-by: Misha Nasledov <misha@nasledov.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
7637de31 |
| 03-Jul-2019 |
Christoph Hellwig <hch@lst.de> |
nvme-pci: limit max_hw_sectors based on the DMA max mapping size
When running an NVMe device that is attached to an addressing-challenged PCIe root port that requires bounce buffering, our request sizes can easily overflow the swiotlb bounce buffer size. Limit the maximum I/O size to the limit exposed by the DMA mapping subsystem.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Atish Patra <Atish.Patra@wdc.com> Tested-by: Atish Patra <Atish.Patra@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
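The clamping described here can be sketched as a small user-space function. This is an illustration under stated assumptions, not the driver's code: sketch_limit_max_hw_sectors() and its parameters are hypothetical names, sectors are assumed to be 512 bytes, and the DMA limit is passed in as a plain byte count rather than queried from the DMA mapping subsystem.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SECTOR_SHIFT 9	/* assumption: 512-byte sectors */

/*
 * Hypothetical helper: cap the driver's preferred maximum transfer
 * size (in sectors) at the largest single DMA mapping (in bytes).
 */
static uint32_t sketch_limit_max_hw_sectors(uint32_t driver_max_sectors,
					    size_t dma_max_mapping_bytes)
{
	size_t dma_max_sectors = dma_max_mapping_bytes >> SKETCH_SECTOR_SHIFT;

	if (dma_max_sectors < driver_max_sectors)
		return (uint32_t)dma_max_sectors;
	return driver_max_sectors;
}
```

As an example with made-up numbers, a driver limit of 8192 sectors combined with a 256 KiB mapping limit yields 512 sectors. When the mapping limit is effectively unbounded, the driver limit is returned unchanged.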
|