#
40267efd |
| 16-Sep-2016 |
Simon A. F. Lund <slund@cnexlabs.com> |
lightnvm: expose device geometry through sysfs
For a host to access an Open-Channel SSD, it has to know its geometry, so that it writes and reads at the appropriate device bounds.
Currently, the geometry information is kept within the kernel, and not exported to user-space for consumption. This patch exposes the configuration through sysfs and enables user-space libraries, such as liblightnvm, to use the sysfs implementation to get the geometry of an Open-Channel SSD.
The sysfs entries are stored within the device hierarchy, and can be found using the "lightnvm" device type.
An example configuration looks like this:
/sys/class/nvme/
└── nvme0n1
    ├── capabilities: 3
    ├── device_mode: 1
    ├── erase_max: 1000000
    ├── erase_typ: 1000000
    ├── flash_media_type: 0
    ├── media_capabilities: 0x00000001
    ├── media_type: 0
    ├── multiplane: 0x00010101
    ├── num_blocks: 1022
    ├── num_channels: 1
    ├── num_luns: 4
    ├── num_pages: 64
    ├── num_planes: 1
    ├── page_size: 4096
    ├── prog_max: 100000
    ├── prog_typ: 100000
    ├── read_max: 10000
    ├── read_typ: 10000
    ├── sector_oob_size: 0
    ├── sector_size: 4096
    ├── media_manager: gennvm
    ├── ppa_format: 0x380830082808001010102008
    ├── vendor_opcode: 0
    ├── max_phys_secs: 64
    └── version: 1
Signed-off-by: Simon A. F. Lund <slund@cnexlabs.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
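As a rough illustration of how a user-space consumer might use these entries, the Python sketch below reads every attribute file in a directory and derives a raw capacity from the geometry. The function names are hypothetical, and the capacity formula assumes that num_luns counts LUNs per channel and num_blocks counts blocks per plane; neither assumption comes from the patch or from liblightnvm.

```python
import os


def read_geometry(sysfs_dir):
    """Read each regular file in sysfs_dir into a dict of name -> int or str."""
    geo = {}
    for name in sorted(os.listdir(sysfs_dir)):
        path = os.path.join(sysfs_dir, name)
        if not os.path.isfile(path):
            continue
        with open(path) as f:
            raw = f.read().strip()
        try:
            # Base 0 accepts both decimal ("64") and hex ("0x00010101") text.
            geo[name] = int(raw, 0)
        except ValueError:
            # Non-numeric attributes such as media_manager stay as strings.
            geo[name] = raw
    return geo


def raw_capacity_bytes(geo):
    """Assumed formula: channels * LUNs * planes * blocks * pages * page size."""
    return (geo["num_channels"] * geo["num_luns"] * geo["num_planes"]
            * geo["num_blocks"] * geo["num_pages"] * geo["page_size"])
```

With the example values above (1 channel, 4 LUNs, 1 plane, 1022 blocks, 64 pages of 4096 bytes), this formula yields 1071644672 bytes.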
|
#
b0b4e09c |
| 16-Sep-2016 |
Matias Bjørling <m@bjorling.me> |
lightnvm: control life of nvm_dev in driver
LightNVM compatible device drivers do not have a method to expose LightNVM specific sysfs entries.
To enable LightNVM sysfs entries to be exposed, lightnvm device drivers require a struct device to attach it to. To allow both the actual device driver and lightnvm sysfs entries to coexist, the device driver tracks the lifetime of the nvm_dev structure.
This patch refactors NVMe and null_blk to handle the lifetime of struct nvm_dev, which eliminates the need for struct gendisk when a lightnvm compatible device is provided.
Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
b5af7f2f |
| 14-Sep-2016 |
Christoph Hellwig <hch@lst.de> |
nvme: remove the post_scan callout
No need now that we don't have to reverse engineer the irq affinity.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
f80ec966 |
| 12-Jul-2016 |
Keith Busch <keith.busch@intel.com> |
nvme: Limit command retries
Many controller implementations will return errors to commands that will not succeed, but without the DNR bit set. The driver previously retried these commands an unlimited number of times until the command timeout was exceeded, which takes an unnecessarily long time.
This patch limits the number of retries a command can have, defaulting to 5, but is user tunable at load or runtime.
The struct request's 'retries' field is used to track the number of retries attempted. This is in contrast with scsi's use of this field, which indicates how many retries are allowed.
Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
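The retry policy described above can be sketched as a small predicate. This is a simplified model, not the driver's code: the function name and parameters are hypothetical, and the DNR bit position (0x4000) is an assumption about how the driver encodes the status field.

```python
NVME_SC_DNR = 0x4000  # assumed encoding of the Do Not Retry status bit


def needs_retry(status, retries, elapsed_s, timeout_s, max_retries=5):
    """Return True if a failed command should be resubmitted.

    retries counts attempts already made (not attempts still allowed).
    """
    if status & NVME_SC_DNR:
        return False              # controller said retrying is pointless
    if elapsed_s >= timeout_s:
        return False              # command timeout already exceeded
    return retries < max_retries  # new bound: stop after max_retries attempts
```

Before the patch, only the first two checks would apply, so a command failing without DNR could be retried until its timeout ran out.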
|
#
54adc010 |
| 14-Jun-2016 |
Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> |
nvme/quirk: Add a delay before checking for adapter readiness
When disabling the controller, the specification says the register NVME_REG_CC should be written and then the driver needs to wait for the adapter to be ready, which is checked by reading another register bit (NVME_CSTS_RDY). There is a timeout on this check, so if the timeout is reached the driver gives up and removes the adapter from the system.
After a firmware activation procedure, the PCI_DEVICE(0x1c58, 0x0003) (HGST adapter) ends up being removed if we issue a reset_controller, because the driver keeps checking NVME_REG_CSTS until the timeout is reached. This patch adds a necessary quirk for this adapter, by introducing a delay before nvme_wait_ready(), so the reset procedure is able to complete. This quirk is needed because just increasing the timeout is not enough for this adapter - the driver must wait before it starts reading the NVME_REG_CSTS register on this specific device.
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
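The ordering this quirk enforces (write CC, optionally sleep, only then poll CSTS) can be modeled as follows. Everything here is a hypothetical stand-in for the driver: the flag value, the callback-based register access, and the delay and timeout parameters are assumptions made for the sketch.

```python
import time

QUIRK_DELAY_BEFORE_CHK_RDY = 1 << 0  # hypothetical flag value for this sketch
CSTS_RDY = 1 << 0                    # ready bit in the controller status register


def disable_controller(write_cc_disable, read_csts, quirks, delay_s, timeout_s,
                       sleep=time.sleep, now=time.monotonic):
    """Disable the controller, then wait for the ready bit to clear."""
    write_cc_disable()
    if quirks & QUIRK_DELAY_BEFORE_CHK_RDY:
        # Quirky devices must not have CSTS read too early after the CC write.
        sleep(delay_s)
    deadline = now() + timeout_s
    while read_csts() & CSTS_RDY:
        if now() >= deadline:
            return False  # caller would give up on the adapter
        sleep(0.01)
    return True
```

The sleep and clock functions are injectable so the ordering can be checked without real hardware or real waiting.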
|
#
def61eca |
| 06-Jul-2016 |
Christoph Hellwig <hch@lst.de> |
nvme: add new reconnecting controller state
The nvme fabric (RDMA, FC, etc...) can introduce port, link or node failures that may require a reconnect to re-establish the connection.
Add a new reconnecting state that will initially be used by the RDMA driver.
Reviewed-by: Jay Freyensee <james.p.freyensee@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
038bd4cb |
| 13-Jun-2016 |
Sagi Grimberg <sagi@grimberg.me> |
nvme: add keep-alive support
Periodic keep-alive is a mandatory feature in NVMe over Fabrics, and optional in NVMe 1.2.1 for PCIe. This patch adds periodic keep-alive sent from the host to verify that the controller is still responsive and vice-versa. The keep-alive timeout is user-defined (with keep_alive_tmo connection parameter) and defaults to 5 seconds.
In order to avoid a race condition where the host sends a keep-alive competing with the target side keep-alive timeout expiration, the host adds a grace period of 10 seconds when publishing the keep-alive timeout to the target.
In case a keep-alive failed (or timed out), a transport specific error recovery kicks in.
For now only NVMe over Fabrics is wired up to support keep alive, but we can add PCIe support easily once controllers actually supporting it become available.
Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Steve Wise <swise@chelsio.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
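The grace-period arithmetic can be shown in a few lines. The constants mirror the 5 second default and 10 second grace period stated above; the function name, the millisecond unit of the published value, and the treatment of 0 as "keep-alive disabled" are assumptions of this sketch.

```python
KEEP_ALIVE_DEFAULT_S = 5   # default keep_alive_tmo from the commit message
KEEP_ALIVE_GRACE_S = 10    # grace period the host adds for the target


def advertised_keep_alive_ms(kato_s=KEEP_ALIVE_DEFAULT_S):
    """Timeout published to the target, assumed here to be in milliseconds."""
    if kato_s == 0:
        return 0  # assumption: zero means keep-alive is not used
    return (kato_s + KEEP_ALIVE_GRACE_S) * 1000
```

The host still sends keep-alives on its own shorter interval; only the value the target enforces is padded, so a keep-alive sent just before the host's deadline does not race with the target's expiry.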
|
#
07bfcd09 |
| 13-Jun-2016 |
Christoph Hellwig <hch@lst.de> |
nvme-fabrics: add a generic NVMe over Fabrics library
The NVMe over Fabrics library provides an interface for both transports and the nvme core to handle fabrics specific commands and attributes independent of the underlying transport.
In addition, the fabrics library adds a misc device interface that allows actually creating a fabrics controller, as we can't just autodiscover it like in the PCI case. The nvme-cli utility has been enhanced to use this interface to support fabric connect and discovery.
Signed-off-by: Armen Baloyan <armenx.baloyan@intel.com>, Signed-off-by: Jay Freyensee <james.p.freyensee@intel.com>, Signed-off-by: Ming Lin <ming.l@ssi.samsung.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
1a353d85 |
| 13-Jun-2016 |
Ming Lin <ming.l@ssi.samsung.com> |
nvme: add fabrics sysfs attributes
- delete_controller: This attribute allows deleting a controller. A driver is not obligated to support it (pci doesn't), so it is created only if the driver supports it. The new fabrics drivers will support it (essentially a disconnect operation).
  Usage:
    echo > /sys/class/nvme/nvme0/delete_controller
- subsysnqn: This attribute shows the subsystem nqn of the configured device. If a driver does not implement the get_subsysnqn method, the file will not appear in sysfs.
- transport: This attribute shows the transport name. Added a "name" field to struct nvme_ctrl_ops.
  For loop:
    cat /sys/class/nvme/nvme0/transport
    loop
  For RDMA:
    cat /sys/class/nvme/nvme0/transport
    rdma
  For PCIe:
    cat /sys/class/nvme/nvme0/transport
    pcie
- address: This attribute shows the controller address. The fabrics drivers that implement get_address can show the address of the connected controller.
  Example:
    cat /sys/class/nvme/nvme0/address
    traddr=192.168.2.2,trsvcid=1023
Signed-off-by: Ming Lin <ming.l@ssi.samsung.com> Reviewed-by: Jay Freyensee <james.p.freyensee@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
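A script consuming the address attribute would need to split its comma-separated key=value pairs. The helper below is a minimal sketch with a hypothetical name; it assumes the format shown in the example and does not handle values that themselves contain commas.

```python
def parse_ctrl_address(text):
    """Split 'traddr=...,trsvcid=...' into a dict of string fields."""
    fields = {}
    for item in text.strip().split(","):
        if not item:
            continue
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError("expected key=value, got %r" % item)
        fields[key] = value
    return fields
```

For the example output above, this returns a dict with traddr set to "192.168.2.2" and trsvcid set to "1023".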
|
#
eb71f435 |
| 13-Jun-2016 |
Christoph Hellwig <hch@lst.de> |
nvme: Modify and export sync command submission for fabrics
NVMe over fabrics will use __nvme_submit_sync_cmd in the transport and requires a few tweaks to it. For that we export it and add a few more parameters:
1. allow passing a queue ID to the block layer
For the NVMe over Fabrics connect command we need to be able to specify a queue ID that we want to send the command on. Add a qid parameter to the relevant functions to enable this behavior.
2. allow submitting at_head commands
In cases where we want to (re)connect to a controller where we have inflight queued commands we want to first connect and only then allow the other queued commands to be kicked. This will prevent failures in controller resets and reconnects.
3. allow passing flags to blk_mq_allocate_request
Both for Fabrics connect and the keep-alive feature in NVMe 1.2.1 we want to be able to use reserved requests.
Reviewed-by: Jay Freyensee <james.p.freyensee@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Tested-by: Ming Lin <ming.l@ssi.samsung.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
c55a2fd4 |
| 18-May-2016 |
Ming Lin <ming.l@samsung.com> |
nvme: move nvme_cancel_request() to common code
So it can be used by fabrics driver also.
Signed-off-by: Ming Lin <ming.l@samsung.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Keith Busch <keith.bsuch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
c2df40df |
| 05-Jun-2016 |
Mike Christie <mchristi@redhat.com> |
drivers: use req op accessor
The req operation REQ_OP is separated from the rq_flag_bits definition. This converts the block layer drivers to use req_op to get the op from the request struct.
Signed-off-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
0ff9d4e1 |
| 12-May-2016 |
Keith Busch <keith.busch@intel.com> |
NVMe: Short-cut removal on surprise hot-unplug
This patch adds a new state that when set has the core automatically kill request queues prior to removing namespaces.
If the PCI device is not present at the time the nvme driver's remove is called, we can kill all IO queues immediately instead of waiting for the watchdog thread to do that at its polling interval. This improves scenarios where multiple hot plug events occur at the same time since it doesn't block the pci enumeration for as long.
Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
6904242d |
| 25-Apr-2016 |
Ming Lin <ming.l@ssi.samsung.com> |
nvme: add helper nvme_cleanup_cmd()
This hides command cleanup into nvme.h and fabrics drivers will also use it.
Signed-off-by: Ming Lin <ming.l@ssi.samsung.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
f866fc42 |
| 26-Apr-2016 |
Christoph Hellwig <hch@lst.de> |
nvme: move AER handling to common code
The transport driver still needs to do the actual submission, but all the higher level code can be shared.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
5955be21 |
| 26-Apr-2016 |
Christoph Hellwig <hch@lst.de> |
nvme: move namespace scanning to core
Move the scan work item and surrounding code to the common code. For now we need a new finish_scan method to allow the PCI driver to set the irq affinity hints, but I have plans in the works to obsolete this as well.
Note that this moves the namespace scanning from nvme_wq to the system workqueue, but as we don't rely on namespace scanning to finish from reset or I/O this should be fine.
Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by Jon Derrick: <jonathan.derrick@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
bb8d261e |
| 26-Apr-2016 |
Christoph Hellwig <hch@lst.de> |
nvme: introduce a controller state machine
Replace the adhoc flags in the PCI driver with a state machine in the core code. Based on code from Sagi Grimberg for the Fabrics driver.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Acked-by Jon Derrick: <jonathan.derrick@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
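To illustrate the difference between ad hoc flags and a state machine, here is a minimal table-driven sketch. The state names and the set of allowed transitions are assumptions chosen for illustration; they are not copied from the patch.

```python
from enum import Enum


class CtrlState(Enum):
    NEW = "new"
    LIVE = "live"
    RESETTING = "resetting"
    DELETING = "deleting"


# Hypothetical table: target state -> states it may be entered from.
ALLOWED_FROM = {
    CtrlState.LIVE: {CtrlState.NEW, CtrlState.RESETTING},
    CtrlState.RESETTING: {CtrlState.NEW, CtrlState.LIVE},
    CtrlState.DELETING: {CtrlState.LIVE, CtrlState.RESETTING},
}


class Controller:
    def __init__(self):
        self.state = CtrlState.NEW

    def change_state(self, new_state):
        """Apply the transition if the table allows it; report success."""
        if self.state in ALLOWED_FROM.get(new_state, set()):
            self.state = new_state
            return True
        return False
```

Callers test the return value instead of checking several independent flags, so an invalid sequence such as going back to LIVE after DELETING is rejected in one place.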
|
#
04a934d4 |
| 26-Apr-2016 |
Christoph Hellwig <hch@lst.de> |
nvme: remove the io_incapable method
It's unused since "NVMe: Move error handling to failed reset handler".
Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Jon Derrick <jonathan.derrick@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
76e3914a |
| 16-Apr-2016 |
Christoph Hellwig <hch@lst.de> |
nvme: fix cntlid type
Controller IDs in NVMe are unsigned 16-bit types. In the Fabrics driver we actually pass ctrl->id by reference, so we need it to have the correct type.
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
8093f7ca |
| 12-Apr-2016 |
Ming Lin <mlin@kernel.org> |
nvme: add helper nvme_setup_cmd()
This moves nvme_setup_{flush,discard,rw} calls into a common nvme_setup_cmd() helper. So we can eventually hide all the command setup in the core module and don't even need to update the fabrics drivers for any specific command type.
Signed-off-by: Ming Lin <ming.l@ssi.samsung.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
58b45602 |
| 22-Mar-2016 |
Ming Lin <ming.l@ssi.samsung.com> |
nvme: add helper nvme_map_len()
The helper returns the number of bytes that need to be mapped using PRPs/SGL entries.
Signed-off-by: Ming Lin <ming.l@ssi.samsung.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
118472ab |
| 18-Feb-2016 |
Keith Busch <keith.busch@intel.com> |
NVMe: Expose ns wwid through single sysfs entry
The method to uniquely identify a namespace depends on the controller's specification revision level and implemented capabilities. This patch has the driver figure this out and exports the unique string through a single 'wwid' attribute so the user doesn't have this burden.
The longest namespace unique identifier is used if available. If not available, the driver will concatenate the controller's vendor, serial, and model with the namespace ID. The specification provides this as a unique identifier.
Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Jens Axboe <axboe@fb.com>
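The selection logic (prefer the longest identifier, else fall back to a concatenation) can be sketched as below. The parameter names and the output string formats are illustrative assumptions; the exact text the driver emits in the wwid attribute may differ.

```python
def namespace_wwid(nguid, eui64, vendor_id, serial, model, nsid):
    """Pick the longest available unique identifier for a namespace.

    nguid is 16 bytes and eui64 is 8 bytes; all-zero means "not provided".
    """
    if any(nguid):
        return "eui." + nguid.hex()
    if any(eui64):
        return "eui." + eui64.hex()
    # Fallback: vendor ID, serial, model and namespace ID joined together.
    return "nvme.%04x-%s-%s-%08x" % (
        vendor_id, serial.encode().hex(), model.encode().hex(), nsid)
```

Because the choice is made in one function, a consumer reads a single string and never needs to know which identifier fields the controller actually implements.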
|
#
08095e70 |
| 04-Mar-2016 |
Keith Busch <keith.busch@intel.com> |
NVMe: Create discard zero quirk white list
The NVMe specification does not require discarded blocks to return zeroes on read, but provides that behavior as a possibility. Some applications can use an SSD more efficiently if reads on discarded blocks are deterministically zero, based on the "discard_zeroes_data" queue attribute.
There is no specification defined way to determine device behavior on discarded blocks, so the driver always left the queue setting disabled. We can only know behavior based on individual device models, so this patch adds a flag to the NVMe "quirk" list that vendors may set if they know their controller works that way. The patch also sets the new flag for one such known device.
Signed-off-by: Keith Busch <keith.busch@intel.com> Suggested-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
69d9a99c |
| 24-Feb-2016 |
Keith Busch <keith.busch@intel.com> |
NVMe: Move error handling to failed reset handler
This moves failed queue handling out of the namespace removal path and into the reset failure path, fixing a hanging condition if the controller fails or the link goes down during del_gendisk. Previously the driver had to see the controller as degraded prior to calling del_gendisk to set up the queues to fail. But, if the controller happened to fail after this, there was no task to end outstanding requests.
On failure, all namespace states are set to dead. This causes the capacity to be revalidated to 0, and ends all new requests with error status.
Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
646017a6 |
| 24-Feb-2016 |
Keith Busch <keith.busch@intel.com> |
NVMe: Fix namespace removal deadlock
This patch makes nvme namespace removal lockless. It is up to the caller to ensure no active namespace scanning is occurring. To ensure no scan work occurs, the nvme pci driver adds a removing state to the controller device to avoid queueing scan work during removal. The work is flushed after setting the state, so no new scan work can be queued.
The lockless removal allows the driver to clean up a namespace request_queue if the controller fails during removal. Previously this could deadlock trying to acquire the namespace mutex in order to handle such events.
Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
|