================================
Coherent Accelerator (CXL) Flash
================================

Introduction
============

    The IBM Power architecture provides support for CAPI (Coherent
    Accelerator Power Interface), which is available to certain PCIe slots
    on Power 8 systems. CAPI can be thought of as a special tunneling
    protocol through PCIe that allows PCIe adapters to look like special
    purpose co-processors which can read or write an application's
    memory and generate page faults. As a result, the host interface to
    an adapter running in CAPI mode does not require the data buffers to
    be mapped to the device's memory (IOMMU bypass) nor does it require
    memory to be pinned.

    On Linux, Coherent Accelerator (CXL) kernel services present CAPI
    devices as PCI devices by implementing a virtual PCI host bridge.
    This abstraction simplifies the infrastructure and programming
    model, allowing for drivers to look similar to other native PCI
    device drivers.

    CXL provides a mechanism by which user space applications can
    directly talk to a device (network or storage) bypassing the typical
    kernel/device driver stack.
    The CXL Flash Adapter Driver gives a
    user space application direct access to Flash storage.

    The CXL Flash Adapter Driver is a kernel module that sits in the
    SCSI stack as a low level device driver (below the SCSI disk and
    protocol drivers) for the IBM CXL Flash Adapter. This driver is
    responsible for the initialization of the adapter, setting up the
    special path for user space access, and performing error recovery. It
    communicates directly with the Flash Accelerator Functional Unit (AFU)
    as described in Documentation/powerpc/cxl.rst.

    The cxlflash driver supports two, mutually exclusive, modes of
    operation at the device (LUN) level:

        - Any flash device (LUN) can be configured to be accessed as a
          regular disk device (e.g., /dev/sdc). This is the default mode.

        - Any flash device (LUN) can be configured to be accessed from
          user space with a special block library. This mode further
          specifies the means of accessing the device and provides for
          either raw access to the entire LUN (referred to as direct
          or physical LUN access) or access to a kernel/AFU-mediated
          partition of the LUN (referred to as virtual LUN access).
          The segmentation of a disk device into virtual LUNs is assisted
          by special translation services provided by the Flash AFU.

Overview
========

    The Coherent Accelerator Interface Architecture (CAIA) introduces a
    concept of a master context. A master typically has special privileges
    granted to it by the kernel or hypervisor allowing it to perform AFU
    wide management and control. The master may or may not be involved
    directly in each user I/O, but at the minimum is involved in the
    initial setup before the user application is allowed to send requests
    directly to the AFU.

    The CXL Flash Adapter Driver establishes a master context with the
    AFU. It uses memory mapped I/O (MMIO) for this control and setup.
    The Adapter Problem Space Memory Map looks like this::

        +-------------------------------+
        |     512 * 64 KB User MMIO     |
        |         (per context)         |
        |        User Accessible        |
        +-------------------------------+
        |    512 * 128 B per context    |
        |   Provisioning and Control    |
        |  Trusted Process accessible   |
        +-------------------------------+
        |         64 KB Global          |
        |  Trusted Process accessible   |
        +-------------------------------+

    This driver configures itself into the SCSI software stack as an
    adapter driver. The driver is the only entity that is considered a
    Trusted Process to program the Provisioning and Control and Global
    areas in the MMIO Space shown above. The master context driver
    discovers all LUNs attached to the CXL Flash adapter and instantiates
    SCSI block devices (/dev/sdb, /dev/sdc etc.) for each unique LUN
    seen from each path.

    Once these SCSI block devices are instantiated, an application
    written to a specification provided by the block library may get
    access to the Flash from user space (without requiring a system call).
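
    As a rough illustration of the memory map above, the sizes of its three
    regions can be worked out as follows. This is a sketch, not driver code:
    the constant and function names are made up for this example, and the
    per-context offset calculation assumes that the user MMIO windows are
    packed back to back from offset 0 and indexed by context id, which this
    document does not state.

```python
KIB = 1024
NUM_CONTEXTS = 512              # number of contexts shown in the memory map
USER_MMIO_PER_CTX = 64 * KIB    # user-accessible MMIO per context
CTRL_PER_CTX = 128              # provisioning and control bytes per context
GLOBAL_SIZE = 64 * KIB          # global area, trusted process only


def region_sizes():
    """Return the size in bytes of each region of the memory map."""
    return {
        "user_mmio": NUM_CONTEXTS * USER_MMIO_PER_CTX,
        "provisioning_control": NUM_CONTEXTS * CTRL_PER_CTX,
        "global": GLOBAL_SIZE,
    }


def user_mmio_offset(context_id):
    """Offset of one context's user MMIO window.

    ASSUMPTION (not from the document): windows are contiguous, start at
    offset 0 and are indexed by context id.
    """
    if not 0 <= context_id < NUM_CONTEXTS:
        raise ValueError("context id out of range")
    return context_id * USER_MMIO_PER_CTX


sizes = region_sizes()
total_bytes = sum(sizes.values())
```

    With these figures the user-accessible region spans 32 MiB, while each of
    the two trusted regions spans 64 KiB.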

    This master context driver also provides a series of ioctls for this
    block library to enable this user space access. The driver supports
    two modes for accessing the block device.

    The first mode is called a virtual mode. In this mode a single SCSI
    block device (/dev/sdb) may be carved up into any number of distinct
    virtual LUNs. The virtual LUNs may be resized as long as the sum of
    the sizes of all the virtual LUNs, along with the metadata associated
    with them, does not exceed the physical capacity.

    The second mode is called the physical mode. In this mode a single
    block device (/dev/sdb) may be opened directly by the block library
    and the entire space for the LUN is available to the application.

    Only the physical mode provides persistence of the data, i.e., the
    data written to the block device will survive application exit and
    restart and also reboot. The virtual LUNs do not persist (i.e., do
    not survive after the application terminates or the system reboots).
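
    The resize constraint for virtual mode can be pictured with a small
    check. This is only a sketch under stated assumptions: the function
    name, the use of blocks as the unit, and the single metadata_blocks
    figure are hypothetical, since this document does not say how the
    metadata is accounted for.

```python
def resize_fits(physical_blocks, vlun_blocks, metadata_blocks, index, new_size):
    """Return True if resizing one virtual LUN keeps the total within capacity.

    physical_blocks -- capacity of the underlying LUN
    vlun_blocks     -- current sizes of all virtual LUNs carved from it
    metadata_blocks -- HYPOTHETICAL overall metadata overhead
    index, new_size -- which virtual LUN to resize and its requested size
    """
    if new_size < 0:
        raise ValueError("size cannot be negative")
    proposed = list(vlun_blocks)    # leave the caller's list untouched
    proposed[index] = new_size
    return sum(proposed) + metadata_blocks <= physical_blocks
```

    For example, with a capacity of 1000 blocks, virtual LUNs of 300 and 400
    blocks and 50 blocks of metadata, the first virtual LUN can grow to 550
    blocks but not to 551.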


Block library API
=================

    Applications intending to get access to the CXL Flash from user
    space should use the block library, as it abstracts the details of
    interfacing directly with the cxlflash driver that are necessary for
    performing administrative actions (i.e., setup, tear down, resize).
    The block library can be thought of as a 'user' of services,
    implemented as IOCTLs, that are provided by the cxlflash driver
    specifically for devices (LUNs) operating in user space access
    mode. While it is not a requirement that applications understand
    the interface between the block library and the cxlflash driver,
    a high-level overview of each supported service (IOCTL) is provided
    below.

    The block library can be found on GitHub:
    http://github.com/open-power/capiflash


CXL Flash Driver LUN IOCTLs
===========================

    Users, such as the block library, that wish to interface with a flash
    device (LUN) via user space access need to use the services provided
    by the cxlflash driver.
    As these services are implemented as ioctls,
    a file descriptor handle must first be obtained in order to establish
    the communication channel between a user and the kernel. This file
    descriptor is obtained by opening the device special file associated
    with the SCSI disk device (/dev/sdb) that was created during LUN
    discovery. Because of the location of the cxlflash driver within the
    SCSI protocol stack, this open is actually not seen by the cxlflash
    driver. Upon successful open, the user receives a file descriptor
    (herein referred to as fd1) that should be used for issuing the
    subsequent ioctls listed below.

    The structure definitions for these IOCTLs are available in:
    uapi/scsi/cxlflash_ioctl.h

DK_CXLFLASH_ATTACH
------------------

    This ioctl obtains, initializes, and starts a context using the CXL
    kernel services. These services specify a context id (u16) by which
    to uniquely identify the context and its allocated resources. The
    services additionally provide a second file descriptor (herein
    referred to as fd2) that is used by the block library to initiate
    memory mapped I/O (via mmap()) to the CXL flash device and poll for
    completion events.
    This file descriptor is intentionally installed by
    this driver and not the CXL kernel services to allow for intermediary
    notification and access in the event of a non-user-initiated close(),
    such as a killed process. This design point is described in further
    detail in the description for the DK_CXLFLASH_DETACH ioctl.

    There are a few important aspects regarding the "tokens" (context id
    and fd2) that are provided back to the user:

        - These tokens are only valid for the process under which they
          were created. The child of a forked process cannot continue
          to use the context id or file descriptor created by its parent
          (see DK_CXLFLASH_VLUN_CLONE for further details).

        - These tokens are only valid for the lifetime of the context and
          the process under which they were created. Once either is
          destroyed, the tokens are to be considered stale and subsequent
          usage will result in errors.

        - A valid adapter file descriptor (fd2 >= 0) is only returned on
          the initial attach for a context. Subsequent attaches to an
          existing context (DK_CXLFLASH_ATTACH_REUSE_CONTEXT flag present)
          do not provide the adapter file descriptor as it was previously
          made known to the application.

        - When a context is no longer needed, the user shall detach from
          the context via the DK_CXLFLASH_DETACH ioctl. When this ioctl
          returns with a valid adapter file descriptor and the return flag
          DK_CXLFLASH_APP_CLOSE_ADAP_FD is present, the application _must_
          close the adapter file descriptor following a successful detach.

        - When this ioctl returns with a valid fd2 and the return flag
          DK_CXLFLASH_APP_CLOSE_ADAP_FD is present, the application _must_
          close fd2 in the following circumstances:

                + Following a successful detach of the last user of the context
                + Following a successful recovery on the context's original fd2
                + In the child process of a fork(), following a clone ioctl,
                  on the fd2 associated with the source context

        - At any time, a close on fd2 will invalidate the tokens. Applications
          should exercise caution to only close fd2 when appropriate (outlined
          in the previous bullet) to avoid premature loss of I/O.
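
    The rules for closing fd2 listed above can be condensed into one
    decision function. This is a minimal sketch, not part of the driver or
    the block library: the Event names are invented for illustration, and
    APP_CLOSE_ADAP_FD is a placeholder bit whose real value must be taken
    from uapi/scsi/cxlflash_ioctl.h.

```python
from enum import Enum

# PLACEHOLDER value for illustration only; the real definition of
# DK_CXLFLASH_APP_CLOSE_ADAP_FD lives in uapi/scsi/cxlflash_ioctl.h.
APP_CLOSE_ADAP_FD = 1 << 0


class Event(Enum):
    """Hypothetical labels for the points in time named in the list above."""
    LAST_DETACH = "successful detach of the last user of the context"
    RECOVERED = "successful recovery on the context's original fd2"
    CHILD_AFTER_CLONE = "child of fork(), after the clone ioctl (source fd2)"
    OTHER = "any other point in the context's lifetime"


def must_close_fd2(attach_return_flags, fd2, event):
    """Return True when the application is required to close fd2."""
    if fd2 < 0:
        return False    # no valid adapter file descriptor was returned
    if not attach_return_flags & APP_CLOSE_ADAP_FD:
        return False    # the flag was not present on attach
    # Closing at any other time would invalidate the tokens prematurely.
    return event is not Event.OTHER
```

    Keeping the decision in one place makes it harder to close fd2 early,
    which would invalidate the tokens and cut off I/O.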

DK_CXLFLASH_USER_DIRECT
-----------------------
    This ioctl is responsible for transitioning the LUN to direct
    (physical) mode access and configuring the AFU for direct access from
    user space on a per-context basis. Additionally, the block size and
    last logical block address (LBA) are returned to the user.

    As mentioned previously, when operating in user space access mode,
    LUNs may be accessed in whole or in part. Only one mode is allowed
    at a time and if one mode is active (outstanding references exist),
    requests to use the LUN in a different mode are denied.

    The AFU is configured for direct access from user space by adding an
    entry to the AFU's resource handle table. The index of the entry is
    treated as a resource handle that is returned to the user. The user
    is then able to use the handle to reference the LUN during I/O.

DK_CXLFLASH_USER_VIRTUAL
------------------------
    This ioctl is responsible for transitioning the LUN to virtual mode
    of access and configuring the AFU for virtual access from user space
    on a per-context basis. Additionally, the block size and last logical
    block address (LBA) are returned to the user.

    As mentioned previously, when operating in user space access mode,
    LUNs may be accessed in whole or in part. Only one mode is allowed
    at a time and if one mode is active (outstanding references exist),
    requests to use the LUN in a different mode are denied.

    The AFU is configured for virtual access from user space by adding
    an entry to the AFU's resource handle table. The index of the entry
    is treated as a resource handle that is returned to the user. The
    user is then able to use the handle to reference the LUN during I/O.

    By default, the virtual LUN is created with a size of 0. The user
    would need to use the DK_CXLFLASH_VLUN_RESIZE ioctl to grow
    the virtual LUN to a desired size. To avoid having to perform this
    resize for the initial creation of the virtual LUN, the user has the
    option of specifying a size as part of the DK_CXLFLASH_USER_VIRTUAL
    ioctl, such that when success is returned to the user, the
    resource handle that is provided is already referencing provisioned
    storage. This is reflected by the last LBA being a non-zero value.

    When a LUN is accessible from more than one port, this ioctl will
    return with the DK_CXLFLASH_ALL_PORTS_ACTIVE return flag set.
    This
    provides the user with a hint that I/O can be retried in the event
    of an I/O error as the LUN can be reached over multiple paths.

DK_CXLFLASH_VLUN_RESIZE
-----------------------
    This ioctl is responsible for resizing a previously created virtual
    LUN and will fail if invoked upon a LUN that is not in virtual
    mode. Upon success, an updated last LBA is returned to the user
    indicating the new size of the virtual LUN associated with the
    resource handle.

    The partitioning of virtual LUNs is jointly mediated by the cxlflash
    driver and the AFU. An allocation table is kept for each LUN that is
    operating in the virtual mode and used to program a LUN translation
    table that the AFU references when provided with a resource handle.

    This ioctl can return -EAGAIN if an AFU sync operation takes too long.
    In addition to returning a failure to the user, cxlflash will also
    schedule an asynchronous AFU reset. Should the user choose to retry the
    operation, it is expected to succeed. If this ioctl fails with -EAGAIN,
    the user can either retry the operation or treat it as a failure.
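
    The retry option described above might look as follows in a user space
    caller. This is a sketch only: retry_on_eagain and its parameters are
    hypothetical, and the operation passed in stands for whatever code
    issues the resize ioctl. In Python a failing fcntl.ioctl() call surfaces
    as an OSError whose errno attribute carries the error code.

```python
import errno
import time


def retry_on_eagain(operation, attempts=3, delay_s=0.0):
    """Call operation() and retry only when it fails with EAGAIN.

    attempts and delay_s are illustrative defaults, not values taken from
    the driver. Any other error, or EAGAIN on the last attempt, is re-raised
    so that the caller can treat it as a failure.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except OSError as exc:
            if exc.errno != errno.EAGAIN or attempt == attempts - 1:
                raise
            time.sleep(delay_s)    # leave time for the scheduled AFU reset
```

    The same wrapper would apply to any other ioctl that is documented as
    able to return -EAGAIN.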

DK_CXLFLASH_RELEASE
-------------------
    This ioctl is responsible for releasing a previously obtained
    reference to either a physical or virtual LUN. This can be
    thought of as the inverse of the DK_CXLFLASH_USER_DIRECT or
    DK_CXLFLASH_USER_VIRTUAL ioctls. Upon success, the resource handle
    is no longer valid and the entry in the resource handle table is
    made available to be used again.

    As part of the release process for virtual LUNs, the virtual LUN
    is first resized to 0 to clear out and free the translation tables
    associated with the virtual LUN reference.

DK_CXLFLASH_DETACH
------------------
    This ioctl is responsible for unregistering a context with the
    cxlflash driver and releasing outstanding resources that were
    not explicitly released via the DK_CXLFLASH_RELEASE ioctl. Upon
    success, all "tokens" which had been provided to the user from the
    DK_CXLFLASH_ATTACH onward are no longer valid.

    When the DK_CXLFLASH_APP_CLOSE_ADAP_FD flag was returned on a successful
    attach, the application _must_ close the fd2 associated with the context
    following the detach of the final user of the context.
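
    The life cycle of resource handles across the user direct/virtual,
    release and detach ioctls can be modelled with a toy table. This is a
    sketch for illustration only: the class, its method names, the table
    size and the first-free allocation policy are assumptions and do not
    describe the AFU's actual resource handle table.

```python
class ResourceHandleTable:
    """Toy model of a per-context resource handle table."""

    def __init__(self, size):
        self._entries = [None] * size    # None marks a free entry

    def allocate(self, lun):
        """Like USER_DIRECT/USER_VIRTUAL: fill an entry, return its index."""
        for handle, entry in enumerate(self._entries):
            if entry is None:
                self._entries[handle] = lun
                return handle    # the index doubles as the resource handle
        raise RuntimeError("resource handle table is full")

    def release(self, handle):
        """Like RELEASE: invalidate the handle and free its entry."""
        if not 0 <= handle < len(self._entries) or self._entries[handle] is None:
            raise KeyError("stale or unknown resource handle")
        self._entries[handle] = None

    def detach(self):
        """Like DETACH: release everything not explicitly released."""
        leftover = [h for h, e in enumerate(self._entries) if e is not None]
        for handle in leftover:
            self.release(handle)
        return leftover
```

    In this model a released entry is handed out again by the next
    allocation, and a detach leaves the table empty regardless of how many
    handles were still outstanding.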

DK_CXLFLASH_VLUN_CLONE
----------------------
    This ioctl is responsible for cloning a previously created
    context to a more recently created context. It exists solely to
    support maintaining user space access to storage after a process
    forks. Upon success, the child process (which invoked the ioctl)
    will have access to the same LUNs via the same resource handle(s)
    as the parent, but under a different context.

    Context sharing across processes is not supported with CXL and
    therefore each fork must be met with establishing a new context
    for the child process. This ioctl simplifies the state management
    and playback required by a user in such a scenario. When a process
    forks, the child process can clone the parent's context by first
    creating a context (via DK_CXLFLASH_ATTACH) and then using this ioctl
    to perform the clone from the parent to the child.

    The clone itself is fairly simple. The resource handle and LUN
    translation tables are copied from the parent context to the child's
    and then synced with the AFU.
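
    Conceptually, the copy-then-sync step resembles the sketch below. The
    dictionary layout, the key names "rht" and "lxt" and the sync callback
    are all hypothetical stand-ins chosen for this example; they do not
    reflect the driver's data structures.

```python
import copy


def clone_context(parent, child, sync):
    """Copy the parent's tables into the child's context, then sync.

    parent, child -- dicts with "rht" (resource handle table) and "lxt"
                     (LUN translation tables) entries; HYPOTHETICAL layout
    sync          -- callable standing in for the sync with the AFU
    """
    # Deep copies keep the two contexts independent after the clone.
    child["rht"] = copy.deepcopy(parent["rht"])
    child["lxt"] = copy.deepcopy(parent["lxt"])
    sync(child)
    return child
```

    After the call the child refers to the same LUNs through the same
    resource handles, while its context identity stays its own.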

    When the DK_CXLFLASH_APP_CLOSE_ADAP_FD flag was returned on a successful
    attach, the application _must_ close the fd2 associated with the source
    context (still resident/accessible in the parent process) following the
    clone. This is to avoid a stale entry in the file descriptor table of the
    child process.

    This ioctl can return -EAGAIN if an AFU sync operation takes too long.
    In addition to returning a failure to the user, cxlflash will also
    schedule an asynchronous AFU reset. Should the user choose to retry the
    operation, it is expected to succeed. If this ioctl fails with -EAGAIN,
    the user can either retry the operation or treat it as a failure.

DK_CXLFLASH_VERIFY
------------------
    This ioctl is used to detect various changes such as the capacity of
    the disk changing, the number of LUNs visible changing, etc. In cases
    where the changes affect the application (such as a LUN resize), the
    cxlflash driver will report the changed state to the application.

    The user calls in when they want to validate that a LUN hasn't been
    changed in response to a check condition.
    As the user is operating out
    of band from the kernel, they will see these types of events without
    the kernel's knowledge. When encountered, the user's architected
    behavior is to call in to this ioctl, indicating what they want to
    verify and passing along any appropriate information. For now, only
    verifying a LUN change (i.e., a size difference) with sense data is
    supported.

DK_CXLFLASH_RECOVER_AFU
-----------------------
    This ioctl is used to drive recovery (if such an action is warranted)
    of a specified user context. Any state associated with the user context
    is re-established upon successful recovery.

    User contexts are put into an error condition when the device needs to
    be reset or is terminating. Users are notified of this error condition
    by seeing all 0xF's on an MMIO read. Upon encountering this, the
    architected behavior for a user is to call into this ioctl to recover
    their context. A user may also call into this ioctl at any time to
    check if the device is operating normally. If a failure is returned
    from this ioctl, the user is expected to gracefully clean up their
    context via release/detach ioctls. Until they do, the context they
    hold is not relinquished.
    The user may also optionally exit the process
    at which time the context/resources they held will be freed as part of
    the release fop.

    When the DK_CXLFLASH_APP_CLOSE_ADAP_FD flag was returned on a successful
    attach, the application _must_ unmap and close the fd2 associated with the
    original context following this ioctl returning success and indicating that
    the context was recovered (DK_CXLFLASH_RECOVER_AFU_CONTEXT_RESET).

DK_CXLFLASH_MANAGE_LUN
----------------------
    This ioctl is used to switch a LUN from a mode where it is available
    for file-system access (legacy), to a mode where it is set aside for
    exclusive user space access (superpipe). In case a LUN is visible
    across multiple ports and adapters, this ioctl is used to uniquely
    identify each LUN by its World Wide Node Name (WWNN).


CXL Flash Driver Host IOCTLs
============================

    Each host adapter instance that is supported by the cxlflash driver
    has a special character device associated with it to enable a set of
    host management functions. These character devices are hosted in a
    class dedicated for cxlflash and can be accessed via `/dev/cxlflash/*`.

    Applications can be written to perform various functions using the
    host ioctl APIs below.

    The structure definitions for these IOCTLs are available in:
    uapi/scsi/cxlflash_ioctl.h

HT_CXLFLASH_LUN_PROVISION
-------------------------
    This ioctl is used to create and delete persistent LUNs on cxlflash
    devices that lack an external LUN management interface. It is only
    valid when used with AFUs that support the LUN provision capability.

    When sufficient space is available, LUNs can be created by specifying
    the target port to host the LUN and a desired size in 4K blocks. Upon
    success, the LUN ID and WWID of the created LUN will be returned and
    the SCSI bus can be scanned to detect the change in LUN topology. Note
    that partial allocations are not supported. Should a creation fail due
    to a space issue, the target port can be queried for its current LUN
    geometry.

    To remove a LUN, the device must first be disassociated from the Linux
    SCSI subsystem. The LUN deletion can then be initiated by specifying a
    target port and LUN ID.
    Upon success, the LUN geometry associated with
    the port will be updated to reflect the new number of provisioned LUNs
    and available capacity.

    To query the LUN geometry of a port, the target port is specified and
    upon success, the following information is presented:

        - Maximum number of provisioned LUNs allowed for the port
        - Current number of provisioned LUNs for the port
        - Maximum total capacity of provisioned LUNs for the port (4K blocks)
        - Current total capacity of provisioned LUNs for the port (4K blocks)

    With this information, the number of available LUNs and capacity can be
    calculated.

HT_CXLFLASH_AFU_DEBUG
---------------------
    This ioctl is used to debug AFUs by supporting a command pass-through
    interface. It is only valid when used with AFUs that support the AFU
    debug capability.

    With the exception of buffer management, AFU debug commands are opaque
    to cxlflash and treated as pass-through. For debug commands that do
    require data transfer, the user supplies an adequately sized data buffer
    and must specify the data transfer direction with respect to the host.
    There is a
    maximum transfer size of 256K imposed. Note that partial read completions
    are not supported - when errors are experienced with a host read data
    transfer, the data buffer is not copied back to the user.
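
    The buffer rules for AFU debug transfers can be summarised in a short
    sketch. It is illustrative only: the function names and the direction
    strings are hypothetical, and "256K" is read here as 256 KiB, which is
    an assumption.

```python
MAX_DEBUG_XFER = 256 * 1024    # ASSUMPTION: "256K" interpreted as 256 KiB

DIRECTIONS = ("host_read", "host_write")    # hypothetical labels


def check_debug_transfer(length, direction):
    """Validate a debug data transfer request before it is issued."""
    if direction not in DIRECTIONS:
        raise ValueError("direction must be given with respect to the host")
    if not 0 < length <= MAX_DEBUG_XFER:
        raise ValueError("transfer size outside the supported range")
    return length


def finish_host_read(staged, error):
    """All-or-nothing copy back for a host read.

    Partial read completions are not supported, so on error the caller
    receives no data instead of a partially filled buffer.
    """
    return None if error else bytes(staged)
```

    A caller would validate the request first and treat a None result from
    a host read as "no data was copied back".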