.. SPDX-License-Identifier: GPL-2.0

Introduction of Uacce
---------------------

Uacce (Unified/User-space-access-intended Accelerator Framework) aims to
provide Shared Virtual Addressing (SVA) between accelerators and processes,
so that an accelerator can access any data structure of the main CPU.
This differs from the data sharing between a CPU and an I/O device, which
share only data content rather than addresses.
Because of the unified address space, the hardware and the user space of a
process can share the same virtual addresses in their communication.
Uacce treats the hardware accelerator as a heterogeneous processor, while
the IOMMU shares the CPU page tables and, as a result, the same translation
from va to pa.

::

         __________________________       __________________________
        |                          |     |                          |
        |  User application (CPU)  |     |   Hardware Accelerator   |
        |__________________________|     |__________________________|

                     |                                 |
                     | va                              | va
                     V                                 V
                 __________                        __________
                |          |                      |          |
                |   MMU    |                      |  IOMMU   |
                |__________|                      |__________|
                     |                                 |
                     |                                 |
                     V pa                              V pa
                 _______________________________________
                |                                       |
                |              Memory                   |
                |_______________________________________|


Architecture
------------

Uacce is the kernel module, taking charge of iommu and address sharing.
The user drivers and libraries are called WarpDrive.

The uacce device, built around the IOMMU SVA API, can access multiple
address spaces, including the one without PASID.

A virtual concept, queue, is used for the communication. It provides a
FIFO-like interface. And it maintains a unified address space between the
application and all involved hardware.

::

                             ___________________                  ________________
                            |                   |   user API     |                |
                            | WarpDrive library | ------------>  |  user driver   |
                            |___________________|                |________________|
                                     |                                    |
                                     |                                    |
                                     | queue fd                           |
                                     |                                    |
                                     |                                    |
                                     v                                    |
     ___________________         _________                                |
    |                   |       |         |                               | mmap memory
    | Other framework   |       |  uacce  |                               | r/w interface
    | crypto/nic/others |       |_________|                               |
    |___________________|                                                 |
             |                       |                                    |
             | register              | register                           |
             |                       |                                    |
             |                       |                                    |
             |                _________________       __________          |
             |               |                 |     |          |         |
              -------------  |  Device Driver  |     |  IOMMU   |         |
                             |_________________|     |__________|         |
                                     |                                    |
                                     |                                    V
                                     |                            ___________________
                                     |                           |                   |
                                     --------------------------  |  Device(Hardware) |
                                                                 |___________________|


How does it work
----------------

Uacce uses mmap and IOMMU to play the trick.

Uacce creates a chrdev for every device registered to it. A new queue is
created when a user application opens the chrdev. The file descriptor is used
as the user handle of the queue.
The accelerator device presents itself as a Uacce object, which is exported
as a chrdev to the user space. The user application communicates with the
hardware by ioctl (as control path) or shared memory (as data path).

The control path to the hardware is via file operation, while the data path
is via the mmap space of the queue fd.

The queue file address space:

::

    /**
     * enum uacce_qfrt: qfrt type
     * @UACCE_QFRT_MMIO: device mmio region
     * @UACCE_QFRT_DUS: device user share region
     */
    enum uacce_qfrt {
            UACCE_QFRT_MMIO = 0,
            UACCE_QFRT_DUS = 1,
    };

All regions are optional and differ from device type to type.
Each region can be mmapped only once; a further attempt returns -EEXIST.

The device mmio region is mapped to the hardware mmio space. It is generally
used for doorbell or other notification to the hardware. It is not fast enough
to serve as a data channel.

The device user share region is used to share data buffers between the user
process and the device.


The Uacce register API
----------------------

The register API is defined in uacce.h.

::

    struct uacce_interface {
            char name[UACCE_MAX_NAME_SIZE];
            unsigned int flags;
            const struct uacce_ops *ops;
    };

According to the IOMMU capability, uacce_interface flags can be:

::

    /**
     * UACCE Device flags:
     * UACCE_DEV_SVA: Shared Virtual Addresses
     *                Support PASID
     *                Support device page faults (PCI PRI or SMMU Stall)
     */
    #define UACCE_DEV_SVA               BIT(0)

    struct uacce_device *uacce_alloc(struct device *parent,
                                     struct uacce_interface *interface);
    int uacce_register(struct uacce_device *uacce);
    void uacce_remove(struct uacce_device *uacce);

uacce_alloc results can be:

a. If the uacce module is not compiled, ERR_PTR(-ENODEV)

b. Success with the desired flags

c. Success with the negotiated flags, for example

   uacce_interface.flags = UACCE_DEV_SVA but UACCE_DEV_SVA is cleared in
   uacce->flags

   So the driver using this API needs to check the return value as well as
   the negotiated uacce->flags.
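
The three outcomes listed above can be told apart with two checks: first the
error pointer, then the SVA bit in the returned flags. The following is only a
user-space sketch of that pattern, not kernel code. ``stub_uacce_alloc``, the
two ``stub_*`` switches, ``example_setup`` and the reduced structs are
hypothetical stand-ins; the stub takes a simplified argument list and merely
imitates the pointer-returning ``uacce_alloc`` prototype shown above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* User-space stand-ins for kernel helpers; NOT the kernel definitions. */
#define BIT(n)		(1U << (n))
#define UACCE_DEV_SVA	BIT(0)
#define MAX_ERRNO	4095

static inline void *ERR_PTR(long err) { return (void *)err; }
static inline long PTR_ERR(const void *p) { return (long)p; }
static inline bool IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Reduced to the single field this sketch needs. */
struct uacce_interface { unsigned int flags; };
struct uacce_device { unsigned int flags; };

/* Hypothetical switches that drive the stub below. */
static bool stub_uacce_built = true;
static bool stub_iommu_has_sva = true;

/* ASSUMPTION: simplified stub; the real function also takes the parent device. */
static struct uacce_device *stub_uacce_alloc(const struct uacce_interface *interface)
{
	static struct uacce_device dev;

	assert(interface != NULL);
	if (!stub_uacce_built)
		return ERR_PTR(-ENODEV);		/* outcome a */
	dev.flags = interface->flags;
	if (!stub_iommu_has_sva)
		dev.flags &= ~UACCE_DEV_SVA;		/* outcome c */
	return &dev;
}

/* Returns a negative errno, 0 when SVA was negotiated away, 1 when granted. */
static int example_setup(void)
{
	struct uacce_interface interface = { .flags = UACCE_DEV_SVA };
	struct uacce_device *uacce = stub_uacce_alloc(&interface);

	if (IS_ERR(uacce))
		return (int)PTR_ERR(uacce);
	if (!(uacce->flags & UACCE_DEV_SVA))
		return 0;	/* caller must fall back to a non-SVA path */
	return 1;
}
```

A driver following this pattern would decide between its SVA and non-SVA
paths from the flags it got back, before going on to ``uacce_register``.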


The user driver
---------------

The queue file mmap space will need a user driver to wrap the communication
protocol. Uacce provides some attributes in sysfs for the user driver to
match the right accelerator accordingly.
More details in Documentation/ABI/testing/sysfs-driver-uacce.
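
As an illustration of what such a user driver might do with the queue fd, the
sketch below maps one region of the queue file address space. It rests on an
assumption that this document does not state: that the region is selected by
the mmap page offset, so the byte offset is the ``enum uacce_qfrt`` value
multiplied by the page size. ``qfrt_to_offset`` and ``map_queue_region`` are
hypothetical helper names, and the region length is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Mirrors the enum shown earlier in this document. */
enum uacce_qfrt {
	UACCE_QFRT_MMIO = 0,
	UACCE_QFRT_DUS = 1,
};

/*
 * ASSUMPTION: the region type is passed as the mmap page offset, so the
 * byte offset handed to mmap() is type * page_size.
 */
static off_t qfrt_to_offset(enum uacce_qfrt type, long page_size)
{
	assert(page_size > 0);
	return (off_t)type * page_size;
}

/*
 * Map one region of an already opened queue fd.
 * Returns MAP_FAILED on error, exactly as mmap() does.
 */
static void *map_queue_region(int queue_fd, enum uacce_qfrt type, size_t len)
{
	return mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, queue_fd,
		    qfrt_to_offset(type, sysconf(_SC_PAGESIZE)));
}
```

A user driver would open the chrdev of the matched accelerator, call
``map_queue_region`` once per region it needs, and treat ``MAP_FAILED`` as an
error; since each region can be mmapped only once, a repeated call for the
same region on the same fd is expected to fail with ``EEXIST``.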