=================================================
FPGA Device Feature List (DFL) Framework Overview
=================================================

Authors:

- Enno Luebbers <enno.luebbers@intel.com>
- Xiao Guangrong <guangrong.xiao@linux.intel.com>
- Wu Hao <hao.wu@intel.com>
- Xu Yilun <yilun.xu@intel.com>

The Device Feature List (DFL) FPGA framework (and drivers according to
this framework) hides the details of the low-level hardware and provides
unified interfaces to userspace. Applications can use these interfaces to
configure, enumerate, open and access FPGA accelerators on platforms which
implement the DFL in the device memory. Besides this, the DFL framework
enables system level management functions such as FPGA reconfiguration.


Device Feature List (DFL) Overview
==================================
Device Feature List (DFL) defines a linked list of feature headers within the
device MMIO space to provide an extensible way of adding features.
Software can
walk through these predefined data structures to enumerate FPGA features:
FPGA Interface Unit (FIU), Accelerated Function Unit (AFU) and Private Features,
as illustrated below::

       Header            Header            Header            Header
    +----------+  +-->+----------+  +-->+----------+  +-->+----------+
    |   Type   |  |   |   Type   |  |   |   Type   |  |   |   Type   |
    |   FIU    |  |   | Private  |  |   | Private  |  |   | Private  |
    +----------+  |   | Feature  |  |   | Feature  |  |   | Feature  |
    | Next_DFH |--+   +----------+  |   +----------+  |   +----------+
    +----------+      | Next_DFH |--+   | Next_DFH |--+   | Next_DFH |--> NULL
    |    ID    |      +----------+      +----------+      +----------+
    +----------+      |    ID    |      |    ID    |      |    ID    |
    | Next_AFU |--+   +----------+      +----------+      +----------+
    +----------+  |   | Feature  |      | Feature  |      | Feature  |
    |  Header  |  |   | Register |      | Register |      | Register |
    | Register |  |   |   Set    |      |   Set    |      |   Set    |
    |   Set    |  |   +----------+      +----------+      +----------+
    +----------+  |      Header
                  +-->+----------+
                      |   Type   |
                      |   AFU    |
                      +----------+
                      | Next_DFH |--> NULL
                      +----------+
                      |   GUID   |
                      +----------+
                      |  Header  |
                      | Register |
                      |   Set    |
                      +----------+

FPGA Interface Unit (FIU) represents a standalone functional unit for the
interface to FPGA, e.g. the FPGA Management Engine (FME) and Port (more
descriptions on FME and Port in later sections).

Accelerated Function Unit (AFU) represents an FPGA programmable region and
always connects to an FIU (e.g. a Port) as its child, as illustrated above.

Private Features represent sub features of the FIU and AFU. They can be
various function blocks with different IDs, but all private features which
belong to the same FIU or AFU must be linked into one list via the Next Device
Feature Header (Next_DFH) pointer.

Each FIU, AFU and Private Feature can implement its own functional registers.
The functional register set of an FIU or AFU is called the Header Register Set,
e.g.
FME Header Register Set, and the one of a Private Feature is called the
Feature Register Set, e.g. FME Partial Reconfiguration Feature Register Set.

This Device Feature List provides a way of linking features together. It is
convenient for software to locate each feature by walking through this list,
and it can be implemented in register regions of any FPGA device.


FIU - FME (FPGA Management Engine)
==================================
The FPGA Management Engine performs reconfiguration and other infrastructure
functions. Each FPGA device has only one FME.

User-space applications can acquire exclusive access to the FME using open(),
and release it using close().

The following functions are exposed through ioctls:

- Get driver API version (DFL_FPGA_GET_API_VERSION)
- Check for extensions (DFL_FPGA_CHECK_EXTENSION)
- Program bitstream (DFL_FPGA_FME_PORT_PR)
- Assign port to PF (DFL_FPGA_FME_PORT_ASSIGN)
- Release port from PF (DFL_FPGA_FME_PORT_RELEASE)
- Get number of irqs of FME global error (DFL_FPGA_FME_ERR_GET_IRQ_NUM)
- Set interrupt trigger for FME error (DFL_FPGA_FME_ERR_SET_IRQ)

More functions are exposed through sysfs
(/sys/class/fpga_region/regionX/dfl-fme.n/):

 Read bitstream ID (bitstream_id)
     bitstream_id indicates the version of the static FPGA region.

 Read bitstream metadata (bitstream_metadata)
     bitstream_metadata includes detailed information about the static FPGA
     region, e.g. synthesis date and seed.

 Read number of ports (ports_num)
     one FPGA device may have more than one port; this sysfs interface
     indicates how many ports the FPGA device has.

 Global error reporting management (errors/)
     error reporting sysfs interfaces allow the user to read errors detected
     by the hardware, and clear the logged errors.

 Power management (dfl_fme_power hwmon)
     power management hwmon sysfs interfaces allow user to read power management
     information (power consumption, thresholds, threshold status, limits, etc.)
     and configure power thresholds for different throttling levels.

 Thermal management (dfl_fme_thermal hwmon)
     thermal management hwmon sysfs interfaces allow user to read thermal
     management information (current temperature, thresholds, threshold status,
     etc.).

 Performance reporting
     performance counters are exposed through perf PMU APIs. Standard perf tool
     can be used to monitor all available perf events. Please see performance
     counter section below for more detailed information.


FIU - PORT
==========
A port represents the interface between the static FPGA fabric and a partially
reconfigurable region containing an AFU. It controls the communication from SW
to the accelerator and exposes features such as reset and debug. Each FPGA
device may have more than one port, but always one AFU per port.


AFU
===
An AFU is attached to a port FIU and exposes a fixed length MMIO region to be
used for accelerator-specific control registers.

User-space applications can acquire exclusive access to an AFU attached to a
port by using open() on the port device node and release it using close().

The following functions are exposed through ioctls:

- Get driver API version (DFL_FPGA_GET_API_VERSION)
- Check for extensions (DFL_FPGA_CHECK_EXTENSION)
- Get port info (DFL_FPGA_PORT_GET_INFO)
- Get MMIO region info (DFL_FPGA_PORT_GET_REGION_INFO)
- Map DMA buffer (DFL_FPGA_PORT_DMA_MAP)
- Unmap DMA buffer (DFL_FPGA_PORT_DMA_UNMAP)
- Reset AFU (DFL_FPGA_PORT_RESET)
- Get number of irqs of port error (DFL_FPGA_PORT_ERR_GET_IRQ_NUM)
- Set interrupt trigger for port error (DFL_FPGA_PORT_ERR_SET_IRQ)
- Get number of irqs of UINT (DFL_FPGA_PORT_UINT_GET_IRQ_NUM)
- Set interrupt trigger for UINT (DFL_FPGA_PORT_UINT_SET_IRQ)

DFL_FPGA_PORT_RESET:
  reset the FPGA Port and its AFU. Userspace can do a Port
  reset at any time, e.g. during DMA or Partial Reconfiguration. A reset should
  never cause any system level issue, only a functional failure (e.g. DMA or PR
  operation failure), and the failure should be recoverable.

User-space applications can also mmap() accelerator MMIO regions.

More functions are exposed through sysfs:
(/sys/class/fpga_region/<regionX>/<dfl-port.m>/):

 Read Accelerator GUID (afu_id)
     afu_id indicates which PR bitstream is programmed to this AFU.

 Error reporting (errors/)
     error reporting sysfs interfaces allow the user to read port/afu errors
     detected by the hardware, and clear the logged errors.


DFL Framework Overview
======================

::

    +----------+    +--------+ +--------+ +--------+
    |   FME    |    |  AFU   | |  AFU   | |  AFU   |
    |  Module  |    | Module | | Module | | Module |
    +----------+    +--------+ +--------+ +--------+
            +-----------------------+
            | FPGA Container Device |    Device Feature List
            |  (FPGA Base Region)   |         Framework
            +-----------------------+
    ------------------------------------------------------------------
          +----------------------------+
          |   FPGA DFL Device Module   |
          | (e.g. PCIE/Platform Device)|
          +----------------------------+
            +------------------------+
            |  FPGA Hardware Device  |
            +------------------------+

The DFL framework in the kernel provides common interfaces to create the
container device (FPGA base region), discover feature devices and their private
features from the given Device Feature Lists and create platform devices for
feature devices (e.g. FME, Port and AFU) with related resources under the
container device. It also abstracts operations for the private features and
exposes common ops to feature device drivers.

The FPGA DFL Device could be different hardware, e.g. a PCIe device, a platform
device, etc. Its driver module is always loaded first once the device is
created by the system. This driver plays an infrastructural role in the
driver architecture. It locates the DFLs in the device memory and hands them
and related resources to the common interfaces of the DFL framework for
enumeration. (Please refer to drivers/fpga/dfl.c for detailed enumeration APIs).

The FPGA Management Engine (FME) driver is a platform driver which is loaded
automatically after FME platform device creation from the DFL device module. It
provides the key features for FPGA management, including:

	a) Expose static FPGA region information, e.g. version and metadata.
	   Users can read related information via sysfs interfaces exposed
	   by the FME driver.

	b) Partial Reconfiguration. The FME driver creates the FPGA manager, FPGA
	   bridges and FPGA regions during PR sub feature initialization. Once
	   it receives a DFL_FPGA_FME_PORT_PR ioctl from the user, it invokes the
	   common interface function from FPGA Region to complete the partial
	   reconfiguration of the PR bitstream to the given port.

Similar to the FME driver, the FPGA Accelerated Function Unit (AFU) driver is
probed once the AFU platform device is created.
The main function of this module
is to provide an interface for userspace applications to access the individual
accelerators, including basic reset control on the port, AFU MMIO region export
and DMA buffer mapping service functions.

After the feature platform devices are created, the matching platform drivers
are loaded automatically to handle the different functionalities. Please refer
to the next sections for detailed information on the functional units which
have already been implemented under this DFL framework.


Partial Reconfiguration
=======================
As mentioned above, accelerators can be reconfigured through partial
reconfiguration of a PR bitstream file. The PR bitstream file must have been
generated for the exact static FPGA region and targeted reconfigurable region
(port) of the FPGA; otherwise the reconfiguration operation will fail and
possibly cause system instability. This compatibility can be checked by
comparing the compatibility ID noted in the header of the PR bitstream file
against the compat_id exposed by the target FPGA region. This check is usually
done by userspace before calling the reconfiguration IOCTL.


FPGA virtualization - PCIe SRIOV
================================
This section describes the virtualization support on DFL based FPGA devices to
enable accessing an accelerator from applications running in a virtual machine
(VM). This section only describes PCIe based FPGA devices with SRIOV support.

Features supported by the particular FPGA device are exposed through Device
Feature Lists, as illustrated below:

::

      +-------------------------------+  +-------------+
      |              PF               |  |     VF      |
      +-------------------------------+  +-------------+
          ^            ^         ^              ^
          |            |         |              |
    +-----|------------|---------|--------------|-------+
    |     |            |         |              |       |
    |  +-----+     +-------+ +-------+      +-------+   |
    |  | FME |     | Port0 | | Port1 |      | Port2 |   |
    |  +-----+     +-------+ +-------+      +-------+   |
    |                  ^         ^              ^       |
    |                  |         |              |       |
    |              +-------+ +------+       +-------+   |
    |              |  AFU  | | AFU  |       |  AFU  |   |
    |              +-------+ +------+       +-------+   |
    |                                                   |
    |            DFL based FPGA PCIe Device             |
    +---------------------------------------------------+

FME is always accessed through the physical function (PF).

Ports (and related AFUs) are accessed via PF by default, but could be exposed
through virtual function (VF) devices via PCIe SRIOV. Each VF only contains
1 Port and 1 AFU for isolation. Users could assign individual VFs (accelerators)
created via PCIe SRIOV interface, to virtual machines.

The driver organization in the virtualization case is illustrated below::

    +-------++------++------+             |
    |  FME  || FME  || FME  |             |
    | FPGA  || FPGA || FPGA |             |
    |Manager||Bridge||Region|             |
    +-------++------++------+             |
    +-----------------------+  +--------+ |             +--------+
    |          FME          |  |  AFU   | |             |  AFU   |
    |        Module         |  | Module | |             | Module |
    +-----------------------+  +--------+ |             +--------+
          +-----------------------+       |       +-----------------------+
          | FPGA Container Device |       |       | FPGA Container Device |
          |  (FPGA Base Region)   |       |       |  (FPGA Base Region)   |
          +-----------------------+       |       +-----------------------+
            +------------------+          |         +------------------+
            | FPGA PCIE Module |          | Virtual | FPGA PCIE Module |
            +------------------+   Host   | Machine +------------------+
    ------------------------------------- | ------------------------------
              +---------------+           |           +---------------+
              | PCI PF Device |           |           | PCI VF Device |
              +---------------+           |           +---------------+

The FPGA PCIe device driver is always loaded first once an FPGA PCIe PF or VF
device is detected. It:

* Finishes enumeration on both FPGA PCIe PF and VF devices using common
  interfaces from the DFL framework.
* Supports SRIOV.

The FME device driver plays a management role in this driver architecture; it
provides ioctls to release a Port from the PF and assign a Port to the PF.
After a port is released from the PF, it is safe to expose this port through a
VF via the PCIe SRIOV sysfs interface.

To enable accessing an accelerator from applications running in a VM, the
respective AFU's port needs to be assigned to a VF using the following steps:

#. The PF owns all AFU ports by default. Any port that needs to be
   reassigned to a VF must first be released through the
   DFL_FPGA_FME_PORT_RELEASE ioctl on the FME device.

#. Once N ports are released from the PF, the user can use the command below
   to enable SRIOV and VFs. Each VF owns only one Port with AFU.

   ::

      echo N > $PCI_DEVICE_PATH/sriov_numvfs

#. Pass through the VFs to VMs

#. The AFU under VF is accessible from applications in VM (using the
   same driver inside the VF).

Note that an FME can't be assigned to a VF, thus PR and other management
functions are only available via the PF.

Device enumeration
==================
This section introduces how applications enumerate the FPGA device from
the sysfs hierarchy under /sys/class/fpga_region.

In the example below, two DFL based FPGA devices are installed in the host. Each
FPGA device has one FME and two ports (AFUs).

FPGA regions are created under /sys/class/fpga_region/::

	/sys/class/fpga_region/region0
	/sys/class/fpga_region/region1
	/sys/class/fpga_region/region2
	...

An application needs to search each regionX folder. If a feature device is
found (e.g. "dfl-port.n" or "dfl-fme.m"), then that region is the base
FPGA region which represents the FPGA device.

Each base region has one FME and two ports (AFUs) as child devices::

	/sys/class/fpga_region/region0/dfl-fme.0
	/sys/class/fpga_region/region0/dfl-port.0
	/sys/class/fpga_region/region0/dfl-port.1
	...

	/sys/class/fpga_region/region3/dfl-fme.1
	/sys/class/fpga_region/region3/dfl-port.2
	/sys/class/fpga_region/region3/dfl-port.3
	...

In general, the FME/AFU sysfs interfaces are named as follows::

	/sys/class/fpga_region/<regionX>/<dfl-fme.n>/
	/sys/class/fpga_region/<regionX>/<dfl-port.m>/

with 'n' consecutively numbering all FMEs and 'm' consecutively numbering all
ports.
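
The search described in this section can be sketched in a few lines of Python.
This is only an illustration, not part of the driver or of any official tool:
the function name ``find_dfl_devices`` and the regular expression are
hypothetical, and the sketch assumes that feature devices appear as child
entries named ``dfl-fme.<n>`` or ``dfl-port.<m>`` directly under each region
directory, as in the listings above.

```python
import os
import re

# Child names that mark a region as a DFL base region (assumed naming,
# taken from the dfl-fme.n / dfl-port.m pattern shown in this section).
FEATURE_DEV = re.compile(r"^dfl-(fme|port)\.\d+$")


def find_dfl_devices(class_dir="/sys/class/fpga_region"):
    """Return {region name: sorted feature device names} for base regions."""
    found = {}
    if not os.path.isdir(class_dir):
        return found  # no FPGA region class on this system
    for region in sorted(os.listdir(class_dir)):
        region_path = os.path.join(class_dir, region)
        if not os.path.isdir(region_path):
            continue
        children = sorted(
            name for name in os.listdir(region_path) if FEATURE_DEV.match(name)
        )
        # Regions without any feature device are not base regions; skip them.
        if children:
            found[region] = children
    return found


if __name__ == "__main__":
    for region, children in find_dfl_devices().items():
        print(region, children)
```

On a host without DFL hardware the function simply returns an empty
dictionary, so it can be called unconditionally before attempting to open any
device node.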

The device nodes used for ioctl() or mmap() can be referenced through::

	/sys/class/fpga_region/<regionX>/<dfl-fme.n>/dev
	/sys/class/fpga_region/<regionX>/<dfl-port.m>/dev


Performance Counters
====================
Performance reporting is one private feature implemented in FME. It
supports several independent, system-wide, device counter sets in hardware to
monitor and count performance events, including "basic", "cache", "fabric",
"vtd" and "vtd_sip" counters. Users can use the standard perf tool to monitor
FPGA cache hit/miss rate, transaction number, interface clock counter of AFU
and other FPGA performance events.

Different FPGA devices may have different counter sets, depending on hardware
implementation. E.g., some discrete FPGA cards don't have any cache. Users can
use "perf list" to check which perf events are supported by the target hardware.

In order to allow users to use the standard perf API to access these performance
counters, the driver creates a perf PMU and related sysfs interfaces in
/sys/bus/event_source/devices/dfl_fme* to describe the available perf events and
configuration options.

The "format" directory describes the format of the config field of struct
perf_event_attr. There are 3 bitfields for config: "evtype" defines which type
the perf event belongs to; "event" is the identity of the event within its
category; "portid" selects whether the counter set monitors FPGA overall data
or a specific port.

The "events" directory describes the configuration templates for all available
events which can be used with the perf tool directly. For example, fab_mmio_read
has the configuration "event=0x06,evtype=0x02,portid=0xff", which shows this
event belongs to fabric type (0x02), the local event id is 0x06 and it is for
overall monitoring (portid=0xff).
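
To make the relation between the three bitfields and a raw config value
concrete, here is a small Python sketch. The bit positions in ``FIELDS`` are an
assumption: they are chosen so that "event=0x06,evtype=0x02,portid=0xff" packs
to the raw value 0xff2006 used in the perf examples of this section. The
authoritative layout is whatever the "format" directory of the PMU reports on
the target system, and the helper names are hypothetical.

```python
# Assumed field layout as (shift, mask); verify against the files in
# /sys/bus/event_source/devices/dfl_fme*/format/ before relying on it.
FIELDS = {
    "event": (0, 0xFFF),
    "evtype": (12, 0xF),
    "portid": (16, 0xFF),
}


def encode_config(event, evtype, portid):
    """Pack the three bitfields into one perf_event_attr.config value."""
    values = {"event": event, "evtype": evtype, "portid": portid}
    config = 0
    for name, (shift, mask) in FIELDS.items():
        value = values[name]
        if not 0 <= value <= mask:
            raise ValueError(f"{name}={value:#x} does not fit in mask {mask:#x}")
        config |= value << shift
    return config


def decode_config(config):
    """Split a raw config value back into its named bitfields."""
    return {name: (config >> shift) & mask for name, (shift, mask) in FIELDS.items()}


if __name__ == "__main__":
    # fab_mmio_read template: event=0x06, evtype=0x02, portid=0xff
    print(hex(encode_config(event=0x06, evtype=0x02, portid=0xFF)))
```

Under the assumed layout, the same event with portid=0x0 packs to 0x2006,
which is the per-port form of the raw config.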

Example usage of perf::

  $# perf list |grep dfl_fme

  dfl_fme0/fab_mmio_read/                              [Kernel PMU event]
  <...>
  dfl_fme0/fab_port_mmio_read,portid=?/                [Kernel PMU event]
  <...>

  $# perf stat -a -e dfl_fme0/fab_mmio_read/ <command>
  or
  $# perf stat -a -e dfl_fme0/event=0x06,evtype=0x02,portid=0xff/ <command>
  or
  $# perf stat -a -e dfl_fme0/config=0xff2006/ <command>

Another example, fab_port_mmio_read monitors mmio read of a specific port. So
its configuration template is "event=0x06,evtype=0x02,portid=?". The portid
should be explicitly set.

Its usage of perf::

  $# perf stat -a -e dfl_fme0/fab_port_mmio_read,portid=0x0/ <command>
  or
  $# perf stat -a -e dfl_fme0/event=0x06,evtype=0x02,portid=0x0/ <command>
  or
  $# perf stat -a -e dfl_fme0/config=0x2006/ <command>

Please note that for fabric counters, overall perf events (fab_*) and port perf
events (fab_port_*) actually share one set of counters in hardware, so the
driver can't monitor both at the same time. If this set of counters is
configured to monitor overall data, then per port perf data is not supported.
See the example below::

  $# perf stat -e dfl_fme0/fab_mmio_read/,dfl_fme0/fab_port_mmio_write,\
                                                    portid=0/ sleep 1

  Performance counter stats for 'system wide':

                 3      dfl_fme0/fab_mmio_read/
   <not supported>      dfl_fme0/fab_port_mmio_write,portid=0x0/

       1.001750904 seconds time elapsed

The driver also provides a "cpumask" sysfs attribute, which contains only one
CPU id used to access these perf events. Counting on multiple CPUs is not
allowed since they are system-wide counters on the FPGA device.

The current driver does not support sampling. So "perf record" is unsupported.


Interrupt support
=================
Some FME and AFU private features are able to generate interrupts.
As mentioned
above, users can call ioctl (DFL_FPGA_*_GET_IRQ_NUM) to find out whether and how
many interrupts are supported for a private feature. Drivers also implement
an eventfd based interrupt handling mechanism for users to get notified when
an interrupt happens. Users can pass eventfds to the driver via
ioctl (DFL_FPGA_*_SET_IRQ), and then poll/select on these eventfds waiting for
notification.
In the current DFL, 3 sub features (Port error, FME global error and AFU
interrupt) support interrupts.


Add new FIUs support
====================
Developers may create new function blocks (FIUs) under this DFL framework. In
that case a new platform device driver needs to be developed for the
new feature dev (FIU), following the same way as existing feature dev drivers
(e.g. FME and Port/AFU platform device driver). Besides that, the DFL framework
enumeration code needs to be modified too, for new FIU type detection
and related platform device creation.


Add new private features support
================================
In some cases, we may need to add some new private features to existing FIUs
(e.g. FME or Port). Developers don't need to touch the enumeration code in the
DFL framework, as each private feature will be parsed automatically and related
mmio resources can be found under the FIU platform device created by the DFL
framework. The developer only needs to provide a sub feature driver with a
matching feature id. The FME Partial Reconfiguration Sub Feature driver (see
drivers/fpga/dfl-fme-pr.c) can serve as a reference.

Please refer to the link below for the existing feature id table and the guide
for applying for new feature ids.
https://github.com/OPAE/dfl-feature-id


Location of DFLs on a PCI Device
================================
The original method for finding a DFL on a PCI device assumed that the
first DFL starts at offset 0 of BAR 0.
If the first node of the DFL is an FME,
then further DFLs in the port(s) are specified in FME header registers.
Alternatively, a PCIe vendor specific capability structure can be used to
specify the location of all the DFLs on the device, providing flexibility
for the type of starting node in the DFL. Intel has reserved the
VSEC ID of 0x43 for this purpose. The vendor specific
data begins with a 4 byte vendor specific register for the number of DFLs,
followed by 4 byte Offset/BIR vendor specific registers for each DFL. Bits 2:0
of the Offset/BIR register indicate the BAR, and bits 31:3 form the 8 byte
aligned offset where bits 2:0 are zero::

        +----------------------------+
        |31     Number of DFLS      0|
        +----------------------------+
        |31     Offset     3|2 BIR  0|
        +----------------------------+
                      . . .
        +----------------------------+
        |31     Offset     3|2 BIR  0|
        +----------------------------+

Being able to specify more than one DFL per BAR has been considered, but it
was determined the use case did not provide value. Specifying a single DFL
per BAR simplifies the implementation and allows for extra error checking.


Userspace driver support for DFL devices
========================================
The purpose of an FPGA is to be reprogrammed with newly developed hardware
components. New hardware can instantiate a new private feature in the DFL, and
then present a DFL device in the system. In some cases users may need a
userspace driver for the DFL device:

* Users may need to run some diagnostic test for their hardware.
* Users may prototype the kernel driver in user space.
* Some hardware is designed for specific purposes and does not fit into one of
  the standard kernel subsystems.

This requires direct access to MMIO space and interrupt handling from
userspace. The uio_dfl module exposes the UIO device interfaces for this
purpose.

Currently the uio_dfl driver only supports the Ether Group sub feature, which
has no irq in hardware. So the interrupt handling is not added in this driver.

UIO_DFL should be selected to enable the uio_dfl module driver. To support a
new DFL feature via UIO direct access, its feature id should be added to the
driver's id_table.


Open discussion
===============
FME driver exports one ioctl (DFL_FPGA_FME_PORT_PR) for partial reconfiguration
to user now. In the future, if unified user interfaces for reconfiguration are
added, FME driver should switch to them from ioctl interface.