=================================================
FPGA Device Feature List (DFL) Framework Overview
=================================================

Authors:

- Enno Luebbers <enno.luebbers@intel.com>
- Xiao Guangrong <guangrong.xiao@linux.intel.com>
- Wu Hao <hao.wu@intel.com>

The Device Feature List (DFL) FPGA framework (and drivers written for this
framework) hides the details of the low-level hardware and provides unified
interfaces to userspace. Applications can use these interfaces to configure,
enumerate, open and access FPGA accelerators on platforms which implement the
DFL in the device memory. In addition, the DFL framework enables system level
management functions such as FPGA reconfiguration.


Device Feature List (DFL) Overview
==================================
Device Feature List (DFL) defines a linked list of feature headers within the
device MMIO space to provide an extensible way of adding features. Software can
walk through these predefined data structures to enumerate FPGA features:
FPGA Interface Unit (FIU), Accelerated Function Unit (AFU) and Private Features,
as illustrated below::

       Header            Header            Header            Header
    +----------+  +-->+----------+  +-->+----------+  +-->+----------+
    |   Type   |  |   |   Type   |  |   |   Type   |  |   |   Type   |
    |   FIU    |  |   | Private  |  |   | Private  |  |   | Private  |
    +----------+  |   | Feature  |  |   | Feature  |  |   | Feature  |
    | Next_DFH |--+   +----------+  |   +----------+  |   +----------+
    +----------+      | Next_DFH |--+   | Next_DFH |--+   | Next_DFH |--> NULL
    |    ID    |      +----------+      +----------+      +----------+
    +----------+      |    ID    |      |    ID    |      |    ID    |
    | Next_AFU |--+   +----------+      +----------+      +----------+
    +----------+  |   | Feature  |      | Feature  |      | Feature  |
    |  Header  |  |   | Register |      | Register |      | Register |
    | Register |  |   |   Set    |      |   Set    |      |   Set    |
    |   Set    |  |   +----------+      +----------+      +----------+
    +----------+  |      Header
                  +-->+----------+
                      |   Type   |
                      |   AFU    |
                      +----------+
                      | Next_DFH |--> NULL
                      +----------+
                      |   GUID   |
                      +----------+
                      |  Header  |
                      | Register |
                      |   Set    |
                      +----------+

FPGA Interface Unit (FIU) represents a standalone functional unit for the
interface to FPGA, e.g. the FPGA Management Engine (FME) and Port (FME and
Port are described in more detail in later sections).

Accelerated Function Unit (AFU) represents an FPGA programmable region and
always connects to a FIU (e.g. a Port) as its child, as illustrated above.

Private Features represent sub features of the FIU and AFU. They can be
various function blocks with different IDs, but all private features which
belong to the same FIU or AFU must be linked into one list via the Next Device
Feature Header (Next_DFH) pointer.

Each FIU, AFU and Private Feature can implement its own functional registers.
The functional register set of a FIU or AFU is called the Header Register Set,
e.g. FME Header Register Set, and the one of a Private Feature is called the
Feature Register Set, e.g. FME Partial Reconfiguration Feature Register Set.

The Device Feature List provides a way of linking features together. Software
can conveniently locate each feature by walking through this list, and the list
can be implemented in register regions of any FPGA device.


FIU - FME (FPGA Management Engine)
==================================
The FPGA Management Engine performs reconfiguration and other infrastructure
functions. Each FPGA device has only one FME.

User-space applications can acquire exclusive access to the FME using open(),
and release it using close().

The following functions are exposed through ioctls:

- Get driver API version (DFL_FPGA_GET_API_VERSION)
- Check for extensions (DFL_FPGA_CHECK_EXTENSION)
- Program bitstream (DFL_FPGA_FME_PORT_PR)
- Assign port to PF (DFL_FPGA_FME_PORT_ASSIGN)
- Release port from PF (DFL_FPGA_FME_PORT_RELEASE)
- Get number of irqs of FME global error (DFL_FPGA_FME_ERR_GET_IRQ_NUM)
- Set interrupt trigger for FME error (DFL_FPGA_FME_ERR_SET_IRQ)

More functions are exposed through sysfs
(/sys/class/fpga_region/regionX/dfl-fme.n/):

 Read bitstream ID (bitstream_id)
     bitstream_id indicates the version of the static FPGA region.

 Read bitstream metadata (bitstream_metadata)
     bitstream_metadata includes detailed information about the static FPGA
     region, e.g. synthesis date and seed.

 Read number of ports (ports_num)
     one FPGA device may have more than one port; this sysfs interface
     indicates how many ports the FPGA device has.

 Global error reporting management (errors/)
     error reporting sysfs interfaces allow the user to read errors detected
     by the hardware, and clear the logged errors.

 Power management (dfl_fme_power hwmon)
     power management hwmon sysfs interfaces allow the user to read power
     management information (power consumption, thresholds, threshold status,
     limits, etc.) and configure power thresholds for different throttling
     levels.

 Thermal management (dfl_fme_thermal hwmon)
     thermal management hwmon sysfs interfaces allow the user to read thermal
     management information (current temperature, thresholds, threshold
     status, etc.).

 Performance reporting
     performance counters are exposed through perf PMU APIs. The standard perf
     tool can be used to monitor all available perf events. Please see the
     performance counter section below for more detailed information.
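
The FME sysfs attributes listed above are ordinary text files, so they can be
read without any ioctl. The following is a minimal sketch, not part of the
driver: the helper name ``read_fme_attrs`` and the example path are assumptions
chosen for illustration, and only the attribute names come from the list above.

```python
from pathlib import Path


def read_fme_attrs(fme_dir, names=("bitstream_id", "bitstream_metadata", "ports_num")):
    """Return {attribute name: stripped text} for each attribute file that exists."""
    fme = Path(fme_dir)
    values = {}
    for name in names:
        attr = fme / name
        # Skip attributes that this particular device does not expose.
        if attr.is_file():
            values[name] = attr.read_text().strip()
    return values


if __name__ == "__main__":
    # Hypothetical path: substitute the region and FME instance present on the system.
    print(read_fme_attrs("/sys/class/fpga_region/region0/dfl-fme.0"))
```

Attributes that are missing are skipped rather than treated as errors, since
the set of exposed features depends on the hardware.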


FIU - PORT
==========
A port represents the interface between the static FPGA fabric and a partially
reconfigurable region containing an AFU. It controls the communication from SW
to the accelerator and exposes features such as reset and debug. Each FPGA
device may have more than one port, but always one AFU per port.


AFU
===
An AFU is attached to a port FIU and exposes a fixed length MMIO region to be
used for accelerator-specific control registers.

User-space applications can acquire exclusive access to an AFU attached to a
port by using open() on the port device node and release it using close().

The following functions are exposed through ioctls:

- Get driver API version (DFL_FPGA_GET_API_VERSION)
- Check for extensions (DFL_FPGA_CHECK_EXTENSION)
- Get port info (DFL_FPGA_PORT_GET_INFO)
- Get MMIO region info (DFL_FPGA_PORT_GET_REGION_INFO)
- Map DMA buffer (DFL_FPGA_PORT_DMA_MAP)
- Unmap DMA buffer (DFL_FPGA_PORT_DMA_UNMAP)
- Reset AFU (DFL_FPGA_PORT_RESET)
- Get number of irqs of port error (DFL_FPGA_PORT_ERR_GET_IRQ_NUM)
- Set interrupt trigger for port error (DFL_FPGA_PORT_ERR_SET_IRQ)
- Get number of irqs of UINT (DFL_FPGA_PORT_UINT_GET_IRQ_NUM)
- Set interrupt trigger for UINT (DFL_FPGA_PORT_UINT_SET_IRQ)

DFL_FPGA_PORT_RESET:
  reset the FPGA Port and its AFU. Userspace can reset the Port at any time,
  e.g. during DMA or Partial Reconfiguration. A reset should never cause a
  system level issue, only a functional failure (e.g. DMA or PR operation
  failure), and the failure should be recoverable.

User-space applications can also mmap() accelerator MMIO regions.

More functions are exposed through sysfs
(/sys/class/fpga_region/<regionX>/<dfl-port.m>/):

 Read Accelerator GUID (afu_id)
     afu_id indicates which PR bitstream is programmed to this AFU.

 Error reporting (errors/)
     error reporting sysfs interfaces allow the user to read port/afu errors
     detected by the hardware, and clear the logged errors.


DFL Framework Overview
======================

::

         +----------+    +--------+ +--------+ +--------+
         |   FME    |    |  AFU   | |  AFU   | |  AFU   |
         |  Module  |    | Module | | Module | | Module |
         +----------+    +--------+ +--------+ +--------+
                 +-----------------------+
                 | FPGA Container Device |    Device Feature List
                 |  (FPGA Base Region)   |         Framework
                 +-----------------------+
  ------------------------------------------------------------------
               +----------------------------+
               |   FPGA DFL Device Module   |
               | (e.g. PCIE/Platform Device)|
               +----------------------------+
                 +------------------------+
                 |  FPGA Hardware Device  |
                 +------------------------+

The DFL framework in the kernel provides common interfaces to create a
container device (FPGA base region), discover feature devices and their private
features from the given Device Feature Lists, and create platform devices for
feature devices (e.g. FME, Port and AFU) with related resources under the
container device. It also abstracts operations for the private features and
exposes common ops to feature device drivers.

The FPGA DFL Device can be backed by different hardware, e.g. a PCIe device or
a platform device. Its driver module is always loaded first once the device is
created by the system. This driver plays an infrastructural role in the
driver architecture. It locates the DFLs in the device memory and hands them
and related resources to the common interfaces from the DFL framework for
enumeration. (Please refer to drivers/fpga/dfl.c for detailed enumeration APIs).

The FPGA Management Engine (FME) driver is a platform driver which is loaded
automatically after FME platform device creation from the DFL device module.
It provides the key features for FPGA management, including:

  a) Expose static FPGA region information, e.g. version and metadata.
     Users can read related information via sysfs interfaces exposed
     by the FME driver.

  b) Partial Reconfiguration. The FME driver creates FPGA manager, FPGA
     bridges and FPGA regions during PR sub feature initialization. Once
     it receives a DFL_FPGA_FME_PORT_PR ioctl from the user, it invokes the
     common interface function from FPGA Region to complete the partial
     reconfiguration of the PR bitstream to the given port.

Similar to the FME driver, the FPGA Accelerated Function Unit (AFU) driver is
probed once the AFU platform device is created. The main function of this module
is to provide an interface for userspace applications to access the individual
accelerators, including basic reset control on the port, AFU MMIO region
export, and DMA buffer mapping services.

After the feature platform devices are created, matching platform drivers are
loaded automatically to handle the different functionalities. Please refer to
the next sections for detailed information on the functional units which have
already been implemented under this DFL framework.


Partial Reconfiguration
=======================
As mentioned above, accelerators can be reconfigured through partial
reconfiguration of a PR bitstream file. The PR bitstream file must have been
generated for the exact static FPGA region and targeted reconfigurable region
(port) of the FPGA, otherwise the reconfiguration operation will fail and
possibly cause system instability. This compatibility can be checked by
comparing the compatibility ID noted in the header of the PR bitstream file
against the compat_id exposed by the target FPGA region. This check is usually
done by userspace before calling the reconfiguration IOCTL.
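
The userspace compatibility check described above can be sketched as follows.
This is only an illustration under stated assumptions: the bitstream header
format is not described in this document, so the sketch assumes the
compatibility ID has already been extracted from the PR bitstream file as a hex
string, and that the region exposes its ID as hex text in a ``compat_id``
attribute. The helper names are hypothetical.

```python
from pathlib import Path


def normalize_id(text):
    """Lower-case a hex ID and drop whitespace, dashes and an optional 0x prefix."""
    cleaned = text.strip().lower().replace("-", "").replace(" ", "")
    return cleaned[2:] if cleaned.startswith("0x") else cleaned


def is_compatible(bitstream_compat_id, region_dir):
    """Compare an already-extracted bitstream ID with the region's compat_id file."""
    region_id = (Path(region_dir) / "compat_id").read_text()
    return normalize_id(bitstream_compat_id) == normalize_id(region_id)
```

An application would call ``is_compatible()`` with the target region directory
and skip the reconfiguration ioctl when it returns False.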


FPGA virtualization - PCIe SRIOV
================================
This section describes the virtualization support on DFL based FPGA devices
that enables access to an accelerator from applications running in a virtual
machine (VM). Only PCIe based FPGA devices with SRIOV support are covered.

Features supported by the particular FPGA device are exposed through Device
Feature Lists, as illustrated below:

::

      +-------------------------------+  +-------------+
      |              PF               |  |     VF      |
      +-------------------------------+  +-------------+
          ^            ^         ^              ^
          |            |         |              |
    +-----|------------|---------|--------------|-------+
    |     |            |         |              |       |
    |  +-----+     +-------+ +-------+      +-------+   |
    |  | FME |     | Port0 | | Port1 |      | Port2 |   |
    |  +-----+     +-------+ +-------+      +-------+   |
    |                  ^         ^              ^       |
    |                  |         |              |       |
    |              +-------+ +------+       +-------+   |
    |              |  AFU  | |  AFU |       |  AFU  |   |
    |              +-------+ +------+       +-------+   |
    |                                                   |
    |            DFL based FPGA PCIe Device             |
    +---------------------------------------------------+

FME is always accessed through the physical function (PF).

Ports (and related AFUs) are accessed via the PF by default, but can be exposed
through virtual function (VF) devices via PCIe SRIOV. Each VF contains only
1 Port and 1 AFU for isolation. Users can assign individual VFs (accelerators)
created via the PCIe SRIOV interface to virtual machines.

The driver organization in the virtualization case is illustrated below::

    +-------++------++------+              |
    | FME   || FME  || FME  |              |
    | FPGA  || FPGA || FPGA |              |
    |Manager||Bridge||Region|              |
    +-------++------++------+              |
    +-----------------------+  +--------+  |              +--------+
    |         FME           |  |  AFU   |  |              |  AFU   |
    |        Module         |  | Module |  |              | Module |
    +-----------------------+  +--------+  |              +--------+
          +-----------------------+        |       +-----------------------+
          | FPGA Container Device |        |       | FPGA Container Device |
          |  (FPGA Base Region)   |        |       |  (FPGA Base Region)   |
          +-----------------------+        |       +-----------------------+
            +------------------+           |         +------------------+
            | FPGA PCIE Module |           | Virtual | FPGA PCIE Module |
            +------------------+   Host    | Machine +------------------+
    -------------------------------------- | ------------------------------
             +---------------+             |           +---------------+
             | PCI PF Device |             |           | PCI VF Device |
             +---------------+             |           +---------------+

The FPGA PCIe device driver is always loaded first once an FPGA PCIe PF or VF
device is detected. It:

* Finishes enumeration on both FPGA PCIe PF and VF devices using common
  interfaces from the DFL framework.
* Supports SRIOV.

The FME device driver plays a management role in this driver architecture. It
provides ioctls to release a Port from the PF and to assign a Port to the PF.
Once a port has been released from the PF, it is safe to expose this port
through a VF via the PCIe SRIOV sysfs interface.

To enable accessing an accelerator from applications running in a VM, the
respective AFU's port needs to be assigned to a VF using the following steps:

#. The PF owns all AFU ports by default. Any port that needs to be
   reassigned to a VF must first be released through the
   DFL_FPGA_FME_PORT_RELEASE ioctl on the FME device.

#. Once N ports are released from the PF, the user can use the command below
   to enable SRIOV and VFs. Each VF owns only one Port with AFU.

   ::

      echo N > $PCI_DEVICE_PATH/sriov_numvfs

#. Pass through the VFs to VMs.

#. The AFU under a VF is accessible from applications in the VM (using the
   same driver inside the VF).

Note that an FME can't be assigned to a VF, thus PR and other management
functions are only available via the PF.

Device enumeration
==================
This section introduces how applications enumerate the FPGA device from
the sysfs hierarchy under /sys/class/fpga_region.

In the example below, two DFL based FPGA devices are installed in the host. Each
FPGA device has one FME and two ports (AFUs).

FPGA regions are created under /sys/class/fpga_region/::

	/sys/class/fpga_region/region0
	/sys/class/fpga_region/region1
	/sys/class/fpga_region/region2
	...

The application needs to search each regionX folder. If a feature device is
found (e.g. "dfl-port.n" or "dfl-fme.m"), then that folder is the base
FPGA region which represents the FPGA device.

Each base region has one FME and two ports (AFUs) as child devices::

	/sys/class/fpga_region/region0/dfl-fme.0
	/sys/class/fpga_region/region0/dfl-port.0
	/sys/class/fpga_region/region0/dfl-port.1
	...

	/sys/class/fpga_region/region3/dfl-fme.1
	/sys/class/fpga_region/region3/dfl-port.2
	/sys/class/fpga_region/region3/dfl-port.3
	...

In general, the FME/AFU sysfs interfaces are named as follows::

	/sys/class/fpga_region/<regionX>/<dfl-fme.n>/
	/sys/class/fpga_region/<regionX>/<dfl-port.m>/

with 'n' consecutively numbering all FMEs and 'm' consecutively numbering all
ports.
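
The search described above can be sketched in a few lines. This is a minimal
illustration, not a reference implementation: the function name
``find_base_regions`` is hypothetical, and the only facts taken from this
document are the sysfs root and the ``dfl-fme.n`` / ``dfl-port.m`` naming.

```python
import re
from pathlib import Path

# Feature device directory names as described in this section.
FEATURE_DEV = re.compile(r"^dfl-(fme|port)\.\d+$")


def find_base_regions(root="/sys/class/fpga_region"):
    """Map each base region name to the sorted names of its feature devices."""
    found = {}
    root_path = Path(root)
    if not root_path.is_dir():
        return found
    for region in sorted(root_path.iterdir()):
        if not region.is_dir():
            continue
        devices = sorted(e.name for e in region.iterdir() if FEATURE_DEV.match(e.name))
        # A region with no feature device is not a base region.
        if devices:
            found[region.name] = devices
    return found
```

Regions without any feature device child are left out of the result, which
matches the rule that only base regions represent an FPGA device.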

The device nodes used for ioctl() or mmap() can be referenced through::

	/sys/class/fpga_region/<regionX>/<dfl-fme.n>/dev
	/sys/class/fpga_region/<regionX>/<dfl-port.m>/dev


Performance Counters
====================
Performance reporting is one private feature implemented in FME. It supports
several independent, system-wide, device counter sets in hardware to monitor
and count performance events, including "basic", "cache", "fabric", "vtd" and
"vtd_sip" counters. Users can use the standard perf tool to monitor the FPGA
cache hit/miss rate, transaction number, interface clock counter of the AFU
and other FPGA performance events.

Different FPGA devices may have different counter sets, depending on the
hardware implementation. E.g., some discrete FPGA cards don't have any cache.
Users can use "perf list" to check which perf events are supported by the
target hardware.

In order to allow users to use the standard perf API to access these
performance counters, the driver creates a perf PMU and related sysfs
interfaces in /sys/bus/event_source/devices/dfl_fme* to describe the available
perf events and configuration options.

The "format" directory describes the format of the config field of struct
perf_event_attr. There are 3 bitfields for config: "evtype" defines which type
the perf event belongs to; "event" is the identity of the event within its
category; "portid" selects whether the counter set monitors overall FPGA data
or a specific port.

The "events" directory describes the configuration templates for all available
events which can be used with the perf tool directly. For example, fab_mmio_read
has the configuration "event=0x06,evtype=0x02,portid=0xff", which shows this
event belongs to the fabric type (0x02), the local event id is 0x06 and it is
for overall monitoring (portid=0xff).
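
To make the relationship between the three bitfields and the raw config value
concrete, here is a small sketch. The bit ranges used (event in bits 0-11,
evtype in bits 12-15, portid in bits 16-23) are an assumption inferred from the
config values shown in this document; the "format" directory on the running
system is the authoritative source. The function names are hypothetical.

```python
# name: (shift, width). Assumed layout; check the PMU "format" directory.
FIELDS = {"event": (0, 12), "evtype": (12, 4), "portid": (16, 8)}


def encode_config(event, evtype, portid):
    """Pack the three bitfields into a single perf config value."""
    values = {"event": event, "evtype": evtype, "portid": portid}
    config = 0
    for name, (shift, width) in FIELDS.items():
        value = values[name]
        if value < 0 or value >= (1 << width):
            raise ValueError(f"{name}={value:#x} does not fit in {width} bits")
        config |= value << shift
    return config


def decode_config(config):
    """Split a perf config value back into its three bitfields."""
    return {name: (config >> shift) & ((1 << width) - 1)
            for name, (shift, width) in FIELDS.items()}


if __name__ == "__main__":
    # fab_mmio_read template: event=0x06, evtype=0x02, portid=0xff
    print(hex(encode_config(0x06, 0x02, 0xFF)))  # 0xff2006
```

Under this layout the fab_mmio_read template packs to 0xff2006, and the same
event restricted to port 0 packs to 0x2006.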

Example usage of perf::

  $# perf list |grep dfl_fme

  dfl_fme0/fab_mmio_read/                              [Kernel PMU event]
  <...>
  dfl_fme0/fab_port_mmio_read,portid=?/                [Kernel PMU event]
  <...>

  $# perf stat -a -e dfl_fme0/fab_mmio_read/ <command>
  or
  $# perf stat -a -e dfl_fme0/event=0x06,evtype=0x02,portid=0xff/ <command>
  or
  $# perf stat -a -e dfl_fme0/config=0xff2006/ <command>

As another example, fab_port_mmio_read monitors MMIO reads of a specific port,
so its configuration template is "event=0x06,evtype=0x02,portid=?". The portid
must be set explicitly.

Its usage with perf::

  $# perf stat -a -e dfl_fme0/fab_port_mmio_read,portid=0x0/ <command>
  or
  $# perf stat -a -e dfl_fme0/event=0x06,evtype=0x02,portid=0x0/ <command>
  or
  $# perf stat -a -e dfl_fme0/config=0x2006/ <command>

Please note that for fabric counters, overall perf events (fab_*) and port perf
events (fab_port_*) share one set of counters in hardware, so both cannot be
monitored at the same time. If this set of counters is configured to monitor
overall data, then per port perf data is not supported. See the example below::

  $# perf stat -e dfl_fme0/fab_mmio_read/,dfl_fme0/fab_port_mmio_write,\
                                                    portid=0/ sleep 1

  Performance counter stats for 'system wide':

                 3      dfl_fme0/fab_mmio_read/
   <not supported>      dfl_fme0/fab_port_mmio_write,portid=0x0/

       1.001750904 seconds time elapsed

The driver also provides a "cpumask" sysfs attribute, which contains only one
CPU id used to access these perf events. Counting on multiple CPUs is not
allowed since they are system-wide counters on the FPGA device.

The current driver does not support sampling, so "perf record" is unsupported.


Interrupt support
=================
Some FME and AFU private features are able to generate interrupts.
As mentioned above, users can call ioctl (DFL_FPGA_*_GET_IRQ_NUM) to find out
whether, and how many, interrupts are supported for a private feature. Drivers
also implement an eventfd based interrupt handling mechanism so that users are
notified when an interrupt happens. Users can pass eventfds to the driver via
ioctl (DFL_FPGA_*_SET_IRQ), and then poll/select on these eventfds to wait for
notification.
In the current DFL, 3 sub features (Port error, FME global error and AFU
interrupt) support interrupts.


Add new FIUs support
====================
Developers may create new function blocks (FIUs) under this DFL framework. In
that case a new platform device driver needs to be developed for the new
feature dev (FIU), following the same approach as the existing feature dev
drivers (e.g. FME and Port/AFU platform device driver). In addition, the DFL
framework enumeration code must be modified to detect the new FIU type and
create the related platform devices.


Add new private features support
================================
In some cases, we may need to add new private features to existing FIUs
(e.g. FME or Port). Developers don't need to touch the enumeration code in the
DFL framework, as each private feature is parsed automatically and the related
mmio resources can be found under the FIU platform device created by the DFL
framework. The developer only needs to provide a sub feature driver with a
matching feature id. The FME Partial Reconfiguration Sub Feature driver (see
drivers/fpga/dfl-fme-pr.c) can serve as a reference.

Location of DFLs on a PCI Device
================================
The original method for finding a DFL on a PCI device assumed that the first
DFL starts at offset 0 of BAR 0. If the first node of the DFL is an FME,
then further DFLs in the port(s) are specified in FME header registers.
Alternatively, a PCIe vendor specific capability structure can be used to
specify the location of all the DFLs on the device, providing flexibility
for the type of starting node in the DFL. Intel has reserved the
VSEC ID of 0x43 for this purpose. The vendor specific data begins with a
4-byte vendor specific register holding the number of DFLs, followed by one
4-byte Offset/BIR vendor specific register for each DFL. Bits 2:0 of the
Offset/BIR register indicate the BAR, and bits 31:3 form the 8-byte aligned
offset, where bits 2:0 are zero.
::

        +----------------------------+
        |31     Number of DFLS      0|
        +----------------------------+
        |31     Offset     3|2 BIR  0|
        +----------------------------+
                      . . .
        +----------------------------+
        |31     Offset     3|2 BIR  0|
        +----------------------------+

Being able to specify more than one DFL per BAR has been considered, but it
was determined the use case did not provide value. Specifying a single DFL
per BAR simplifies the implementation and allows for extra error checking.

Open discussion
===============
The FME driver currently exports one ioctl (DFL_FPGA_FME_PORT_PR) for partial
reconfiguration to the user. In the future, if unified user interfaces for
reconfiguration are added, the FME driver should switch to them from the ioctl
interface.