.. SPDX-License-Identifier: GPL-2.0
.. include:: <isonum.txt>

===================================
Compute Express Link Memory Devices
===================================

A Compute Express Link Memory Device is a CXL component that implements the
CXL.mem protocol. It contains some amount of volatile memory, persistent memory,
or both. It is enumerated as a PCI device for configuration and passing
messages over an MMIO mailbox. Its contribution to the System Physical
Address space is handled via HDM (Host Managed Device Memory) decoders
that optionally define a device's contribution to an interleaved address
range across multiple devices underneath a host-bridge or interleaved
across host-bridges.

CXL Bus: Theory of Operation
============================
Similar to how a RAID driver takes disk objects and assembles them into a new
logical device, the CXL subsystem is tasked to take PCIe and ACPI objects and
assemble them into a CXL.mem decode topology. The need for runtime configuration
of the CXL.mem topology is also similar to RAID in that different environments
with the same hardware configuration may decide to assemble the topology in
contrasting ways. One may choose performance (RAID0) striping memory across
multiple Host Bridges and endpoints while another may opt for fault tolerance
and disable any striping in the CXL.mem topology.

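
As a rough illustration of the striping choice described above, the sketch
below maps an offset within an interleaved range to the position of the target
that would decode it, using plain modulo arithmetic in the style of RAID0. The
helper name, the 256-byte granularity, and the way counts are assumptions
chosen for illustration. This is not the kernel's decode logic, and real HDM
decoders may use different arithmetic.

.. code-block:: python

    def interleave_target(offset, ways, granularity=256):
        """Return the position (0..ways-1) of the target decoding 'offset'.

        Hypothetical helper: simple modulo striping, analogous to RAID0.
        """
        if ways < 1 or granularity < 1:
            raise ValueError("ways and granularity must be positive")
        return (offset // granularity) % ways


    # Striped across 2 targets: consecutive 256-byte chunks alternate.
    striped = [interleave_target(off, ways=2) for off in range(0, 1024, 256)]

    # No striping (ways=1): every chunk decodes to the same target.
    unstriped = [interleave_target(off, ways=1) for off in range(0, 1024, 256)]

    print(striped)    # [0, 1, 0, 1]
    print(unstriped)  # [0, 0, 0, 0]

With 2 ways the load is spread across both targets, while with 1 way a single
target owns the whole range, which mirrors the performance versus fault
isolation trade-off in the RAID analogy.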

Platform firmware enumerates a menu of interleave options at the "CXL root port"
(Linux term for the top of the CXL decode topology). From there, PCIe topology
dictates which endpoints can participate in which Host Bridge decode regimes.
Each PCIe Switch in the path between the root and an endpoint introduces a point
at which the interleave can be split. For example, platform firmware may say
that a given range only decodes to one Host Bridge, but that Host Bridge may in
turn interleave cycles across multiple Root Ports. An intervening Switch between
a port and an endpoint may interleave cycles across multiple Downstream Switch
Ports, etc.

Here is a sample listing of a CXL topology defined by 'cxl_test'. The 'cxl_test'
module generates an emulated CXL topology of 2 Host Bridges each with 2 Root
Ports. Each of those Root Ports is connected to a 2-way switch, with endpoints
connected to the switch downstream ports, for a total of 8 endpoints::

    # cxl list -BEMPu -b cxl_test
    {
      "bus":"root3",
      "provider":"cxl_test",
      "ports:root3":[
        {
          "port":"port5",
          "host":"cxl_host_bridge.1",
          "ports:port5":[
            {
              "port":"port8",
              "host":"cxl_switch_uport.1",
              "endpoints:port8":[
                {
                  "endpoint":"endpoint9",
                  "host":"mem2",
                  "memdev":{
                    "memdev":"mem2",
                    "pmem_size":"256.00 MiB (268.44 MB)",
                    "ram_size":"256.00 MiB (268.44 MB)",
                    "serial":"0x1",
                    "numa_node":1,
                    "host":"cxl_mem.1"
                  }
                },
                {
                  "endpoint":"endpoint15",
                  "host":"mem6",
                  "memdev":{
                    "memdev":"mem6",
                    "pmem_size":"256.00 MiB (268.44 MB)",
                    "ram_size":"256.00 MiB (268.44 MB)",
                    "serial":"0x5",
                    "numa_node":1,
                    "host":"cxl_mem.5"
                  }
                }
              ]
            },
            {
              "port":"port12",
              "host":"cxl_switch_uport.3",
              "endpoints:port12":[
                {
                  "endpoint":"endpoint17",
                  "host":"mem8",
                  "memdev":{
                    "memdev":"mem8",
                    "pmem_size":"256.00 MiB (268.44 MB)",
                    "ram_size":"256.00 MiB (268.44 MB)",
                    "serial":"0x7",
                    "numa_node":1,
                    "host":"cxl_mem.7"
                  }
                },
                {
                  "endpoint":"endpoint13",
                  "host":"mem4",
                  "memdev":{
                    "memdev":"mem4",
                    "pmem_size":"256.00 MiB (268.44 MB)",
                    "ram_size":"256.00 MiB (268.44 MB)",
                    "serial":"0x3",
                    "numa_node":1,
                    "host":"cxl_mem.3"
                  }
                }
              ]
            }
          ]
        },
        {
          "port":"port4",
          "host":"cxl_host_bridge.0",
          "ports:port4":[
            {
              "port":"port6",
              "host":"cxl_switch_uport.0",
              "endpoints:port6":[
                {
                  "endpoint":"endpoint7",
                  "host":"mem1",
                  "memdev":{
                    "memdev":"mem1",
                    "pmem_size":"256.00 MiB (268.44 MB)",
                    "ram_size":"256.00 MiB (268.44 MB)",
                    "serial":"0",
                    "numa_node":0,
                    "host":"cxl_mem.0"
                  }
                },
                {
                  "endpoint":"endpoint14",
                  "host":"mem5",
                  "memdev":{
                    "memdev":"mem5",
                    "pmem_size":"256.00 MiB (268.44 MB)",
                    "ram_size":"256.00 MiB (268.44 MB)",
                    "serial":"0x4",
                    "numa_node":0,
                    "host":"cxl_mem.4"
                  }
                }
              ]
            },
            {
              "port":"port10",
              "host":"cxl_switch_uport.2",
              "endpoints:port10":[
                {
                  "endpoint":"endpoint16",
                  "host":"mem7",
                  "memdev":{
                    "memdev":"mem7",
                    "pmem_size":"256.00 MiB (268.44 MB)",
                    "ram_size":"256.00 MiB (268.44 MB)",
                    "serial":"0x6",
                    "numa_node":0,
                    "host":"cxl_mem.6"
                  }
                },
                {
                  "endpoint":"endpoint11",
                  "host":"mem3",
                  "memdev":{
                    "memdev":"mem3",
                    "pmem_size":"256.00 MiB (268.44 MB)",
                    "ram_size":"256.00 MiB (268.44 MB)",
                    "serial":"0x2",
                    "numa_node":0,
                    "host":"cxl_mem.2"
                  }
                }
              ]
            }
          ]
        }
      ]
    }

In that listing each "root", "port", and "endpoint" object corresponds to a
kernel 'struct cxl_port' object. A 'cxl_port' is a device that can decode
CXL.mem to its descendants. So "root" claims non-PCIe enumerable platform decode
ranges and decodes them to "ports", "ports" decode to "endpoints", and
"endpoints" represent the decode from SPA (System Physical Address) to DPA
(Device Physical Address).

Continuing the RAID analogy, disks have both topology metadata and on-device
metadata that determine RAID set assembly. CXL Port topology and CXL Port link
status are metadata for CXL.mem set assembly. The CXL Port topology is
enumerated by the arrival of a CXL.mem device. That is, unless and until the
PCIe core attaches the cxl_pci driver to a CXL Memory Expander there is no role
for CXL Port objects. Conversely, for hot-unplug / removal scenarios, there is
no need for the Linux PCI core to tear down switch-level CXL resources because
the endpoint ->remove() event cleans up the port data that was established to
support that Memory Expander.

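
The parent/child relationship between "root", "port", and "endpoint" objects
in the topology listing can be followed with a short recursive walk. The sketch
below is a minimal illustration in Python, assuming JSON shaped like the
'cxl list -BEMPu' output shown earlier, with child ports under "ports:<name>"
keys and endpoints under "endpoints:<name>" keys. The helper name 'walk' and
the abbreviated sample data are hypothetical and only serve the example.

.. code-block:: python

    import json

    # Abbreviated sample in the shape of the topology listing; most fields
    # are dropped to keep the example short.
    SAMPLE = """
    {
      "bus":"root3",
      "ports:root3":[
        {
          "port":"port4",
          "host":"cxl_host_bridge.0",
          "ports:port4":[
            {
              "port":"port6",
              "host":"cxl_switch_uport.0",
              "endpoints:port6":[
                {"endpoint":"endpoint7", "host":"mem1"},
                {"endpoint":"endpoint14", "host":"mem5"}
              ]
            }
          ]
        }
      ]
    }
    """


    def walk(obj, path=()):
        """Yield (path, memdev name) for each endpoint at or below 'obj'."""
        name = obj.get("bus") or obj.get("port") or obj.get("endpoint")
        path = path + (name,)
        if "endpoint" in obj:
            # In the listing, an endpoint's "host" is its memdev name.
            yield path, obj["host"]
            return
        for key, children in obj.items():
            # Children live under keys like "ports:port4" or "endpoints:port6".
            if key.startswith(("ports:", "endpoints:")):
                for child in children:
                    yield from walk(child, path)


    paths = list(walk(json.loads(SAMPLE)))
    for path, memdev in paths:
        print(" -> ".join(path), "=>", memdev)

Each printed line traces one decode path from the root through the
intermediate ports to an endpoint, followed by the memory device that the
endpoint represents.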

The port metadata and potential decode schemes that a given memory device may
participate in can be determined via a command like::

    # cxl list -BDMu -d root -m mem3
    {
      "bus":"root3",
      "provider":"cxl_test",
      "decoders:root3":[
        {
          "decoder":"decoder3.1",
          "resource":"0x8030000000",
          "size":"512.00 MiB (536.87 MB)",
          "volatile_capable":true,
          "nr_targets":2
        },
        {
          "decoder":"decoder3.3",
          "resource":"0x8060000000",
          "size":"512.00 MiB (536.87 MB)",
          "pmem_capable":true,
          "nr_targets":2
        },
        {
          "decoder":"decoder3.0",
          "resource":"0x8020000000",
          "size":"256.00 MiB (268.44 MB)",
          "volatile_capable":true,
          "nr_targets":1
        },
        {
          "decoder":"decoder3.2",
          "resource":"0x8050000000",
          "size":"256.00 MiB (268.44 MB)",
          "pmem_capable":true,
          "nr_targets":1
        }
      ],
      "memdevs:root3":[
        {
          "memdev":"mem3",
          "pmem_size":"256.00 MiB (268.44 MB)",
          "ram_size":"256.00 MiB (268.44 MB)",
          "serial":"0x2",
          "numa_node":0,
          "host":"cxl_mem.2"
        }
      ]
    }

...which queries the CXL topology to ask "given a CXL Memory Expander with a
kernel device name of 'mem3', which platform level decode ranges may this device
participate in?". A given expander can participate in multiple CXL.mem
interleave sets simultaneously depending on how many decoder resources it has.
In this example mem3 can participate in one or more of a PMEM interleave that
spans 2 Host Bridges, a PMEM interleave that targets a single Host Bridge, a
Volatile memory interleave that spans 2 Host Bridges, and a Volatile memory
interleave that only targets a single Host Bridge.

Conversely, the memory devices that can participate in a given platform level
decode scheme can be determined via a command like the following::

    # cxl list -MDu -d 3.2
    [
      {
        "memdevs":[
          {
            "memdev":"mem1",
            "pmem_size":"256.00 MiB (268.44 MB)",
            "ram_size":"256.00 MiB (268.44 MB)",
            "serial":"0",
            "numa_node":0,
            "host":"cxl_mem.0"
          },
          {
            "memdev":"mem5",
            "pmem_size":"256.00 MiB (268.44 MB)",
            "ram_size":"256.00 MiB (268.44 MB)",
            "serial":"0x4",
            "numa_node":0,
            "host":"cxl_mem.4"
          },
          {
            "memdev":"mem7",
            "pmem_size":"256.00 MiB (268.44 MB)",
            "ram_size":"256.00 MiB (268.44 MB)",
            "serial":"0x6",
            "numa_node":0,
            "host":"cxl_mem.6"
          },
          {
            "memdev":"mem3",
            "pmem_size":"256.00 MiB (268.44 MB)",
            "ram_size":"256.00 MiB (268.44 MB)",
            "serial":"0x2",
            "numa_node":0,
            "host":"cxl_mem.2"
          }
        ]
      },
      {
        "root decoders":[
          {
            "decoder":"decoder3.2",
            "resource":"0x8050000000",
            "size":"256.00 MiB (268.44 MB)",
            "pmem_capable":true,
            "nr_targets":1
          }
        ]
      }
    ]

...where the naming scheme for decoders is "decoder<port_id>.<instance_id>".

Driver Infrastructure
=====================

This section covers the driver infrastructure for a CXL memory device.

CXL Memory Device
-----------------

.. kernel-doc:: drivers/cxl/pci.c
   :doc: cxl pci

.. kernel-doc:: drivers/cxl/pci.c
   :internal:

CXL Port
--------
.. kernel-doc:: drivers/cxl/port.c
   :doc: cxl port

CXL Core
--------
.. kernel-doc:: drivers/cxl/cxl.h
   :doc: cxl objects

.. kernel-doc:: drivers/cxl/cxl.h
   :internal:

.. kernel-doc:: drivers/cxl/core/port.c
   :doc: cxl core

.. kernel-doc:: drivers/cxl/core/port.c
   :identifiers:

.. kernel-doc:: drivers/cxl/core/pmem.c
   :doc: cxl pmem

.. kernel-doc:: drivers/cxl/core/regs.c
   :doc: cxl registers

.. kernel-doc:: drivers/cxl/core/mbox.c
   :doc: cxl mbox

External Interfaces
===================

CXL IOCTL Interface
-------------------

.. kernel-doc:: include/uapi/linux/cxl_mem.h
   :doc: UAPI

.. kernel-doc:: include/uapi/linux/cxl_mem.h
   :internal: