Error Detection And Correction (EDAC) Devices
=============================================

Main Concepts used at the EDAC subsystem
----------------------------------------

There are several things to be aware of that aren't at all obvious, like
*sockets*, *socket sets*, *banks*, *rows*, *chip-select rows*, *channels*,
etc...

These are some of the many terms that are thrown about that don't always
mean what people think they mean (Inconceivable!). In the interest of
creating a common ground for discussion, terms and their definitions
will be established.

* Memory devices

The individual DRAM chips on a memory stick. These devices commonly
output 4 and 8 bits each (x4, x8). Grouping several of these in parallel
provides the number of bits that the memory controller expects:
typically 72 bits, in order to provide 64 bits + 8 bits of ECC data.

* Memory Stick

A printed circuit board that aggregates multiple memory devices in
parallel. In general, this is the Field Replaceable Unit (FRU) which
gets replaced in the case of excessive errors. Most often it is also
called DIMM (Dual Inline Memory Module).

* Memory Socket

A physical connector on the motherboard that accepts a single memory
stick. Also called a "slot" in several datasheets.

* Channel

A memory controller channel, responsible for communicating with a group
of DIMMs. Each channel has its own independent control (command) and data
bus, and can be used independently or grouped with other channels.

* Branch

It is typically the highest hierarchy on a Fully-Buffered DIMM memory
controller. Typically, it contains two channels. Two channels at the
same branch can be used in single mode or in lockstep mode. When
lockstep is enabled, the cacheline is doubled, but it generally brings
some performance penalty. Also, it is generally not possible to point to
just one memory stick when an error occurs, as the error correction code
is calculated using two DIMMs instead of one. In exchange, lockstep mode
is capable of correcting more errors than single mode.

* Single-channel

The data accessed by the memory controller is contained in one DIMM
only. E.g. if the data is 64 bits wide, the data flows to the CPU using
one 64-bit parallel access. Typically used with SDR, DDR, DDR2 and DDR3
memories. FB-DIMM and RAMBUS use a different concept for channel, so
this concept doesn't apply there.

* Double-channel

The data accessed by the memory controller is interleaved across two
DIMMs, accessed at the same time. E.g. if the DIMM is 64 bits wide (72
bits with ECC), the data flows to the CPU using a 128-bit parallel
access.

* Chip-select row

This is the name of the DRAM signal used to select the DRAM ranks to be
accessed. Common chip-select rows for single channel are 64 bits, for
dual channel 128 bits. It may not be visible to the memory controller,
as some DIMM types have a memory buffer that can hide direct access to
it from the Memory Controller.

* Single-Ranked stick

A single-ranked stick has 1 chip-select row of memory. Motherboards
commonly drive two chip-select pins to a memory stick. A single-ranked
stick will occupy only one of those rows. The other will be unused.

.. _doubleranked:

* Double-Ranked stick

A double-ranked stick has two chip-select rows which access different
sets of memory devices. The two rows cannot be accessed concurrently.

* Double-sided stick

**DEPRECATED TERM**, see :ref:`Double-Ranked stick <doubleranked>`.

A double-sided stick has two chip-select rows which access different sets
of memory devices. The two rows cannot be accessed concurrently.
"Double-sided" is irrespective of the memory devices being mounted on
both sides of the memory stick.

* Socket set

All of the memory sticks that are required for a single memory access or
all of the memory sticks spanned by a chip-select row. A single socket
set has two chip-select rows and if double-sided sticks are used these
will occupy those chip-select rows.

* Bank

This term is avoided because it is unclear when needing to distinguish
between chip-select rows and socket sets.
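
As a rough check of the widths quoted in the definitions above, the
sketch below computes how many x4 or x8 memory devices are needed to fill
a 72-bit ECC rank, and the data width of a single-channel versus a
double-channel access. The helper names are made up for this
illustration; nothing here is an EDAC API.

```python
# Illustrative arithmetic only; the helper names are hypothetical.

DATA_BITS = 64  # data bits the memory controller expects per DIMM
ECC_BITS = 8    # additional ECC bits per DIMM


def devices_per_rank(device_width, with_ecc=True):
    """Number of DRAM devices wired in parallel to form one rank."""
    total_bits = DATA_BITS + (ECC_BITS if with_ecc else 0)
    if total_bits % device_width:
        raise ValueError("device width does not divide the rank width")
    return total_bits // device_width


def access_width(dimms):
    """Data bits moved by one access that spans `dimms` DIMMs."""
    return DATA_BITS * dimms


print("x4 devices per 72-bit rank:", devices_per_rank(4))
print("x8 devices per 72-bit rank:", devices_per_rank(8))
print("single-channel access bits:", access_width(1))
print("double-channel access bits:", access_width(2))
```

With the typical 72-bit rank, this gives eighteen x4 devices or nine x8
devices, and 64-bit versus 128-bit data accesses.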

* High Bandwidth Memory (HBM)

HBM is a new memory type with low power consumption and ultra-wide
communication lanes. It uses vertically stacked memory chips (DRAM dies)
interconnected by microscopic wires called "through-silicon vias," or
TSVs.

Several stacks of HBM chips connect to the CPU or GPU through an ultra-fast
interconnect called the "interposer". Therefore, HBM's characteristics
are nearly indistinguishable from on-chip integrated RAM.

Memory Controllers
------------------

Most of the EDAC core is focused on doing Memory Controller error detection.
A memory controller is allocated by calling :c:func:`edac_mc_alloc`, which
internally uses the struct ``mem_ctl_info`` to describe the memory
controllers. This is an opaque struct for the EDAC drivers: only the EDAC
core is allowed to touch it.

.. kernel-doc:: include/linux/edac.h

.. kernel-doc:: drivers/edac/edac_mc.h

PCI Controllers
---------------

The EDAC subsystem provides a mechanism to handle PCI controllers by calling
:c:func:`edac_pci_alloc_ctl_info`. It will use the struct
:c:type:`edac_pci_ctl_info` to describe the PCI controllers.

.. kernel-doc:: drivers/edac/edac_pci.h

EDAC Blocks
-----------

The EDAC subsystem also provides a generic mechanism to report errors on
other parts of the hardware via the :c:func:`edac_device_alloc_ctl_info`
function.

The structures :c:type:`edac_dev_sysfs_block_attribute`,
:c:type:`edac_device_block`, :c:type:`edac_device_instance` and
:c:type:`edac_device_ctl_info` provide a generic or abstract 'edac_device'
representation at sysfs.

This set of structures, and the code that implements their APIs, allows
registering EDAC type devices which are NOT standard memory or PCI, like:

- CPU caches (L1 and L2)
- DMA engines
- Core CPU switches
- Fabric switch units
- PCIe interface controllers
- other EDAC/ECC type devices that can be monitored for
  errors, etc.

It allows for a two-level hierarchy.

For example, a cache could be composed of L1, L2 and L3 levels of cache.
Each CPU core would have its own L1 cache, while sharing L2 and maybe L3
caches. In such a case, those can be represented via the following sysfs
nodes::

    /sys/devices/system/edac/..

    pci/            <existing pci directory (if available)>
    mc/             <existing memory device directory>
    cpu/cpu0/..     <L1 and L2 block directory>
        /L1-cache/ce_count
                 /ue_count
        /L2-cache/ce_count
                 /ue_count
    cpu/cpu1/..     <L1 and L2 block directory>
        /L1-cache/ce_count
                 /ue_count
        /L2-cache/ce_count
                 /ue_count
    ...

    the L1 and L2 directories would be "edac_device_block's"

.. kernel-doc:: drivers/edac/edac_device.h


Heterogeneous system support
----------------------------

An AMD heterogeneous system is built by connecting the data fabrics of
both CPUs and GPUs via custom xGMI links. Thus, the data fabric on the
GPU nodes can be accessed the same way as the data fabric on CPU nodes.

The MI200 accelerators are data center GPUs. They have 2 data fabrics,
and each GPU data fabric contains four Unified Memory Controllers (UMC).
Each UMC contains eight channels. Each UMC channel controls one 128-bit
HBM2e (2GB) channel (equivalent to 8 X 2GB ranks). This creates a total
of 4096 bits of DRAM data bus.

While the UMC is interfacing a 16GB (8high X 2GB DRAM) HBM stack, each UMC
channel is interfacing 2GB of DRAM (represented as rank).

Memory controllers on AMD GPU nodes can be represented in EDAC thusly::

    GPU DF / GPU Node -> EDAC MC
    GPU UMC           -> EDAC CSROW
    GPU UMC channel   -> EDAC CHANNEL

For example, consider a heterogeneous system with 1 AMD CPU connected to
4 MI200 (Aldebaran) GPUs using xGMI.

Some more heterogeneous hardware details:

- The CPU UMC (Unified Memory Controller) is mostly the same as the GPU UMC.
  They have chip selects (csrows) and channels. However, the layouts are
  different for performance, physical layout, or other reasons.
- CPU UMCs use 1 channel. In this case UMC = EDAC channel. This follows the
  marketing speak: CPU has X memory channels, etc.
- CPU UMCs use up to 4 chip selects, so UMC chip select = EDAC CSROW.
- GPU UMCs use 1 chip select, so UMC = EDAC CSROW.
- GPU UMCs use 8 channels, so UMC channel = EDAC channel.
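
The MI200 figures above can be cross-checked with a few lines of
arithmetic. This is only a sketch: the constant and function names are
invented here, and it assumes that the 4096-bit total refers to a single
GPU data fabric (one EDAC MC).

```python
# Figures come from the MI200 description in the text; names are illustrative.
UMCS_PER_NODE = 4         # UMCs per GPU data fabric (EDAC csrows per MC)
CHANNELS_PER_UMC = 8      # channels per UMC (EDAC channels per csrow)
CHANNEL_WIDTH_BITS = 128  # width of one HBM2e channel
CHANNEL_SIZE_GB = 2       # DRAM behind each UMC channel (one rank)


def node_summary():
    """Totals for one GPU node, i.e. one EDAC memory controller."""
    channels = UMCS_PER_NODE * CHANNELS_PER_UMC
    return {
        "channels": channels,
        "bus_bits": channels * CHANNEL_WIDTH_BITS,
        "umc_gb": CHANNELS_PER_UMC * CHANNEL_SIZE_GB,
        "node_gb": channels * CHANNEL_SIZE_GB,
    }


for key, value in node_summary().items():
    print(f"{key}: {value}")
```

Under that assumption each GPU node exposes 32 channels (ranks), a
4096-bit data bus, 16 GB per UMC and 64 GB per node.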

The EDAC subsystem provides a mechanism to handle AMD heterogeneous
systems by calling system specific ops for both CPUs and GPUs.

AMD GPU nodes are enumerated in sequential order based on the PCI
hierarchy, and the first GPU node is assumed to have a Node ID value
following those of the CPU nodes after the latter are fully populated::

    $ ls /sys/devices/system/edac/mc/
        mc0   - CPU MC node 0
        mc1  |
        mc2  |- GPU card[0] => node 0(mc1), node 1(mc2)
        mc3  |
        mc4  |- GPU card[1] => node 0(mc3), node 1(mc4)
        mc5  |
        mc6  |- GPU card[2] => node 0(mc5), node 1(mc6)
        mc7  |
        mc8  |- GPU card[3] => node 0(mc7), node 1(mc8)

For example, a heterogeneous system with one AMD CPU is connected to
four MI200 (Aldebaran) GPUs using xGMI. This topology can be represented
via the following sysfs entries::

    /sys/devices/system/edac/mc/..

    CPU                     # CPU node
    ├── mc 0

    GPU Nodes are enumerated sequentially after CPU nodes have been populated
    GPU card 1              # Each MI200 GPU has 2 nodes/mcs
    ├── mc 1                # GPU node 0 == mc1, Each MC node has 4 UMCs/CSROWs
    │   ├── csrow 0         # UMC 0
    │   │   ├── channel 0   # Each UMC has 8 channels
    │   │   ├── channel 1   # size of each channel is 2 GB, so each UMC has 16 GB
    │   │   ├── channel 2
    │   │   ├── channel 3
    │   │   ├── channel 4
    │   │   ├── channel 5
    │   │   ├── channel 6
    │   │   ├── channel 7
    │   ├── csrow 1         # UMC 1
    │   │   ├── channel 0
    │   │   ├── ..
    │   │   ├── channel 7
    │   ├── ..              ..
    │   ├── csrow 3         # UMC 3
    │   │   ├── channel 0
    │   │   ├── ..
    │   │   ├── channel 7
    │   ├── rank 0
    │   ├── ..              ..
    │   ├── rank 31         # total 32 ranks/dimms from 4 UMCs
    ├
    ├── mc 2                # GPU node 1 == mc2
    │   ├── ..              # each GPU has total 64 GB

    GPU card 2
    ├── mc 3
    │   ├── ..
    ├── mc 4
    │   ├── ..

    GPU card 3
    ├── mc 5
    │   ├── ..
    ├── mc 6
    │   ├── ..

    GPU card 4
    ├── mc 7
    │   ├── ..
    ├── mc 8
    │   ├── ..
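
The sequential numbering described above can be sketched as a small
helper that maps a GPU card and node to its ``mcN`` directory. The
function name and its parameters are hypothetical, and the sketch assumes
that all CPU nodes are populated first and that every GPU contributes two
nodes, as on MI200. Cards are counted from zero, as in the ``ls`` listing.

```python
def gpu_mc_index(card, node, cpu_nodes=1, nodes_per_gpu=2):
    """Return the EDAC mc number for a GPU node (hypothetical helper).

    Assumes GPU nodes are numbered right after the CPU nodes, in card
    order, with `nodes_per_gpu` nodes per card.
    """
    if card < 0:
        raise ValueError("card must be non-negative")
    if not 0 <= node < nodes_per_gpu:
        raise ValueError("node out of range for this GPU")
    return cpu_nodes + card * nodes_per_gpu + node


# One CPU node (mc0) followed by four two-node GPU cards.
for card in range(4):
    names = [f"mc{gpu_mc_index(card, node)}" for node in range(2)]
    print(f"GPU card[{card}] =>", ", ".join(names))
```

For the example topology this yields mc1 and mc2 for the first card and
mc7 and mc8 for the last one, matching the listing.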