.. Copyright 2001 Matthew Wilcox
..
.. This documentation is free software; you can redistribute
.. it and/or modify it under the terms of the GNU General Public
.. License as published by the Free Software Foundation; either
.. version 2 of the License, or (at your option) any later
.. version.

===============================
Bus-Independent Device Accesses
===============================

:Author: Matthew Wilcox
:Author: Alan Cox

Introduction
============

Linux provides an API which abstracts performing IO across all busses
and devices, allowing device drivers to be written independently of bus
type.

Memory Mapped IO
================

Getting Access to the Device
----------------------------

The most widely supported form of IO is memory mapped IO. That is, a
part of the CPU's address space is interpreted not as accesses to
memory, but as accesses to a device. Some architectures define devices
to be at a fixed address, but most have some method of discovering
devices. The PCI bus walk is a good example of such a scheme. This
document does not cover how to receive such an address, but assumes you
are starting with one. Physical addresses are of type unsigned long.

This address should not be used directly. Instead, to get an address
suitable for passing to the accessor functions described below, you
should call ioremap(). An address suitable for accessing
the device will be returned to you.

After you've finished using the device (say, in your module's exit
routine), call iounmap() in order to return the address
space to the kernel. Most architectures allocate new address space each
time you call ioremap(), and they can run out unless you
call iounmap().

Accessing the device
--------------------

The part of the interface most used by drivers is reading and writing
memory-mapped registers on the device. Linux provides interfaces to read
and write 8-bit, 16-bit, 32-bit and 64-bit quantities. Due to a
historical accident, these are named byte, word, long and quad accesses.
Both read and write accesses are supported; there is no prefetch support
at this time.

The functions are named readb(), readw(), readl(), readq(),
readb_relaxed(), readw_relaxed(), readl_relaxed(), readq_relaxed(),
writeb(), writew(), writel() and writeq().

Some devices (such as framebuffers) would like to use larger transfers than
8 bytes at a time. For these devices, the memcpy_toio(),
memcpy_fromio() and memset_io() functions are
provided. Do not use memset or memcpy on IO addresses; they are not
guaranteed to copy data in order.

The read and write functions are defined to be ordered. That is, the
compiler is not permitted to reorder the I/O sequence. When the ordering
can be compiler optimised, you can use __readb() and friends to
indicate the relaxed ordering. Use this with care.

While the basic functions are defined to be synchronous with respect to
each other and ordered with respect to each other, the busses the devices
sit on may themselves have asynchronicity. In particular, many authors
are burned by the fact that PCI bus writes are posted asynchronously. A
driver author must issue a read from the same device to ensure that
writes have occurred in the specific cases the author cares about. This
kind of property cannot be hidden from driver writers in the API. In some
cases, the read used to flush the device may be expected to fail (if the
card is resetting, for example). In that case, the read should be done
from config space, which is guaranteed to soft-fail if the card doesn't
respond.
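
The effect of write posting can be modelled in plain userspace C. The sketch
below is not kernel code: ``struct posted_dev``, ``posted_writel()`` and
``posted_readl()`` are hypothetical names made up for this illustration. A
one-entry queue stands in for a bridge that posts writes, and a read from the
same device drains that queue before returning data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model of a device behind a write-posting bridge. */
struct posted_dev {
	uint32_t reg;		/* register value as the device sees it */
	uint32_t posted_val;	/* write still queued in the bridge */
	bool posted;		/* true while a write is in flight */
};

/* Deliver the queued write, if any, to the device. */
static void posted_drain(struct posted_dev *dev)
{
	if (dev->posted) {
		dev->reg = dev->posted_val;
		dev->posted = false;
	}
}

/* A posted write returns to the caller before the device has seen it. */
static void posted_writel(uint32_t val, struct posted_dev *dev)
{
	posted_drain(dev);	/* writes stay ordered with each other */
	assert(!dev->posted);	/* the queue is empty after draining */
	dev->posted_val = val;
	dev->posted = true;
}

/* A read cannot pass earlier writes to the same device, so it flushes them. */
static uint32_t posted_readl(struct posted_dev *dev)
{
	posted_drain(dev);
	return dev->reg;
}
```

In this model the register still holds its old value immediately after
``posted_writel()`` returns; only the following ``posted_readl()`` makes the
write visible to the device, which is why a driver reads back from the device
when it needs the write to have landed.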

The following is an example of flushing a write to a device when the
driver would like to ensure the write's effects are visible prior to
continuing execution::

    static inline void
    qla1280_disable_intrs(struct scsi_qla_host *ha)
    {
        struct device_reg *reg;

        reg = ha->iobase;
        /* disable risc and host interrupts */
        WRT_REG_WORD(&reg->ictrl, 0);
        /*
         * The following read will ensure that the above write
         * has been received by the device before we return from this
         * function.
         */
        RD_REG_WORD(&reg->ictrl);
        ha->flags.ints_enabled = 0;
    }

PCI ordering rules also guarantee that PIO read responses arrive after any
outstanding DMA writes from that bus, since for some devices the result of
a readb() call may signal to the driver that a DMA transaction is
complete. In many cases, however, the driver may want to indicate that the
next readb() call has no relation to any previous DMA writes
performed by the device. The driver can use readb_relaxed() for
these cases, although only some platforms will honor the relaxed
semantics. Using the relaxed read functions will provide significant
performance benefits on platforms that support it. The qla2xxx driver
provides examples of how to use readX_relaxed(). In many cases, a majority
of the driver's readX() calls can safely be converted to readX_relaxed()
calls, since only a few will indicate or depend on DMA completion.

Port Space Accesses
===================

Port Space Explained
--------------------

Another form of IO commonly supported is Port Space. This is a range of
addresses separate to the normal memory address space. Access to these
addresses is generally not as fast as accesses to the memory mapped
addresses, and it also has a potentially smaller address space.

Unlike memory mapped IO, no preparation is required to access port
space.

Accessing Port Space
--------------------

Accesses to this space are provided through a set of functions which
allow 8-bit, 16-bit and 32-bit accesses; also known as byte, word and
long. These functions are inb(), inw(), inl(), outb(), outw() and
outl().

Some variants are provided for these functions. Some devices require
that accesses to their ports are slowed down. This functionality is
provided by appending a ``_p`` to the end of the function.
There are also equivalents to memcpy. The ins() and
outs() functions copy bytes, words or longs to the given
port.

__iomem pointer tokens
======================

The data type for an MMIO address is an ``__iomem`` qualified pointer, such as
``void __iomem *reg``. On most architectures it is a regular pointer that
points to a virtual memory address and can be offset or dereferenced, but in
portable code, it must only be passed from and to functions that explicitly
operate on an ``__iomem`` token, in particular the ioremap() and
readl()/writel() functions. The 'sparse' semantic code checker can be used to
verify that this is done correctly.

While on most architectures, ioremap() creates a page table entry for an
uncached virtual address pointing to the physical MMIO address, some
architectures require special instructions for MMIO, and the ``__iomem`` pointer
just encodes the physical address or an offsettable cookie that is interpreted
by readl()/writel().
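
To illustrate why an ``__iomem`` token must only be handed to accessor
functions, the following userspace sketch models an architecture where the
token is a cookie rather than a dereferenceable pointer. Everything in it is
hypothetical (``cookie_iomem_t``, ``cookie_ioremap()``, ``cookie_readl()``,
``cookie_writel()`` and the ``COOKIE_BASE`` encoding); it does not describe how
any real architecture implements ioremap().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define COOKIE_MMIO_SIZE 64u
#define COOKIE_BASE      0x80000000u	/* made-up tag, not a usable address */

/* Opaque token: it can be offset, but dereferencing it is meaningless. */
typedef uintptr_t cookie_iomem_t;

/* Simulated device register file, reachable only through the accessors. */
static uint8_t cookie_mmio[COOKIE_MMIO_SIZE];

/* Turn a device offset (the "physical address" of this model) into a token. */
static cookie_iomem_t cookie_ioremap(uint32_t dev_offset)
{
	return (cookie_iomem_t)COOKIE_BASE + dev_offset;
}

/* Decode a token back to an offset and check that a 32-bit access fits. */
static size_t cookie_decode(cookie_iomem_t token)
{
	size_t off = (size_t)(token - COOKIE_BASE);

	assert(off % 4 == 0 && off + 4 <= COOKIE_MMIO_SIZE);
	return off;
}

/* Store a 32-bit value at the register named by the token. */
static void cookie_writel(uint32_t val, cookie_iomem_t token)
{
	memcpy(&cookie_mmio[cookie_decode(token)], &val, sizeof(val));
}

/* Load a 32-bit value from the register named by the token. */
static uint32_t cookie_readl(cookie_iomem_t token)
{
	uint32_t val;

	memcpy(&val, &cookie_mmio[cookie_decode(token)], sizeof(val));
	return val;
}
```

Code that only ever passes the token (possibly plus an offset) to the accessors
keeps working in this model, whereas code that dereferenced the token directly
would access an unrelated address.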

Differences between I/O access functions
========================================

readq(), readl(), readw(), readb(), writeq(), writel(), writew(), writeb()

  These are the most generic accessors, providing serialization against other
  MMIO accesses and DMA accesses as well as fixed endianness for accessing
  little-endian PCI devices and on-chip peripherals. Portable device drivers
  should generally use these for any access to ``__iomem`` pointers.

  Note that posted writes are not strictly ordered against a spinlock, see
  Documentation/driver-api/io_ordering.rst.

readq_relaxed(), readl_relaxed(), readw_relaxed(), readb_relaxed(),
writeq_relaxed(), writel_relaxed(), writew_relaxed(), writeb_relaxed()

  On architectures that require an expensive barrier for serializing against
  DMA, these "relaxed" versions of the MMIO accessors only serialize against
  each other, but contain a less expensive barrier operation. A device driver
  might use these in a particularly performance sensitive fast path, with a
  comment that explains why the usage in a specific location is safe without
  the extra barriers.

  See memory-barriers.txt for a more detailed discussion on the precise ordering
  guarantees of the non-relaxed and relaxed versions.

ioread64(), ioread32(), ioread16(), ioread8(),
iowrite64(), iowrite32(), iowrite16(), iowrite8()

  These are an alternative to the normal readl()/writel() functions, with almost
  identical behavior, but they can also operate on ``__iomem`` tokens returned
  for mapping PCI I/O space with pci_iomap() or ioport_map(). On architectures
  that require special instructions for I/O port access, this adds a small
  overhead for an indirect function call implemented in lib/iomap.c, while on
  other architectures, these are simply aliases.

ioread64be(), ioread32be(), ioread16be()
iowrite64be(), iowrite32be(), iowrite16be()

  These behave in the same way as the ioread32()/iowrite32() family, but with
  reversed byte order, for accessing devices with big-endian MMIO registers.
  Device drivers that can operate on either big-endian or little-endian
  registers may have to implement a custom wrapper function that picks one or
  the other depending on which device was found.

  Note: On some architectures, the normal readl()/writel() functions
  traditionally assume that devices are the same endianness as the CPU, while
  using a hardware byte-reverse on the PCI bus when running a big-endian kernel.
  Drivers that use readl()/writel() this way are generally not portable, but
  tend to be limited to a particular SoC.
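
A custom wrapper for a device that may expose either byte order can be sketched
in userspace C as follows. The names (``struct endian_dev``,
``endian_read32()``, ``endian_write32()``) and the byte-array register file are
hypothetical stand-ins; a real driver would presumably choose between
ioread32() and ioread32be() on an ``__iomem`` pointer instead of decoding bytes
by hand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device: a byte-addressed register file plus its byte order. */
struct endian_dev {
	uint8_t *regs;		/* stands in for an __iomem mapping */
	size_t size;		/* size of the register file in bytes */
	bool big_endian;	/* detected at probe time in a real driver */
};

/* Read a 32-bit register, decoding it in the device's byte order. */
static uint32_t endian_read32(const struct endian_dev *dev, size_t off)
{
	const uint8_t *p = dev->regs + off;

	assert(off % 4 == 0 && off + 4 <= dev->size);
	if (dev->big_endian)
		return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
		       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
	return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[1] << 8) | (uint32_t)p[0];
}

/* Write a 32-bit register, encoding it in the device's byte order. */
static void endian_write32(const struct endian_dev *dev, size_t off,
			   uint32_t val)
{
	uint8_t *p = dev->regs + off;
	int i;

	assert(off % 4 == 0 && off + 4 <= dev->size);
	for (i = 0; i < 4; i++) {
		/* Big-endian stores the most significant byte first. */
		int shift = dev->big_endian ? 24 - 8 * i : 8 * i;

		p[i] = (uint8_t)(val >> shift);
	}
}
```

Keeping the byte-order decision in one pair of helpers means the rest of the
driver works with CPU-native values and never needs to know which variant of
the device was found.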

hi_lo_readq(), lo_hi_readq(), hi_lo_readq_relaxed(), lo_hi_readq_relaxed(),
ioread64_lo_hi(), ioread64_hi_lo(), ioread64be_lo_hi(), ioread64be_hi_lo(),
hi_lo_writeq(), lo_hi_writeq(), hi_lo_writeq_relaxed(), lo_hi_writeq_relaxed(),
iowrite64_lo_hi(), iowrite64_hi_lo(), iowrite64be_lo_hi(), iowrite64be_hi_lo()

  Some device drivers have 64-bit registers that cannot be accessed atomically
  on 32-bit architectures but allow two consecutive 32-bit accesses instead.
  Since it depends on the particular device which of the two halves has to be
  accessed first, a helper is provided for each combination of 64-bit accessors
  with either low/high or high/low word ordering. A device driver must include
  either <linux/io-64-nonatomic-lo-hi.h> or <linux/io-64-nonatomic-hi-lo.h> to
  get the function definitions along with helpers that redirect the normal
  readq()/writeq() to them on architectures that do not provide 64-bit access
  natively.

__raw_readq(), __raw_readl(), __raw_readw(), __raw_readb(),
__raw_writeq(), __raw_writel(), __raw_writew(), __raw_writeb()

  These are low-level MMIO accessors without barriers or byteorder changes and
  architecture specific behavior. Accesses are usually atomic in the sense that
  a four-byte __raw_readl() does not get split into individual byte loads, but
  multiple consecutive accesses can be combined on the bus. In portable code, it
  is only safe to use these to access memory behind a device bus but not MMIO
  registers, as there are no ordering guarantees with regard to other MMIO
  accesses or even spinlocks. The byte order is generally the same as for normal
  memory, so unlike the other functions, these can be used to copy data between
  kernel memory and device memory.

inl(), inw(), inb(), outl(), outw(), outb()

  PCI I/O port resources traditionally require separate helpers as they are
  implemented using special instructions on the x86 architecture. On most other
  architectures, these are mapped to readl()/writel() style accessors
  internally, usually pointing to a fixed area in virtual memory. Instead of an
  ``__iomem`` pointer, the address is a 32-bit integer token to identify a port
  number. PCI requires I/O port access to be non-posted, meaning that an outb()
  must complete before the following code executes, while a normal writeb() may
  still be in progress. On architectures that correctly implement this, I/O port
  access is therefore ordered against spinlocks. Many non-x86 PCI host bridge
  implementations and CPU architectures however fail to implement non-posted I/O
  space on PCI, so they can end up being posted on such hardware.

  In some architectures, the I/O port number space has a 1:1 mapping to
  ``__iomem`` pointers, but this is not recommended and device drivers should
  not rely on that for portability. Similarly, an I/O port number as described
  in a PCI base address register may not correspond to the port number as seen
  by a device driver. Portable drivers need to read the port number for the
  resource provided by the kernel.

  There are no direct 64-bit I/O port accessors, but pci_iomap() in combination
  with ioread64/iowrite64 can be used instead.

inl_p(), inw_p(), inb_p(), outl_p(), outw_p(), outb_p()

  On ISA devices that require specific timing, the _p versions of the I/O
  accessors add a small delay. On architectures that do not have ISA buses,
  these are aliases to the normal inb/outb helpers.

readsq, readsl, readsw, readsb
writesq, writesl, writesw, writesb
ioread64_rep, ioread32_rep, ioread16_rep, ioread8_rep
iowrite64_rep, iowrite32_rep, iowrite16_rep, iowrite8_rep
insl, insw, insb, outsl, outsw, outsb

  These are helpers that access the same address multiple times, usually to copy
  data between kernel memory byte stream and a FIFO buffer. Unlike the normal
  MMIO accessors, these do not perform a byteswap on big-endian kernels, so the
  first byte in the FIFO register corresponds to the first byte in the memory
  buffer regardless of the architecture.

Device memory mapping modes
===========================

Some architectures support multiple modes for mapping device memory.
ioremap_*() variants provide a common abstraction around these
architecture-specific modes, with a shared set of semantics.

ioremap() is the most common mapping type, and is applicable to typical device
memory (e.g. I/O registers). Other modes can offer weaker or stronger
guarantees, if supported by the architecture. From most to least common, they
are as follows:

ioremap()
---------

The default mode, suitable for most memory-mapped devices, e.g. control
registers. Memory mapped using ioremap() has the following characteristics:

* Uncached - CPU-side caches are bypassed, and all reads and writes are handled
  directly by the device
* No speculative operations - the CPU may not issue a read or write to this
  memory, unless the instruction that does so has been reached in committed
  program flow.
* No reordering - The CPU may not reorder accesses to this memory mapping with
  respect to each other. On some architectures, this relies on barriers in
  readl_relaxed()/writel_relaxed().
* No repetition - The CPU may not issue multiple reads or writes for a single
  program instruction.
* No write-combining - Each I/O operation results in one discrete read or write
  being issued to the device, and multiple writes are not combined into larger
  writes. This may or may not be enforced when using __raw I/O accessors or
  pointer dereferences.
* Non-executable - The CPU is not allowed to speculate instruction execution
  from this memory (it probably goes without saying, but you're also not
  allowed to jump into device memory).

On many platforms and buses (e.g. PCI), writes issued through ioremap()
mappings are posted, which means that the CPU does not wait for the write to
actually reach the target device before retiring the write instruction.

On many platforms, I/O accesses must be aligned with respect to the access
size; failure to do so will result in an exception or unpredictable results.

ioremap_wc()
------------

Maps I/O memory as normal memory with write combining. Unlike ioremap(),

* The CPU may speculatively issue reads from the device that the program
  didn't actually execute, and may choose to basically read whatever it wants.
* The CPU may reorder operations as long as the result is consistent from the
  program's point of view.
* The CPU may write to the same location multiple times, even when the program
  issued a single write.
* The CPU may combine several writes into a single larger write.

This mode is typically used for video framebuffers, where it can increase
performance of writes. It can also be used for other blocks of memory in
devices (e.g. buffers or shared memory), but care must be taken as accesses are
not guaranteed to be ordered with respect to normal ioremap() MMIO register
accesses without explicit barriers.

On a PCI bus, it is usually safe to use ioremap_wc() on MMIO areas marked as
``IORESOURCE_PREFETCH``, but it may not be used on those without the flag.
For on-chip devices, there is no corresponding flag, but a driver can use
ioremap_wc() on a device that is known to be safe.

ioremap_wt()
------------

Maps I/O memory as normal memory with write-through caching. Like ioremap_wc(),
but also,

* The CPU may cache writes issued to and reads from the device, and serve reads
  from that cache.

This mode is sometimes used for video framebuffers, where drivers still expect
writes to reach the device in a timely manner (and not be stuck in the CPU
cache), but reads may be served from the cache for efficiency. However, it is
rarely useful these days, as framebuffer drivers usually perform writes only,
for which ioremap_wc() is more efficient (as it doesn't needlessly trash the
cache). Most drivers should not use this.

ioremap_np()
------------

Like ioremap(), but explicitly requests non-posted write semantics. On some
architectures and buses, ioremap() mappings have posted write semantics, which
means that writes can appear to "complete" from the point of view of the
CPU before the written data actually arrives at the target device. Writes are
still ordered with respect to other writes and reads from the same device, but
due to the posted write semantics, this is not the case with respect to other
devices. ioremap_np() explicitly requests non-posted semantics, which means
that the write instruction will not appear to complete until the device has
received (and to some platform-specific extent acknowledged) the written data.

This mapping mode primarily exists to cater for platforms with bus fabrics that
require this particular mapping mode to work correctly. These platforms set the
``IORESOURCE_MEM_NONPOSTED`` flag for a resource that requires ioremap_np()
semantics and portable drivers should use an abstraction that automatically
selects it where appropriate (see the `Higher-level ioremap abstractions`_
section below).

The bare ioremap_np() is only available on some architectures; on others, it
always returns NULL. Drivers should not normally use it, unless they are
platform-specific or they derive benefit from non-posted writes where
supported, and can fall back to ioremap() otherwise. The normal approach to
ensure posted write completion is to do a dummy read after a write as
explained in `Accessing the device`_, which works with ioremap() on all
platforms.

ioremap_np() should never be used for PCI drivers. PCI memory space writes are
always posted, even on architectures that otherwise implement ioremap_np().
Using ioremap_np() for PCI BARs will at best result in posted write semantics,
and at worst result in complete breakage.

Note that non-posted write semantics are orthogonal to CPU-side ordering
guarantees. A CPU may still choose to issue other reads or writes before a
non-posted write instruction retires. See the previous section on MMIO access
functions for details on the CPU side of things.

ioremap_uc()
------------

ioremap_uc() behaves like ioremap() except that on the x86 architecture without
'PAT' mode, it marks memory as uncached even when the MTRR has designated
it as cacheable, see Documentation/arch/x86/pat.rst.

Portable drivers should avoid the use of ioremap_uc().

ioremap_cache()
---------------

ioremap_cache() effectively maps I/O memory as normal RAM. CPU write-back
caches can be used, and the CPU is free to treat the device as if it were a
block of RAM. This should never be used for device memory which has side
effects of any kind, or which does not return the data previously written on
read.

It should also not be used for actual RAM, as the returned pointer is an
``__iomem`` token. memremap() can be used for mapping normal RAM that is outside
of the linear kernel memory area to a regular pointer.

Portable drivers should avoid the use of ioremap_cache().
43108e9b5beSHector Martin 43208e9b5beSHector MartinArchitecture example 43308e9b5beSHector Martin-------------------- 43408e9b5beSHector Martin 43508e9b5beSHector MartinHere is how the above modes map to memory attribute settings on the ARM64 43608e9b5beSHector Martinarchitecture: 43708e9b5beSHector Martin 43808e9b5beSHector Martin+------------------------+--------------------------------------------+ 43908e9b5beSHector Martin| API | Memory region type and cacheability | 44008e9b5beSHector Martin+------------------------+--------------------------------------------+ 44108e9b5beSHector Martin| ioremap_np() | Device-nGnRnE | 44208e9b5beSHector Martin+------------------------+--------------------------------------------+ 44308e9b5beSHector Martin| ioremap() | Device-nGnRE | 44408e9b5beSHector Martin+------------------------+--------------------------------------------+ 44508e9b5beSHector Martin| ioremap_uc() | (not implemented) | 44608e9b5beSHector Martin+------------------------+--------------------------------------------+ 44708e9b5beSHector Martin| ioremap_wc() | Normal-Non Cacheable | 44808e9b5beSHector Martin+------------------------+--------------------------------------------+ 44908e9b5beSHector Martin| ioremap_wt() | (not implemented; fallback to ioremap) | 45008e9b5beSHector Martin+------------------------+--------------------------------------------+ 45108e9b5beSHector Martin| ioremap_cache() | Normal-Write-Back Cacheable | 45208e9b5beSHector Martin+------------------------+--------------------------------------------+ 45308e9b5beSHector Martin 45408e9b5beSHector MartinHigher-level ioremap abstractions 45508e9b5beSHector Martin================================= 45608e9b5beSHector Martin 45708e9b5beSHector MartinInstead of using the above raw ioremap() modes, drivers are encouraged to use 45808e9b5beSHector Martinhigher-level APIs. 
These APIs may implement platform-specific logic to
automatically choose an appropriate ioremap mode on any given bus, allowing for
a platform-agnostic driver to work on those platforms without any special
cases. At the time of this writing, the following ioremap() wrappers have such
logic:

devm_ioremap_resource()

  Can automatically select ioremap_np() over ioremap() according to platform
  requirements, if the ``IORESOURCE_MEM_NONPOSTED`` flag is set on the struct
  resource. Uses devres to automatically unmap the resource when the driver
  probe() function fails or a device is unbound from its driver.

  Documented in Documentation/driver-api/driver-model/devres.rst.

of_address_to_resource()

  Automatically sets the ``IORESOURCE_MEM_NONPOSTED`` flag for platforms that
  require non-posted writes for certain buses (see the nonposted-mmio and
  posted-mmio device tree properties).

of_iomap()

  Maps the resource described in a ``reg`` property in the device tree, doing
  all required translations. Automatically selects ioremap_np() according to
  platform requirements, as above.

pci_ioremap_bar(), pci_ioremap_wc_bar()

  Maps the resource described in a PCI base address without having to extract
  the physical address first.

pci_iomap(), pci_iomap_wc()

  Like pci_ioremap_bar()/pci_ioremap_wc_bar(), but also works on I/O space when
  used together with ioread32()/iowrite32() and similar accessors.

pcim_iomap()

  Like pci_iomap(), but uses devres to automatically unmap the resource when
  the driver probe() function fails or a device is unbound from its driver.

  Documented in Documentation/driver-api/driver-model/devres.rst.

Not using these wrappers may make drivers unusable on certain platforms with
stricter rules for mapping I/O memory.

Generalizing Access to System and I/O Memory
============================================

.. kernel-doc:: include/linux/iosys-map.h
   :doc: overview

.. kernel-doc:: include/linux/iosys-map.h
   :internal:

Public Functions Provided
=========================

.. kernel-doc:: arch/x86/include/asm/io.h
   :internal:

.. kernel-doc:: lib/pci_iomap.c
   :export: