.. SPDX-License-Identifier: GPL-2.0

==========================
PAT (Page Attribute Table)
==========================

The x86 Page Attribute Table (PAT) allows the memory attribute to be set at
page-level granularity. PAT is complementary to the MTRR settings, which allow
memory types to be set over physical address ranges. However, PAT is more
flexible than MTRR because it can set attributes at the page level and because
the hardware places no limit on the number of such attribute settings. This
added flexibility comes with a guideline: the same physical memory must not be
mapped with different memory types through multiple virtual addresses (memory
type aliasing).

PAT supports several memory attribute types. The most commonly used ones,
which are supported at this time, are:

===  ==============
WB   Write-back
UC   Uncached
WC   Write-combined
WT   Write-through
UC-  Uncached Minus
===  ==============

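For orientation, the types above correspond to 3-bit encodings stored in the
eight one-byte entries of the IA32_PAT MSR. The sketch below decodes one entry
of a raw MSR value. The encodings follow the Intel SDM; the names ``pat_type``
and ``pat_entry`` are hypothetical helpers for illustration, not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Memory type encodings used in each IA32_PAT entry (per the Intel SDM). */
enum pat_type {
    PAT_UC       = 0,
    PAT_WC       = 1,
    PAT_WT       = 4,
    PAT_WP       = 5,
    PAT_WB       = 6,
    PAT_UC_MINUS = 7,
    PAT_RESERVED = 8    /* hypothetical marker for encodings 2 and 3 */
};

/* Return the memory type held in entry `index` (0..7) of a raw MSR value. */
enum pat_type pat_entry(uint64_t msr, unsigned index)
{
    assert(index < 8);

    /* Each entry occupies one byte; only the low 3 bits are defined. */
    unsigned enc = (unsigned)(msr >> (index * 8)) & 0x7;

    return (enc == 2 || enc == 3) ? PAT_RESERVED : (enum pat_type)enc;
}
```

With the architectural power-on value ``0x0007040600070406``, entries 0 to 3
decode to WB, WT, UC- and UC. No entry selects WC in that layout.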

PAT APIs
========

There are many different APIs in the kernel that allow setting of memory
attributes at the page level. In order to avoid aliasing, these interfaces
should be used thoughtfully. Below is a table of the interfaces available,
their intended usage and their memory attribute relationships. Internally,
these APIs use a reserve_memtype()/free_memtype() interface on the physical
address range to avoid any aliasing.

+------------------------+----------+--------------+------------------+
| API                    |    RAM   |  ACPI,...    |  Reserved/Holes  |
+------------------------+----------+--------------+------------------+
| ioremap                |    --    |    UC-       |       UC-        |
+------------------------+----------+--------------+------------------+
| ioremap_cache          |    --    |    WB        |       WB         |
+------------------------+----------+--------------+------------------+
| ioremap_uc             |    --    |    UC        |       UC         |
+------------------------+----------+--------------+------------------+
| ioremap_wc             |    --    |    --        |       WC         |
+------------------------+----------+--------------+------------------+
| ioremap_wt             |    --    |    --        |       WT         |
+------------------------+----------+--------------+------------------+
| set_memory_uc,         |    UC-   |    --        |       --         |
| set_memory_wb          |          |              |                  |
+------------------------+----------+--------------+------------------+
| set_memory_wc,         |    WC    |    --        |       --         |
| set_memory_wb          |          |              |                  |
+------------------------+----------+--------------+------------------+
| set_memory_wt,         |    WT    |    --        |       --         |
| set_memory_wb          |          |              |                  |
+------------------------+----------+--------------+------------------+
| pci sysfs resource     |    --    |    --        |       UC-        |
+------------------------+----------+--------------+------------------+
| pci sysfs resource_wc  |    --    |    --        |       WC         |
| is IORESOURCE_PREFETCH |          |              |                  |
+------------------------+----------+--------------+------------------+
| pci proc               |    --    |    --        |       UC-        |
| !PCIIOC_WRITE_COMBINE  |          |              |                  |
+------------------------+----------+--------------+------------------+
| pci proc               |    --    |    --        |       WC         |
| PCIIOC_WRITE_COMBINE   |          |              |                  |
+------------------------+----------+--------------+------------------+
| /dev/mem               |    --    |   WB/WC/UC-  |    WB/WC/UC-     |
| read-write             |          |              |                  |
+------------------------+----------+--------------+------------------+
| /dev/mem               |    --    |    UC-       |       UC-        |
| mmap SYNC flag         |          |              |                  |
+------------------------+----------+--------------+------------------+
| /dev/mem               |    --    |   WB/WC/UC-  |  WB/WC/UC-       |
| mmap !SYNC flag        |          |              |                  |
| and                    |          |(from existing|  (from existing  |
| any alias to this area |          |alias)        |  alias)          |
+------------------------+----------+--------------+------------------+
| /dev/mem               |    --    |    WB        |       WB         |
| mmap !SYNC flag        |          |              |                  |
| no alias to this area  |          |              |                  |
| and                    |          |              |                  |
| MTRR says WB           |          |              |                  |
+------------------------+----------+--------------+------------------+
| /dev/mem               |    --    |    --        |       UC-        |
| mmap !SYNC flag        |          |              |                  |
| no alias to this area  |          |              |                  |
| and                    |          |              |                  |
| MTRR says !WB          |          |              |                  |
+------------------------+----------+--------------+------------------+


Advanced APIs for drivers
=========================

A. Exporting pages to users with remap_pfn_range, io_remap_pfn_range,
vmf_insert_pfn.

Drivers that want to export pages to userspace do so through the mmap
interface, using a combination of:

  1) pgprot_noncached()
  2) io_remap_pfn_range() or remap_pfn_range() or vmf_insert_pfn()

PAT support adds a new API, pgprot_writecombine(). Drivers can continue to
use the above sequence, with either pgprot_noncached() or
pgprot_writecombine() in step 1, followed by step 2.

In addition, step 2 internally tracks the region as UC or WC in the memtype
list to ensure that there are no conflicting mappings.

Note that this set of APIs only works with IO (non-RAM) regions. If a driver
wants to export a RAM region, it has to call set_memory_uc() or
set_memory_wc() as step 0 above, track the usage of those pages, and call
set_memory_wb() before the pages are returned to the free pool.

MTRR effects on PAT / non-PAT systems
=====================================

The following table provides the effects of using write-combining MTRRs when
using ioremap*() calls on x86 for both non-PAT and PAT systems. Ideally,
mtrr_add() usage will be phased out in favor of arch_phys_wc_add(), which is
a no-op on PAT-enabled systems. The region over which an arch_phys_wc_add()
call is made should already have been mapped with WC attributes or PAT
entries; this can be done with ioremap_wc() / set_memory_wc(). Devices that
combine areas of IO memory that must remain uncacheable with areas where
write-combining is desirable should consider using ioremap_uc() followed by
set_memory_wc() to white-list the effective write-combined areas. Such use is
nevertheless discouraged because the effective memory type is considered
implementation defined, yet this strategy can be used as a last resort on
devices with size-constrained regions where MTRR write-combining would
otherwise not be effective.
::

  ====  =======  ===  =========================  =====================
  MTRR  Non-PAT  PAT  Linux ioremap value        Effective memory type
  ====  =======  ===  =========================  =====================
        PAT                                        Non-PAT |  PAT
        |PCD                                               |
        ||PWT                                              |
        |||                                                |
  WC    000      WB   _PAGE_CACHE_MODE_WB             WC   |   WC
  WC    001      WC   _PAGE_CACHE_MODE_WC             WC*  |   WC
  WC    010      UC-  _PAGE_CACHE_MODE_UC_MINUS       WC*  |   UC
  WC    011      UC   _PAGE_CACHE_MODE_UC             UC   |   UC
  ====  =======  ===  =========================  =====================

  (*) denotes implementation defined and is discouraged

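The lookup in the table above can be sketched as follows: the PAT, PCD and PWT
page-table bits form a 3-bit index, and the index selects a cache mode. This
is only an illustration of the four rows listed; the enum and function names
are hypothetical, and the effective types are copied from the PAT column of
the table rather than derived from hardware rules.

```c
#include <assert.h>

enum cache_mode { MODE_WB, MODE_WC, MODE_UC_MINUS, MODE_UC };

/* Combine the PAT, PCD and PWT page-table bits into a PAT entry index. */
unsigned pat_index(unsigned pat, unsigned pcd, unsigned pwt)
{
    return ((pat & 1u) << 2) | ((pcd & 1u) << 1) | (pwt & 1u);
}

/* Cache mode selected by entries 0..3, in the order listed in the table. */
enum cache_mode linux_cache_mode(unsigned index)
{
    static const enum cache_mode modes[4] = {
        MODE_WB, MODE_WC, MODE_UC_MINUS, MODE_UC
    };

    assert(index < 4);
    return modes[index];
}

/*
 * Effective type on a PAT system when a WC MTRR covers the region,
 * as given in the table: WB and WC yield WC, UC- and UC yield UC.
 */
enum cache_mode effective_under_wc_mtrr_pat(enum cache_mode mode)
{
    return (mode == MODE_WB || mode == MODE_WC) ? MODE_WC : MODE_UC;
}
```

For example, PAT=0, PCD=1, PWT=0 gives index 2, which selects UC- and has an
effective type of UC on a PAT system.
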
.. note:: -- in the API table in the "PAT APIs" section means "Not suggested
  usage for the API". Some of the --'s are strictly enforced by the kernel.
  Others are not enforced today, but may be enforced in the future.

For ioremap and PCI access through /sys or /proc, the actual type returned
can be more restrictive if there is any existing aliasing for that address.
For example, if there is an existing uncached mapping, a new ioremap_wc can
return an uncached mapping in place of the write-combined mapping requested.

set_memory_[uc|wc|wt] and set_memory_wb should be used in pairs: the driver
first makes a region uc, wc or wt and switches it back to wb after use.

Over time, writes to /proc/mtrr will be deprecated in favor of PAT-based
interfaces. Users writing to /proc/mtrr are encouraged to use the interfaces
above instead.

Drivers should use ioremap_[uc|wc] to access PCI BARs with [uc|wc] access
types.

Drivers should use set_memory_[uc|wc|wt] to set access type for RAM ranges.


PAT debugging
=============

With CONFIG_DEBUG_FS enabled, the PAT memtype list can be examined with::

  # mount -t debugfs debugfs /sys/kernel/debug
  # cat /sys/kernel/debug/x86/pat_memtype_list
  PAT memtype list:
  uncached-minus @ 0x7fadf000-0x7fae0000
  uncached-minus @ 0x7fb19000-0x7fb1a000
  uncached-minus @ 0x7fb1a000-0x7fb1b000
  uncached-minus @ 0x7fb1b000-0x7fb1c000
  uncached-minus @ 0x7fb1c000-0x7fb1d000
  uncached-minus @ 0x7fb1d000-0x7fb1e000
  uncached-minus @ 0x7fb1e000-0x7fb25000
  uncached-minus @ 0x7fb25000-0x7fb26000
  uncached-minus @ 0x7fb26000-0x7fb27000
  uncached-minus @ 0x7fb27000-0x7fb28000
  uncached-minus @ 0x7fb28000-0x7fb2e000
  uncached-minus @ 0x7fb2e000-0x7fb2f000
  uncached-minus @ 0x7fb2f000-0x7fb30000
  uncached-minus @ 0x7fb31000-0x7fb32000
  uncached-minus @ 0x80000000-0x90000000

This list shows the physical address ranges and the PAT settings used to
access them.

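When inspecting a long listing, it can help to parse the entries and compute
the size of each range. The following is a minimal userspace sketch that
assumes every entry has the ``<type> @ 0x<start>-0x<end>`` shape shown above;
the struct and function names are hypothetical, and the end address is treated
as exclusive because adjacent ranges in the sample listing share a boundary.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

struct memtype_entry {
    char     type[32];  /* e.g. "uncached-minus" */
    uint64_t start;     /* first physical address of the range */
    uint64_t end;       /* assumed exclusive end of the range */
};

/*
 * Parse one line of pat_memtype_list output.
 * Returns 0 on success, or -1 if the line is not a range entry
 * (for example the "PAT memtype list:" header).
 */
int parse_memtype_line(const char *line, struct memtype_entry *entry)
{
    int fields = sscanf(line, " %31s @ 0x%" SCNx64 "-0x%" SCNx64,
                        entry->type, &entry->start, &entry->end);

    return (fields == 3 && entry->end > entry->start) ? 0 : -1;
}
```

For the first entry of the listing, ``end - start`` is ``0x1000``, that is, a
single 4 KiB page.
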
Another, more verbose way of getting PAT-related debug messages is the
"debugpat" boot parameter. With this parameter, various debug messages are
printed to the dmesg log.

PAT Initialization
==================

The following table describes how PAT is initialized under various
configurations. The PAT MSR must be updated by Linux in order to support the
WC and WT attributes. Otherwise, the PAT MSR keeps the value programmed into
it by the firmware. Note that Xen enables the WC attribute in the PAT MSR for
guests.

 ==== ===== ==========================  =========  =======
 MTRR PAT   Call Sequence               PAT State  PAT MSR
 ==== ===== ==========================  =========  =======
 E    E     MTRR -> PAT init            Enabled    OS
 E    D     MTRR -> PAT init            Disabled    -
 D    E     MTRR -> PAT disable         Disabled   BIOS
 D    D     MTRR -> PAT disable         Disabled    -
 -    np/E  PAT  -> PAT disable         Disabled   BIOS
 -    np/D  PAT  -> PAT disable         Disabled    -
 E    !P/E  MTRR -> PAT init            Disabled   BIOS
 D    !P/E  MTRR -> PAT disable         Disabled   BIOS
 !M   !P/E  MTRR stub -> PAT disable    Disabled   BIOS
 ==== ===== ==========================  =========  =======

  Legend

 ========= =======================================
 E         Feature enabled in CPU
 D         Feature disabled/unsupported in CPU
 np        "nopat" boot option specified
 !P        CONFIG_X86_PAT option unset
 !M        CONFIG_MTRR option unset
 Enabled   PAT state set to enabled
 Disabled  PAT state set to disabled
 OS        PAT initializes PAT MSR with OS setting
 BIOS      PAT keeps PAT MSR with BIOS setting
 ========= =======================================

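One way to read the initialization table is as a small decision function. The
sketch below is a model of the listed rows only, not the kernel's
implementation: all names are hypothetical, a "-" in the PAT MSR column is
interpreted as "no PAT MSR setting applies", and combinations that the table
does not list (such as CONFIG_MTRR unset with CONFIG_X86_PAT set) are an
assumption of the model.

```c
#include <stdbool.h>

enum pat_msr_setting { MSR_NONE, MSR_BIOS, MSR_OS };

struct pat_config {
    bool config_mtrr;     /* CONFIG_MTRR set (false corresponds to !M) */
    bool config_x86_pat;  /* CONFIG_X86_PAT set (false corresponds to !P) */
    bool nopat;           /* "nopat" boot option specified (np) */
    bool cpu_mtrr;        /* MTRR feature enabled in CPU (E) */
    bool cpu_pat;         /* PAT feature enabled in CPU (E) */
};

/* PAT state column: only the first row of the table ends up Enabled. */
bool pat_state_enabled(const struct pat_config *cfg)
{
    return cfg->config_mtrr && cfg->config_x86_pat && !cfg->nopat &&
           cfg->cpu_mtrr && cfg->cpu_pat;
}

/* PAT MSR column: OS when enabled, "-" without CPU support, else BIOS. */
enum pat_msr_setting pat_msr_setting(const struct pat_config *cfg)
{
    if (!cfg->cpu_pat)
        return MSR_NONE;
    return pat_state_enabled(cfg) ? MSR_OS : MSR_BIOS;
}
```

For instance, a CPU with both features enabled but booted with "nopat" maps to
the "np/E" row: the state is Disabled and the MSR keeps the BIOS setting.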