==============================
remap_file_pages() system call
==============================

The remap_file_pages() system call is used to create a nonlinear mapping,
that is, a mapping in which the pages of the file are mapped into a
nonsequential order in memory. The advantage of using remap_file_pages()
over using repeated calls to mmap(2) is that the former approach does not
require the kernel to create additional VMA (Virtual Memory Area) data
structures.
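
As an illustration, the call can be used roughly as follows. This is a
minimal sketch, not a reference implementation: map_reversed() is a
hypothetical helper name, and the file descriptor is assumed to be open
for reading and writing and to cover at least npages pages.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map the first npages pages of fd as one shared mapping, then rearrange
 * it so that the pages appear in reverse file order.
 * Returns the mapping, or MAP_FAILED on error.
 *
 * Assumption: fd is open read-write; with a read-only descriptor the
 * remap_file_pages() calls may fail.
 */
static void *map_reversed(int fd, size_t npages)
{
	long psz = sysconf(_SC_PAGESIZE);
	size_t page, len, i;
	char *base;

	if (psz <= 0 || npages == 0)
		return MAP_FAILED;
	page = (size_t)psz;
	len = npages * page;

	/* remap_file_pages() operates on an existing MAP_SHARED mapping. */
	base = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	if (base == MAP_FAILED)
		return MAP_FAILED;

	for (i = 0; i < npages; i++) {
		/* prot must be 0; pgoff is counted in pages, not bytes. */
		if (remap_file_pages(base + i * page, page, 0,
				     npages - 1 - i, 0) != 0) {
			munmap(base, len);
			return MAP_FAILED;
		}
	}
	return base;
}
```

If the call succeeds, page i of the returned mapping shows file page
npages - 1 - i. Callers should be prepared for MAP_FAILED, for example on
systems where the syscall is not available.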

Supporting nonlinear mappings requires a significant amount of non-trivial
code in the kernel's virtual memory subsystem, including in hot paths.
Also, to make nonlinear mappings work, the kernel needs a way to
distinguish normal page table entries from entries that hold a file offset
(pte_file). The kernel reserves a flag in the PTE for this purpose. PTE
flags are a scarce resource, especially on some CPU architectures. It would
be nice to free up the flag for other uses.

Fortunately, there are not many users of remap_file_pages() in the wild.
Only one enterprise RDBMS implementation is known to use the syscall, on
32-bit systems, to map files bigger than can linearly fit into the 32-bit
virtual address space. This use case is no longer critical since 64-bit
systems are widely available.

The syscall is deprecated and has been replaced with an emulation. The
emulation creates new VMAs instead of nonlinear mappings. It is going to
be slower for the rare users of remap_file_pages(), but the ABI is
preserved.
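
From user space, the effect of the emulation can be approximated with
mmap(2) and MAP_FIXED, which places a new mapping (and so a new VMA) over
part of an existing one. The sketch below is an assumption about an
equivalent user-space operation, not the kernel code itself;
remap_with_mmap() is a hypothetical name, and unlike the syscall it needs
the file descriptor and protection to be passed in explicitly.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical user-space stand-in for remap_file_pages(): map `size`
 * bytes of fd, starting at page offset pgoff, on top of the existing
 * mapping at addr. Returns 0 on success, -1 on error.
 */
static int remap_with_mmap(void *addr, size_t size, int prot, int fd,
			   size_t pgoff)
{
	long psz = sysconf(_SC_PAGESIZE);
	void *p;

	if (psz <= 0)
		return -1;

	/* MAP_FIXED replaces whatever is currently mapped at addr. */
	p = mmap(addr, size, prot, MAP_SHARED | MAP_FIXED, fd,
		 (off_t)pgoff * (off_t)psz);
	return p == MAP_FAILED ? -1 : 0;
}
```

Each such call typically leaves one more VMA behind, which is why this
approach costs more than a true nonlinear mapping did.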

One side effect of the emulation (apart from performance) is that a user
can hit the vm.max_map_count limit more easily because of the additional
VMAs. See the comment for DEFAULT_MAX_MAP_COUNT for more details on the
limit.
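
A process can get a rough idea of how close it is to the limit by
comparing the number of lines in /proc/self/maps with the value in
/proc/sys/vm/max_map_count. The helpers below are a minimal sketch with
hypothetical names; the line count is only an approximation of the
kernel's own per-process mapping count.

```c
#include <stdio.h>

/*
 * Count the lines in /proc/self/maps, roughly one per VMA.
 * Returns -1 if the file cannot be opened.
 */
static long count_mappings(void)
{
	FILE *f = fopen("/proc/self/maps", "r");
	long n = 0;
	int c;

	if (!f)
		return -1;
	while ((c = fgetc(f)) != EOF)
		if (c == '\n')
			n++;
	fclose(f);
	return n;
}

/*
 * Read the vm.max_map_count sysctl through procfs.
 * Returns -1 if the value cannot be read.
 */
static long read_max_map_count(void)
{
	FILE *f = fopen("/proc/sys/vm/max_map_count", "r");
	long v = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%ld", &v) != 1)
		v = -1;
	fclose(f);
	return v;
}
```

A heavy user of remap_file_pages() could call both before and after
setting up its mappings to see how many VMAs the emulation added.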