History log of /openbmc/linux/drivers/iommu/iommufd/io_pagetable.c (Results 1 – 11 of 11)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v6.6.25, v6.6.24, v6.6.23
# f1fb745e 22-Feb-2024 Nicolin Chen <nicolinc@nvidia.com>

iommufd: Fix iopt_access_list_id overwrite bug

commit aeb004c0cd6958e910123a1607634401009c9539 upstream.

Syzkaller reported the following WARN_ON:
WARNING: CPU: 1 PID: 4738 at drivers/iommu/iommu

iommufd: Fix iopt_access_list_id overwrite bug

commit aeb004c0cd6958e910123a1607634401009c9539 upstream.

Syzkaller reported the following WARN_ON:
WARNING: CPU: 1 PID: 4738 at drivers/iommu/iommufd/io_pagetable.c:1360

Call Trace:
iommufd_access_change_ioas+0x2fe/0x4e0
iommufd_access_destroy_object+0x50/0xb0
iommufd_object_remove+0x2a3/0x490
iommufd_object_destroy_user
iommufd_access_destroy+0x71/0xb0
iommufd_test_staccess_release+0x89/0xd0
__fput+0x272/0xb50
__fput_sync+0x4b/0x60
__do_sys_close
__se_sys_close
__x64_sys_close+0x8b/0x110
do_syscall_x64

The mismatch between the access pointer in the list and the passed-in
pointer is resulting from an overwrite of access->iopt_access_list_id, in
iopt_add_access(). Called from iommufd_access_change_ioas() when
xa_alloc() succeeds but iopt_calculate_iova_alignment() fails.

Add a new_id in iopt_add_access() and only update iopt_access_list_id when
returning successfully.

Cc: stable@vger.kernel.org
Fixes: 9227da7816dd ("iommufd: Add iommufd_access_change_ioas(_id) helpers")
Link: https://lore.kernel.org/r/2dda7acb25b8562ec5f1310de828ef5da9ef509c.1708636627.git.nicolinc@nvidia.com
Reported-by: Jason Gunthorpe <jgg@nvidia.com>
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

show more ...


Revision tags: v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6
# fcb32111 27-Oct-2023 Koichiro Den <den@valinux.co.jp>

iommufd: Fix missing update of domains_itree after splitting iopt_area

commit e7250ab7ca4998fe026f2149805b03e09dc32498 upstream.

In iopt_area_split(), if the original iopt_area has filled a domain

iommufd: Fix missing update of domains_itree after splitting iopt_area

commit e7250ab7ca4998fe026f2149805b03e09dc32498 upstream.

In iopt_area_split(), if the original iopt_area has filled a domain and is
linked to domains_itree, pages_nodes have to be properly
reinserted. Otherwise the domains_itree becomes corrupted and we will UAF.

Fixes: 51fe6141f0f6 ("iommufd: Data structure to provide IOVA to PFN mapping")
Link: https://lore.kernel.org/r/20231027162941.2864615-2-den@valinux.co.jp
Cc: stable@vger.kernel.org
Signed-off-by: Koichiro Den <den@valinux.co.jp>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

show more ...


# 1baf6c5f 30-Oct-2023 Jason Gunthorpe <jgg@nvidia.com>

iommufd: Add iopt_area_alloc()

[ Upstream commit 361d744ddd61de065fbeb042aaed590d32dd92ec ]

We never initialize the two interval tree nodes, and zero fill is not the
same as RB_CLEAR_NODE. This can

iommufd: Add iopt_area_alloc()

[ Upstream commit 361d744ddd61de065fbeb042aaed590d32dd92ec ]

We never initialize the two interval tree nodes, and zero fill is not the
same as RB_CLEAR_NODE. This can hide issues where we missed adding the
area to the trees. Factor out the allocation and clear the two nodes.

Fixes: 51fe6141f0f6 ("iommufd: Data structure to provide IOVA to PFN mapping")
Link: https://lore.kernel.org/r/20231030145035.GG691768@ziepe.ca
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

show more ...


Revision tags: v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45, v6.1.44, v6.1.43
# 5d5c85ff 28-Jul-2023 Nicolin Chen <nicolinc@nvidia.com>

iommufd: Allow passing in iopt_access_list_id to iopt_remove_access()

This is a preparatory change for ioas replacement support for accesses.
The replacement routine does an iopt_add_access() for a

iommufd: Allow passing in iopt_access_list_id to iopt_remove_access()

This is a preparatory change for ioas replacement support for accesses.
The replacement routine does an iopt_add_access() for a new IOAS first and
then iopt_remove_access() for the old IOAS upon the success of the first
call. However, the first call overrides the iopt_access_list_id in the
access struct, resulting in iopt_remove_access() being unable to work on
the old IOAS.

Add an iopt_access_list_id as a parameter to iopt_remove_access, so the
replacement routine can save the id before it gets overwritten. Pass the
id in iopt_remove_access() for a proper cleanup.

The existing callers should just pass in access->iopt_access_list_id.

Link: https://lore.kernel.org/r/7bb939b9e0102da0c099572bb3de78ab7622221e.1690523699.git.nicolinc@nvidia.com
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

show more ...


Revision tags: v6.1.42, v6.1.41, v6.1.40, v6.1.39
# 6583c865 17-Jul-2023 Jason Gunthorpe <jgg@nvidia.com>

iommufd/selftest: Add a selftest for IOMMU_HWPT_ALLOC

Test the basic flow.

Link: https://lore.kernel.org/r/19-v8-6659224517ea+532-iommufd_alloc_jgg@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@in

iommufd/selftest: Add a selftest for IOMMU_HWPT_ALLOC

Test the basic flow.

Link: https://lore.kernel.org/r/19-v8-6659224517ea+532-iommufd_alloc_jgg@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

show more ...


# 34f327a9 17-Jul-2023 Jason Gunthorpe <jgg@nvidia.com>

iommufd: Keep track of each device's reserved regions instead of groups

The driver facing API in the iommu core makes the reserved regions
per-device. An algorithm in the core code consolidates the

iommufd: Keep track of each device's reserved regions instead of groups

The driver facing API in the iommu core makes the reserved regions
per-device. An algorithm in the core code consolidates the regions of all
the devices in a group to return the group view.

To allow for devices to be hotplugged into the group iommufd would re-load
the entire group's reserved regions for each device, just in case they
changed.

Further iommufd already has to deal with duplicated/overlapping reserved
regions as it must union all the groups together.

Thus simplify all of this to just use the device reserved regions
interface directly from the iommu driver.

Link: https://lore.kernel.org/r/5-v8-6659224517ea+532-iommufd_alloc_jgg@nvidia.com
Suggested-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

show more ...


Revision tags: v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35
# 804ca14d 20-Jun-2023 Jason Gunthorpe <jgg@nvidia.com>

iommufd: Do not access the area pointer after unlocking

A concurrent unmap can trigger freeing of the area pointers while we are
generating an unmapping notification for accesses.

syzkaller reports

iommufd: Do not access the area pointer after unlocking

A concurrent unmap can trigger freeing of the area pointers while we are
generating an unmapping notification for accesses.

syzkaller reports:

BUG: KASAN: slab-use-after-free in iopt_unmap_iova_range+0x5ba/0x5f0
Read of size 4 at addr ffff888075996184 by task syz-executor.2/31160

CPU: 1 PID: 31160 Comm: syz-executor.2 Not tainted 6.4.0-rc5-syzkaller-00313-g4c605260bc60 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/25/2023
Call Trace:
<TASK>
dump_stack_lvl+0xd9/0x150
print_address_description.constprop.0+0x2c/0x3c0
kasan_report+0x11c/0x130
iopt_unmap_iova_range+0x5ba/0x5f0
iopt_unmap_all+0x27/0x50
iommufd_ioas_unmap+0x3d0/0x490
iommufd_fops_ioctl+0x317/0x4b0
__x64_sys_ioctl+0x197/0x210
do_syscall_64+0x39/0xb0
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f0812c8c169
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f0813914168 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f0812dabf80 RCX: 00007f0812c8c169
RDX: 0000000020000100 RSI: 0000000000003b86 RDI: 0000000000000005
RBP: 00007f0812ce7ca1 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f0812ecfb1f R14: 00007f0813914300 R15: 0000000000022000
</TASK>

Allocated by task 31160:
kasan_save_stack+0x22/0x40
kasan_set_track+0x25/0x30
__kasan_kmalloc+0xa2/0xb0
iopt_alloc_area_pages+0x94/0x560
iopt_map_user_pages+0x205/0x4e0
iommufd_ioas_map+0x329/0x5f0
iommufd_fops_ioctl+0x317/0x4b0
__x64_sys_ioctl+0x197/0x210
do_syscall_64+0x39/0xb0
entry_SYSCALL_64_after_hwframe+0x63/0xcd

Freed by task 31161:
kasan_save_stack+0x22/0x40
kasan_set_track+0x25/0x30
kasan_save_free_info+0x2e/0x40
____kasan_slab_free+0x160/0x1c0
slab_free_freelist_hook+0x8b/0x1c0
__kmem_cache_free+0xaf/0x2d0
iopt_unmap_iova_range+0x288/0x5f0
iopt_unmap_all+0x27/0x50
iommufd_ioas_unmap+0x3d0/0x490
iommufd_fops_ioctl+0x317/0x4b0
__x64_sys_ioctl+0x197/0x210
do_syscall_64+0x39/0xb0
entry_SYSCALL_64_after_hwframe+0x63/0xcd

The buggy address belongs to the object at ffff888075996100
which belongs to the cache kmalloc-cg-192 of size 192
The buggy address is located 132 bytes inside of
freed 192-byte region [ffff888075996100, ffff8880759961c0)

The buggy address belongs to the physical page:
page:ffffea0001d66580 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x75996
memcg:ffff88801f1c2701
flags: 0xfff00000000200(slab|node=0|zone=1|lastcpupid=0x7ff)
page_type: 0xffffffff()
raw: 00fff00000000200 ffff88801244ddc0 dead000000000122 0000000000000000
raw: 0000000000000000 0000000080100010 00000001ffffffff ffff88801f1c2701
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 0, migratetype Unmovable, gfp_mask 0x112cc0(GFP_USER|__GFP_NOWARN|__GFP_NORETRY), pid 31157, tgid 31154 (syz-executor.0), ts 1984547323469, free_ts 1983933451331
post_alloc_hook+0x2db/0x350
get_page_from_freelist+0xf41/0x2c00
__alloc_pages+0x1cb/0x4a0
alloc_pages+0x1aa/0x270
allocate_slab+0x25f/0x390
___slab_alloc+0xa91/0x1400
__slab_alloc.constprop.0+0x56/0xa0
__kmem_cache_alloc_node+0x136/0x320
kmalloc_trace+0x26/0xe0
iommufd_test+0x1328/0x2c20
iommufd_fops_ioctl+0x317/0x4b0
__x64_sys_ioctl+0x197/0x210
do_syscall_64+0x39/0xb0
entry_SYSCALL_64_after_hwframe+0x63/0xcd
page last free stack trace:
free_unref_page_prepare+0x62e/0xcb0
free_unref_page_list+0xe3/0xa70
release_pages+0xcd8/0x1380
tlb_batch_pages_flush+0xa8/0x1a0
tlb_finish_mmu+0x14b/0x7e0
exit_mmap+0x2b2/0x930
__mmput+0x128/0x4c0
mmput+0x60/0x70
do_exit+0x9b0/0x29b0
do_group_exit+0xd4/0x2a0
get_signal+0x2318/0x25b0
arch_do_signal_or_restart+0x79/0x5c0
exit_to_user_mode_prepare+0x11f/0x240
syscall_exit_to_user_mode+0x1d/0x50
do_syscall_64+0x46/0xb0
entry_SYSCALL_64_after_hwframe+0x63/0xcd

Precompute what is needed to call the access function and do not check the
area's num_accesses again as the pointer may not be valid anymore. Use a
counter instead.

Fixes: 51fe6141f0f6 ("iommufd: Data structure to provide IOVA to PFN mapping")
Link: https://lore.kernel.org/r/1-v2-9a03761d445d+54-iommufd_syz2_jgg@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reported-by: syzbot+1ad12d16afca0e7d2dde@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/r/0000000000001d40fc05fe385332@google.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

show more ...


Revision tags: v6.1.34, v6.1.33, v6.1.32, v6.1.31, v6.1.30, v6.1.29, v6.1.28, v6.1.27, v6.1.26, v6.3, v6.1.25, v6.1.24, v6.1.23, v6.1.22, v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9, v6.1.8, v6.1.7, v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12
# d6c55c0a 07-Dec-2022 Jason Gunthorpe <jgg@nvidia.com>

iommufd: Change the order of MSI setup

Eric points out this is wrong for the rare case of someone using
allow_unsafe_interrupts on ARM. We always have to setup the MSI window in
the domain if the io

iommufd: Change the order of MSI setup

Eric points out this is wrong for the rare case of someone using
allow_unsafe_interrupts on ARM. We always have to setup the MSI window in
the domain if the iommu driver asks for it.

Move the iommu_get_msi_cookie() setup to the top of the function and
always do it, regardless of the security mode. Add checks to
iommufd_device_setup_msi() to ensure the driver is not doing something
incomprehensible. No current driver will set both a HW and SW MSI window,
or have more than one SW MSI window.

Fixes: e8d57210035b ("iommufd: Add kAPI toward external drivers for physical devices")
Link: https://lore.kernel.org/r/3-v1-0362a1a1c034+98-iommufd_fixes1_jgg@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reported-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

show more ...


Revision tags: v6.0.11
# 52f52858 29-Nov-2022 Jason Gunthorpe <jgg@nvidia.com>

iommufd: Add additional invariant assertions

These are on performance paths so we protect them using the
CONFIG_IOMMUFD_TEST to not take a hit during normal operation.

These are useful when running

iommufd: Add additional invariant assertions

These are on performance paths so we protect them using the
CONFIG_IOMMUFD_TEST to not take a hit during normal operation.

These are useful when running the test suite and syzkaller to find data
structure inconsistencies early.

Link: https://lore.kernel.org/r/18-v6-a196d26f289e+11787-iommufd_jgg@nvidia.com
Tested-by: Yi Liu <yi.l.liu@intel.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com> # s390
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

show more ...


# 8d40205f 29-Nov-2022 Jason Gunthorpe <jgg@nvidia.com>

iommufd: Add kAPI toward external drivers for kernel access

Kernel access is the mode that VFIO "mdevs" use. In this case there is no
struct device and no IOMMU connection. iommufd acts as a record

iommufd: Add kAPI toward external drivers for kernel access

Kernel access is the mode that VFIO "mdevs" use. In this case there is no
struct device and no IOMMU connection. iommufd acts as a record keeper for
accesses and returns the actual struct pages back to the caller to use
however they need. eg with kmap or the DMA API.

Each caller must create a struct iommufd_access with
iommufd_access_create(), similar to how iommufd_device_bind() works. Using
this struct the caller can access blocks of IOVA using
iommufd_access_pin_pages() or iommufd_access_rw().

Callers must provide a callback that immediately unpins any IOVA being
used within a range. This happens if userspace unmaps the IOVA under the
pin.

The implementation forwards the access requests directly to the iopt
infrastructure that manages the iopt_pages_access.

Link: https://lore.kernel.org/r/14-v6-a196d26f289e+11787-iommufd_jgg@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Yi Liu <yi.l.liu@intel.com>
Tested-by: Lixiao Yang <lixiao.yang@intel.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

show more ...


# 51fe6141 29-Nov-2022 Jason Gunthorpe <jgg@nvidia.com>

iommufd: Data structure to provide IOVA to PFN mapping

This is the remainder of the IOAS data structure. Provide an object called
an io_pagetable that is composed of iopt_areas pointing at iopt_page

iommufd: Data structure to provide IOVA to PFN mapping

This is the remainder of the IOAS data structure. Provide an object called
an io_pagetable that is composed of iopt_areas pointing at iopt_pages,
along with a list of iommu_domains that mirror the IOVA to PFN map.

At the top this is a simple interval tree of iopt_areas indicating the map
of IOVA to iopt_pages. An xarray keeps track of a list of domains. Based
on the attached domains there is a minimum alignment for areas (which may
be smaller than PAGE_SIZE), an interval tree of reserved IOVA that can't
be mapped and an IOVA of allowed IOVA that can always be mappable.

The concept of an 'access' refers to something like a VFIO mdev that is
accessing the IOVA and using a 'struct page *' for CPU based access.

Externally an API is provided that matches the requirements of the IOCTL
interface for map/unmap and domain attachment.

The API provides a 'copy' primitive to establish a new IOVA map in a
different IOAS from an existing mapping by re-using the iopt_pages. This
is the basic mechanism to provide single pinning.

This is designed to support a pre-registration flow where userspace would
setup an dummy IOAS with no domains, map in memory and then establish an
access to pin all PFNs into the xarray.

Copy can then be used to create new IOVA mappings in a different IOAS,
with iommu_domains attached. Upon copy the PFNs will be read out of the
xarray and mapped into the iommu_domains, avoiding any pin_user_pages()
overheads.

Link: https://lore.kernel.org/r/10-v6-a196d26f289e+11787-iommufd_jgg@nvidia.com
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Yi Liu <yi.l.liu@intel.com>
Tested-by: Lixiao Yang <lixiao.yang@intel.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

show more ...