History log of /openbmc/linux/drivers/iommu/iommufd/iommufd_private.h (Results 1 – 25 of 29)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48
# 55dd4023 18-Aug-2023 Yi Liu <yi.l.liu@intel.com>

iommufd: Add IOMMU_GET_HW_INFO

Under nested IOMMU translation, userspace owns the stage-1 translation
table (e.g. the stage-1 page table of Intel VT-d or the context table of
ARM SMMUv3, and etc.).

iommufd: Add IOMMU_GET_HW_INFO

Under nested IOMMU translation, userspace owns the stage-1 translation
table (e.g. the stage-1 page table of Intel VT-d or the context table of
ARM SMMUv3, and etc.). Stage-1 translation tables are vendor specific, and
need to be compatible with the underlying IOMMU hardware. Hence, userspace
should know the IOMMU hardware capability before creating and configuring
the stage-1 translation table to kernel.

This adds IOMMU_GET_HW_INFO ioctl to query the IOMMU hardware information
(a.k.a capability) for a given device. The returned data is vendor
specific, userspace needs to decode it with the structure by the output
@out_data_type field.

As only physical devices have IOMMU hardware, so this will return error if
the given device is not a physical device.

Link: https://lore.kernel.org/r/20230818101033.4100-4-yi.l.liu@intel.com
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Co-developed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

show more ...


Revision tags: v6.1.46
# 65aaca11 14-Aug-2023 Jason Gunthorpe <jgg@nvidia.com>

iommufd: Remove iommufd_ref_to_users()

This no longer has any callers, remove the function

Kevin noticed that after commit 99f98a7c0d69 ("iommufd: IOMMUFD_DESTROY
should not increase the refcount")

iommufd: Remove iommufd_ref_to_users()

This no longer has any callers, remove the function

Kevin noticed that after commit 99f98a7c0d69 ("iommufd: IOMMUFD_DESTROY
should not increase the refcount") there was only one other user and it
turns out the rework in commit 9227da7816dd ("iommufd: Add
iommufd_access_change_ioas(_id) helpers") got rid of the last one.

Link: https://lore.kernel.org/r/0-v1-abb31bedd888+c1-iommufd_ref_to_users_jgg@nvidia.com
Suggested-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

show more ...


Revision tags: v6.1.45, v6.1.44, v6.1.43
# 23a1b46f 02-Aug-2023 Jason Gunthorpe <jgg@nvidia.com>

iommufd/selftest: Make the mock iommu driver into a real driver

I've avoided doing this because there is no way to make this happen
without an intrusion into the core code. Up till now this has avoi

iommufd/selftest: Make the mock iommu driver into a real driver

I've avoided doing this because there is no way to make this happen
without an intrusion into the core code. Up till now this has avoided
needing the core code's probe path with some hackery - but now that
default domains are becoming mandatory it is unavoidable.

This became a serious problem when the core code stopped allowing
partially registered iommu drivers in commit 14891af3799e ("iommu: Move
the iommu driver sysfs setup into iommu_init/deinit_device()") which
breaks the selftest. That series was developed along with a second series
that contained this patch so it was not noticed.

Make it so that iommufd selftest can create a real iommu driver and bind
it only to is own private bus. Add iommu_device_register_bus() as a core
code helper to make this possible. It simply sets the right pointers and
registers the notifier block. The mock driver then works like any normal
driver should, with probe triggered by the bus ops

When the bus->iommu_ops stuff is fully unwound we can probably do better
here and remove this special case.

Link: https://lore.kernel.org/r/15-v6-e8114faedade+425-iommu_all_defdom_jgg@nvidia.com
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

show more ...


# 5d5c85ff 28-Jul-2023 Nicolin Chen <nicolinc@nvidia.com>

iommufd: Allow passing in iopt_access_list_id to iopt_remove_access()

This is a preparatory change for ioas replacement support for accesses.
The replacement routine does an iopt_add_access() for a

iommufd: Allow passing in iopt_access_list_id to iopt_remove_access()

This is a preparatory change for ioas replacement support for accesses.
The replacement routine does an iopt_add_access() for a new IOAS first and
then iopt_remove_access() for the old IOAS upon the success of the first
call. However, the first call overrides the iopt_access_list_id in the
access struct, resulting in iopt_remove_access() being unable to work on
the old IOAS.

Add an iopt_access_list_id as a parameter to iopt_remove_access, so the
replacement routine can save the id before it gets overwritten. Pass the
id in iopt_remove_access() for a proper cleanup.

The existing callers should just pass in access->iopt_access_list_id.

Link: https://lore.kernel.org/r/7bb939b9e0102da0c099572bb3de78ab7622221e.1690523699.git.nicolinc@nvidia.com
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

show more ...


Revision tags: v6.1.42
# 99f98a7c 25-Jul-2023 Jason Gunthorpe <jgg@nvidia.com>

iommufd: IOMMUFD_DESTROY should not increase the refcount

syzkaller found a race where IOMMUFD_DESTROY increments the refcount:

obj = iommufd_get_object(ucmd->ictx, cmd->id, IOMMUFD_OBJ_ANY)

iommufd: IOMMUFD_DESTROY should not increase the refcount

syzkaller found a race where IOMMUFD_DESTROY increments the refcount:

obj = iommufd_get_object(ucmd->ictx, cmd->id, IOMMUFD_OBJ_ANY);
if (IS_ERR(obj))
return PTR_ERR(obj);
iommufd_ref_to_users(obj);
/* See iommufd_ref_to_users() */
if (!iommufd_object_destroy_user(ucmd->ictx, obj))

As part of the sequence to join the two existing primitives together.

Allowing the refcount the be elevated without holding the destroy_rwsem
violates the assumption that all temporary refcount elevations are
protected by destroy_rwsem. Racing IOMMUFD_DESTROY with
iommufd_object_destroy_user() will cause spurious failures:

WARNING: CPU: 0 PID: 3076 at drivers/iommu/iommufd/device.c:477 iommufd_access_destroy+0x18/0x20 drivers/iommu/iommufd/device.c:478
Modules linked in:
CPU: 0 PID: 3076 Comm: syz-executor.0 Not tainted 6.3.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/03/2023
RIP: 0010:iommufd_access_destroy+0x18/0x20 drivers/iommu/iommufd/device.c:477
Code: e8 3d 4e 00 00 84 c0 74 01 c3 0f 0b c3 0f 1f 44 00 00 f3 0f 1e fa 48 89 fe 48 8b bf a8 00 00 00 e8 1d 4e 00 00 84 c0 74 01 c3 <0f> 0b c3 0f 1f 44 00 00 41 57 41 56 41 55 4c 8d ae d0 00 00 00 41
RSP: 0018:ffffc90003067e08 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff888109ea0300 RCX: 0000000000000000
RDX: 0000000000000001 RSI: 0000000000000000 RDI: 00000000ffffffff
RBP: 0000000000000004 R08: 0000000000000000 R09: ffff88810bbb3500
R10: ffff88810bbb3e48 R11: 0000000000000000 R12: ffffc90003067e88
R13: ffffc90003067ea8 R14: ffff888101249800 R15: 00000000fffffffe
FS: 00007ff7254fe6c0(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000555557262da8 CR3: 000000010a6fd000 CR4: 0000000000350ef0
Call Trace:
<TASK>
iommufd_test_create_access drivers/iommu/iommufd/selftest.c:596 [inline]
iommufd_test+0x71c/0xcf0 drivers/iommu/iommufd/selftest.c:813
iommufd_fops_ioctl+0x10f/0x1b0 drivers/iommu/iommufd/main.c:337
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:870 [inline]
__se_sys_ioctl fs/ioctl.c:856 [inline]
__x64_sys_ioctl+0x84/0xc0 fs/ioctl.c:856
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x38/0x80 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

The solution is to not increment the refcount on the IOMMUFD_DESTROY path
at all. Instead use the xa_lock to serialize everything. The refcount
check == 1 and xa_erase can be done under a single critical region. This
avoids the need for any refcount incrementing.

It has the downside that if userspace races destroy with other operations
it will get an EBUSY instead of waiting, but this is kind of racing is
already dangerous.

Fixes: 2ff4bed7fee7 ("iommufd: File descriptor, context, kconfig and makefiles")
Link: https://lore.kernel.org/r/2-v1-85aacb2af554+bc-iommufd_syz3_jgg@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reported-by: syzbot+7574ebfe589049630608@syzkaller.appspotmail.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

show more ...


Revision tags: v6.1.41, v6.1.40, v6.1.39
# 7074d7bd 17-Jul-2023 Jason Gunthorpe <jgg@nvidia.com>

iommufd: Add IOMMU_HWPT_ALLOC

This allows userspace to manually create HWPTs on IOAS's and then use
those HWPTs as inputs to iommufd_device_attach/replace().

Following series will extend this to al

iommufd: Add IOMMU_HWPT_ALLOC

This allows userspace to manually create HWPTs on IOAS's and then use
those HWPTs as inputs to iommufd_device_attach/replace().

Following series will extend this to allow creating iommu_domains with
driver specific parameters.

Link: https://lore.kernel.org/r/17-v8-6659224517ea+532-iommufd_alloc_jgg@nvidia.com
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

show more ...


# 83f7bc6f 17-Jul-2023 Jason Gunthorpe <jgg@nvidia.com>

iommufd: Make destroy_rwsem use a lock class per object type

The selftest invokes things like replace under the object lock of its
idev which protects the idev in a similar way to a real user.
Unfor

iommufd: Make destroy_rwsem use a lock class per object type

The selftest invokes things like replace under the object lock of its
idev which protects the idev in a similar way to a real user.
Unfortunately this triggers lockdep. A lock class per type will solve the
problem.

Link: https://lore.kernel.org/r/15-v8-6659224517ea+532-iommufd_alloc_jgg@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

show more ...


# 70eadc7f 17-Jul-2023 Jason Gunthorpe <jgg@nvidia.com>

iommufd: Allow a hwpt to be aborted after allocation

During creation the hwpt must have the ioas->mutex held until the object
is finalized. This means we need to be able to call
iommufd_object_abort

iommufd: Allow a hwpt to be aborted after allocation

During creation the hwpt must have the ioas->mutex held until the object
is finalized. This means we need to be able to call
iommufd_object_abort_and_destroy() while holding the mutex.

Since iommufd_hw_pagetable_destroy() also needs the mutex this is
problematic.

Fix it by creating a special abort op for the object that can assume the
caller is holding the lock, as required by the contract.

The next patch will add another iommufd_object_abort_and_destroy() for a
hwpt.

Fixes: e8d57210035b ("iommufd: Add kAPI toward external drivers for physical devices")
Link: https://lore.kernel.org/r/10-v8-6659224517ea+532-iommufd_alloc_jgg@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

show more ...


# 17bad527 17-Jul-2023 Jason Gunthorpe <jgg@nvidia.com>

iommufd: Add enforced_cache_coherency to iommufd_hw_pagetable_alloc()

Logically the HWPT should have the coherency set properly for the device
that it is being created for when it is created.

This

iommufd: Add enforced_cache_coherency to iommufd_hw_pagetable_alloc()

Logically the HWPT should have the coherency set properly for the device
that it is being created for when it is created.

This was happening implicitly if the immediate_attach was set because
iommufd_hw_pagetable_attach() does it as the first thing.

Do it unconditionally so !immediate_attach works properly.

Link: https://lore.kernel.org/r/9-v8-6659224517ea+532-iommufd_alloc_jgg@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

show more ...


# d03f1336 17-Jul-2023 Jason Gunthorpe <jgg@nvidia.com>

iommufd: Move putting a hwpt to a helper function

Next patch will need to call this from two places.

Link: https://lore.kernel.org/r/8-v8-6659224517ea+532-iommufd_alloc_jgg@nvidia.com
Reviewed-by:

iommufd: Move putting a hwpt to a helper function

Next patch will need to call this from two places.

Link: https://lore.kernel.org/r/8-v8-6659224517ea+532-iommufd_alloc_jgg@nvidia.com
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

show more ...


# 1d149ab2 17-Jul-2023 Jason Gunthorpe <jgg@nvidia.com>

iommufd: Make sw_msi_start a group global

The sw_msi_start is only set by the ARM drivers and it is always constant.
Due to the way vfio/iommufd allow domains to be re-used between
devices we have a

iommufd: Make sw_msi_start a group global

The sw_msi_start is only set by the ARM drivers and it is always constant.
Due to the way vfio/iommufd allow domains to be re-used between
devices we have a built in assumption that there is only one value
for sw_msi_start and it is global to the system.

To make replace simpler where we may not reparse the
iommu_get_resv_regions() move the sw_msi_start to the iommufd_group so it
is always available once any HWPT has been attached.

Link: https://lore.kernel.org/r/7-v8-6659224517ea+532-iommufd_alloc_jgg@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

show more ...


# 34f327a9 17-Jul-2023 Jason Gunthorpe <jgg@nvidia.com>

iommufd: Keep track of each device's reserved regions instead of groups

The driver facing API in the iommu core makes the reserved regions
per-device. An algorithm in the core code consolidates the

iommufd: Keep track of each device's reserved regions instead of groups

The driver facing API in the iommu core makes the reserved regions
per-device. An algorithm in the core code consolidates the regions of all
the devices in a group to return the group view.

To allow for devices to be hotplugged into the group iommufd would re-load
the entire group's reserved regions for each device, just in case they
changed.

Further iommufd already has to deal with duplicated/overlapping reserved
regions as it must union all the groups together.

Thus simplify all of this to just use the device reserved regions
interface directly from the iommu driver.

Link: https://lore.kernel.org/r/5-v8-6659224517ea+532-iommufd_alloc_jgg@nvidia.com
Suggested-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

show more ...


# 91a2e17e 17-Jul-2023 Jason Gunthorpe <jgg@nvidia.com>

iommufd: Replace the hwpt->devices list with iommufd_group

The devices list was used as a simple way to avoid having per-group
information. Now that this seems to be unavoidable, just commit to
per-

iommufd: Replace the hwpt->devices list with iommufd_group

The devices list was used as a simple way to avoid having per-group
information. Now that this seems to be unavoidable, just commit to
per-group information fully and remove the devices list from the HWPT.

The iommufd_group stores the currently assigned HWPT for the entire group
and we can manage the per-device attach/detach with a list in the
iommufd_group.

For destruction the flow is organized to make the following patches
easier, the actual call to iommufd_object_destroy_user() is done at the
top of the call chain without holding any locks. The HWPT to be destroyed
is returned out from the locked region to make this possible. Later
patches create locking that requires this.

Link: https://lore.kernel.org/r/3-v8-6659224517ea+532-iommufd_alloc_jgg@nvidia.com
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

show more ...


# 3a3329a7 17-Jul-2023 Jason Gunthorpe <jgg@nvidia.com>

iommufd: Add iommufd_group

When the hwpt to device attachment is fairly static we could get away with
the simple approach of keeping track of the groups via a device list. But
with replace this is i

iommufd: Add iommufd_group

When the hwpt to device attachment is fairly static we could get away with
the simple approach of keeping track of the groups via a device list. But
with replace this is infeasible.

Add an automatically managed struct that is 1:1 with the iommu_group
per-ictx so we can store the necessary tracking information there.

Link: https://lore.kernel.org/r/2-v8-6659224517ea+532-iommufd_alloc_jgg@nvidia.com
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

show more ...


# e23a6217 18-Jul-2023 Nicolin Chen <nicolinc@nvidia.com>

iommufd/device: Add iommufd_access_detach() API

Previously, the detach routine is only done by the destroy(). And it was
called by vfio_iommufd_emulated_unbind() when the device runs close(), so
all

iommufd/device: Add iommufd_access_detach() API

Previously, the detach routine is only done by the destroy(). And it was
called by vfio_iommufd_emulated_unbind() when the device runs close(), so
all the mappings in iopt were cleaned in that setup, when the call trace
reaches this detach() routine.

Now, there's a need of a detach uAPI, meaning that it does not only need
a new iommufd_access_detach() API, but also requires access->ops->unmap()
call as a cleanup. So add one.

However, leaving that unprotected can introduce some potential of a race
condition during the pin_/unpin_pages() call, where access->ioas->iopt is
getting referenced. So, add an ioas_lock to protect the context of iopt
referencings.

Also, to allow the iommufd_access_unpin_pages() callback to happen via
this unmap() call, add an ioas_unpin pointer, so the unpin routine won't
be affected by the "access->ioas = NULL" trick.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Terrence Xu <terrence.xu@intel.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Tested-by: Yanting Jiang <yanting.jiang@intel.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Link: https://lore.kernel.org/r/20230718135551.6592-15-yi.l.liu@intel.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

show more ...


Revision tags: v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35, v6.1.34, v6.1.33, v6.1.32, v6.1.31, v6.1.30, v6.1.29, v6.1.28, v6.1.27, v6.1.26, v6.3, v6.1.25, v6.1.24, v6.1.23, v6.1.22
# 325de950 27-Mar-2023 Yi Liu <yi.l.liu@intel.com>

iommu/iommufd: Pass iommufd_ctx pointer in iommufd_get_ioas()

No need to pass the iommufd_ucmd pointer.

Link: https://lore.kernel.org/r/20230327093351.44505-2-yi.l.liu@intel.com
Signed-off-by: Yi L

iommu/iommufd: Pass iommufd_ctx pointer in iommufd_get_ioas()

No need to pass the iommufd_ucmd pointer.

Link: https://lore.kernel.org/r/20230327093351.44505-2-yi.l.liu@intel.com
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

show more ...


Revision tags: v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15
# 65c619ae 01-Mar-2023 Jason Gunthorpe <jgg@nvidia.com>

iommufd/selftest: Make selftest create a more complete mock device

iommufd wants to use more infrastructure, like the iommu_group, that the
mock device does not support. Create a more complete mock

iommufd/selftest: Make selftest create a more complete mock device

iommufd wants to use more infrastructure, like the iommu_group, that the
mock device does not support. Create a more complete mock device that can
go through the whole cycle of ownership, blocking domain, and has an
iommu_group.

This requires creating a real struct device on a real bus to be able to
connect it to a iommu_group. Unfortunately we cannot formally attach the
mock iommu driver as an actual driver as the iommu core does not allow
more than one driver or provide a general way for busses to link to
iommus. This can be solved with a little hack to open code the dev_iommus
struct.

With this infrastructure things work exactly the same as the normal domain
path, including the auto domains mechanism and direct attach of hwpts. As
the created hwpt is now an autodomain it is no longer required to destroy
it and trying to do so will trigger a failure.

Link: https://lore.kernel.org/r/11-v3-ae9c2975a131+2e1e8-iommufd_hwpt_jgg@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

show more ...


# 339fbf3a 01-Mar-2023 Jason Gunthorpe <jgg@nvidia.com>

iommufd: Make iommufd_hw_pagetable_alloc() do iopt_table_add_domain()

The HWPT is always linked to an IOAS and once a HWPT exists its domain
should be fully mapped. This ended up being split up into

iommufd: Make iommufd_hw_pagetable_alloc() do iopt_table_add_domain()

The HWPT is always linked to an IOAS and once a HWPT exists its domain
should be fully mapped. This ended up being split up into device.c during
a two phase creation that was a bit confusing.

Move the iopt_table_add_domain() into iommufd_hw_pagetable_alloc() by
having it call back to device.c to complete the domain attach in the
required order.

Calling iommufd_hw_pagetable_alloc() with immediate_attach = false will
work on most drivers, but notably the SMMU drivers will fail because they
can't decide what kind of domain to create until they are attached. This
will be fixed when the domain_alloc function can take in a struct device.

Link: https://lore.kernel.org/r/6-v3-ae9c2975a131+2e1e8-iommufd_hwpt_jgg@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

show more ...


# 7e7ec8a5 01-Mar-2023 Jason Gunthorpe <jgg@nvidia.com>

iommufd: Move iommufd_device to iommufd_private.h

hw_pagetable.c will need this in the next patches.

Link: https://lore.kernel.org/r/5-v3-ae9c2975a131+2e1e8-iommufd_hwpt_jgg@nvidia.com
Reviewed-by:

iommufd: Move iommufd_device to iommufd_private.h

hw_pagetable.c will need this in the next patches.

Link: https://lore.kernel.org/r/5-v3-ae9c2975a131+2e1e8-iommufd_hwpt_jgg@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

show more ...


Revision tags: v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9, v6.1.8
# c9a397ce 18-Jan-2023 Jason Gunthorpe <jgg@nvidia.com>

vfio: Support VFIO_NOIOMMU with iommufd

Add a small amount of emulation to vfio_compat to accept the SET_IOMMU to
VFIO_NOIOMMU_IOMMU and have vfio just ignore iommufd if it is working on a
no-iommu

vfio: Support VFIO_NOIOMMU with iommufd

Add a small amount of emulation to vfio_compat to accept the SET_IOMMU to
VFIO_NOIOMMU_IOMMU and have vfio just ignore iommufd if it is working on a
no-iommu enabled device.

Move the enable_unsafe_noiommu_mode module out of container.c into
vfio_main.c so that it is always available even if VFIO_CONTAINER=n.

This passes Alex's mini-test:

https://github.com/awilliam/tests/blob/master/vfio-noiommu-pci-device-open.c

Link: https://lore.kernel.org/r/0-v3-480cd64a16f7+1ad0-iommufd_noiommu_jgg@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

show more ...


Revision tags: v6.1.7, v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12, v6.0.11
# f4b20bb3 29-Nov-2022 Jason Gunthorpe <jgg@nvidia.com>

iommufd: Add kernel support for testing iommufd

Provide a mock kernel module for the iommu_domain that allows it to run
without any HW and the mocking provides a way to directly validate that
the PF

iommufd: Add kernel support for testing iommufd

Provide a mock kernel module for the iommu_domain that allows it to run
without any HW and the mocking provides a way to directly validate that
the PFNs loaded into the iommu_domain are correct. This exposes the access
kAPI toward userspace to allow userspace to explore the functionality of
pages.c and io_pagetable.c

The mock also simulates the rare case of PAGE_SIZE > iommu page size as
the mock will operate at a 2K iommu page size. This allows exercising all
of the calculations to support this mismatch.

This is also intended to support syzkaller exploring the same space.

However, it is an unusually invasive config option to enable all of
this. The config option should not be enabled in a production kernel.

Link: https://lore.kernel.org/r/16-v6-a196d26f289e+11787-iommufd_jgg@nvidia.com
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com> # s390
Tested-by: Eric Auger <eric.auger@redhat.com> # aarch64
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

show more ...


# d624d665 29-Nov-2022 Jason Gunthorpe <jgg@nvidia.com>

iommufd: vfio container FD ioctl compatibility

iommufd can directly implement the /dev/vfio/vfio container IOCTLs by
mapping them into io_pagetable operations.

A userspace application can test agai

iommufd: vfio container FD ioctl compatibility

iommufd can directly implement the /dev/vfio/vfio container IOCTLs by
mapping them into io_pagetable operations.

A userspace application can test against iommufd and confirm compatibility
then simply make a small change to open /dev/iommu instead of
/dev/vfio/vfio.

For testing purposes /dev/vfio/vfio can be symlinked to /dev/iommu and
then all applications will use the compatibility path with no code
changes. A later series allows /dev/vfio/vfio to be directly provided by
iommufd, which allows the rlimit mode to work the same as well.

This series just provides the iommufd side of compatibility. Actually
linking this to VFIO_SET_CONTAINER is a followup series, with a link in
the cover letter.

Internally the compatibility API uses a normal IOAS object that, like
vfio, is automatically allocated when the first device is
attached.

Userspace can also query or set this IOAS object directly using the
IOMMU_VFIO_IOAS ioctl. This allows mixing and matching new iommufd only
features while still using the VFIO style map/unmap ioctls.

While this is enough to operate qemu, it has a few differences:

- Resource limits rely on memory cgroups to bound what userspace can do
instead of the module parameter dma_entry_limit.

- VFIO P2P is not implemented. The DMABUF patches for vfio are a start at
a solution where iommufd would import a special DMABUF. This is to avoid
further propogating the follow_pfn() security problem.

- A full audit for pedantic compatibility details (eg errnos, etc) has
not yet been done

- powerpc SPAPR is left out, as it is not connected to the iommu_domain
framework. It seems interest in SPAPR is minimal as it is currently
non-working in v6.1-rc1. They will have to convert to the iommu
subsystem framework to enjoy iommfd.

The following are not going to be implemented and we expect to remove them
from VFIO type1:

- SW access 'dirty tracking'. As discussed in the cover letter this will
be done in VFIO.

- VFIO_TYPE1_NESTING_IOMMU
https://lore.kernel.org/all/0-v1-0093c9b0e345+19-vfio_no_nesting_jgg@nvidia.com/

- VFIO_DMA_MAP_FLAG_VADDR
https://lore.kernel.org/all/Yz777bJZjTyLrHEQ@nvidia.com/

Link: https://lore.kernel.org/r/15-v6-a196d26f289e+11787-iommufd_jgg@nvidia.com
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Yi Liu <yi.l.liu@intel.com>
Tested-by: Lixiao Yang <lixiao.yang@intel.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

show more ...


# 8d40205f 29-Nov-2022 Jason Gunthorpe <jgg@nvidia.com>

iommufd: Add kAPI toward external drivers for kernel access

Kernel access is the mode that VFIO "mdevs" use. In this case there is no
struct device and no IOMMU connection. iommufd acts as a record

iommufd: Add kAPI toward external drivers for kernel access

Kernel access is the mode that VFIO "mdevs" use. In this case there is no
struct device and no IOMMU connection. iommufd acts as a record keeper for
accesses and returns the actual struct pages back to the caller to use
however they need. eg with kmap or the DMA API.

Each caller must create a struct iommufd_access with
iommufd_access_create(), similar to how iommufd_device_bind() works. Using
this struct the caller can access blocks of IOVA using
iommufd_access_pin_pages() or iommufd_access_rw().

Callers must provide a callback that immediately unpins any IOVA being
used within a range. This happens if userspace unmaps the IOVA under the
pin.

The implementation forwards the access requests directly to the iopt
infrastructure that manages the iopt_pages_access.

Link: https://lore.kernel.org/r/14-v6-a196d26f289e+11787-iommufd_jgg@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Yi Liu <yi.l.liu@intel.com>
Tested-by: Lixiao Yang <lixiao.yang@intel.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

show more ...


# e8d57210 29-Nov-2022 Jason Gunthorpe <jgg@nvidia.com>

iommufd: Add kAPI toward external drivers for physical devices

Add the four functions external drivers need to connect physical DMA to
the IOMMUFD:

iommufd_device_bind() / iommufd_device_unbind()

iommufd: Add kAPI toward external drivers for physical devices

Add the four functions external drivers need to connect physical DMA to
the IOMMUFD:

iommufd_device_bind() / iommufd_device_unbind()
Register the device with iommufd and establish security isolation.

iommufd_device_attach() / iommufd_device_detach()
Connect a bound device to a page table

Binding a device creates a device object ID in the uAPI, however the
generic API does not yet provide any IOCTLs to manipulate them.

Link: https://lore.kernel.org/r/13-v6-a196d26f289e+11787-iommufd_jgg@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Yi Liu <yi.l.liu@intel.com>
Tested-by: Lixiao Yang <lixiao.yang@intel.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

show more ...


# ea4acfac 29-Nov-2022 Jason Gunthorpe <jgg@nvidia.com>

iommufd: Add a HW pagetable object

The hw_pagetable object exposes the internal struct iommu_domain's to
userspace. An iommu_domain is required when any DMA device attaches to an
IOAS to control the

iommufd: Add a HW pagetable object

The hw_pagetable object exposes the internal struct iommu_domain's to
userspace. An iommu_domain is required when any DMA device attaches to an
IOAS to control the io page table through the iommu driver.

For compatibility with VFIO the hw_pagetable is automatically created when
a DMA device is attached to the IOAS. If a compatible iommu_domain already
exists then the hw_pagetable associated with it is used for the
attachment.

In the initial series there is no iommufd uAPI for the hw_pagetable
object. The next patch provides driver facing APIs for IO page table
attachment that allows drivers to accept either an IOAS or a hw_pagetable
ID and for the driver to return the hw_pagetable ID that was auto-selected
from an IOAS. The expectation is the driver will provide uAPI through its
own FD for attaching its device to iommufd. This allows userspace to learn
the mapping of devices to iommu_domains and to override the automatic
attachment.

The future HW specific interface will allow userspace to create
hw_pagetable objects using iommu_domains with IOMMU driver specific
parameters. This infrastructure will allow linking those domains to IOAS's
and devices.

Link: https://lore.kernel.org/r/12-v6-a196d26f289e+11787-iommufd_jgg@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Yi Liu <yi.l.liu@intel.com>
Tested-by: Lixiao Yang <lixiao.yang@intel.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

show more ...


12