#
d1203852 |
| 18-Jun-2015 |
Gavin Shan <gwshan@linux.vnet.ibm.com> |
powerpc/powernv: Boolean argument for pnv_ioda_setup_bus_PE()
The patch changes the type of last argument of pnv_ioda_setup_bus_PE() and phb::pick_m64_pe() to boolean. No functional change.
Signed-
powerpc/powernv: Boolean argument for pnv_ioda_setup_bus_PE()
The patch changes the type of last argument of pnv_ioda_setup_bus_PE() and phb::pick_m64_pe() to boolean. No functional change.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
show more ...
|
#
96a2f92b |
| 18-Jun-2015 |
Gavin Shan <gwshan@linux.vnet.ibm.com> |
powerpc/powernv: Reserve M64 PEs based on BARs
On PHB3, some PEs might be reserved in advance to reflect the M64 segments consumed by those PEs. We're reserving PEs based on the M64 window of root p
powerpc/powernv: Reserve M64 PEs based on BARs
On PHB3, some PEs might be reserved in advance to reflect the M64 segments consumed by those PEs. We're reserving PEs based on the M64 window of root port, which might contain VF BAR. The PEs for VFs are allocated dynamically, not reserved based on the consumed M64 segments. So the M64 window of root port isn't reliable for the task. Instead, we go through M64 BARs (VF BARs excluded) of PCI devices under the specified root bus and reserve PEs accordingly, as the patch does.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
show more ...
|
#
e9dc4d7f |
| 18-Jun-2015 |
Gavin Shan <gwshan@linux.vnet.ibm.com> |
powerpc/powernv: Allow to reserve one PE for multiple times
The PE numbers are reserved according to root port's M64 window, which is aligned to M64 segment finely. So one PE shouldn't be reserved f
powerpc/powernv: Allow to reserve one PE for multiple times
The PE numbers are reserved according to root port's M64 window, which is aligned to M64 segment finely. So one PE shouldn't be reserved for multiple times. We will reserve PE numbers according to the M64 BARs of PCI device in subsequent patches, which aren't aligned to M64 segment size finely. It means one particular PE could be reserved for multiple times.
The patch allows one PE to be reserved for multiple times and we print the warning message at debugging level.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
show more ...
|
#
e91c2511 |
| 24-Jun-2015 |
Benjamin Herrenschmidt <benh@kernel.crashing.org> |
powerpc/iommu: Cleanup setting of DMA base/offset
Now that the table and the offset can co-exist, we no longer need to flip/flop, we can just establish both once at boot time.
Signed-off-by: Benjam
powerpc/iommu: Cleanup setting of DMA base/offset
Now that the table and the offset can co-exist, we no longer need to flip/flop, we can just establish both once at boot time.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
show more ...
|
#
5c89a87d |
| 17-Jun-2015 |
Alexey Kardashevskiy <aik@ozlabs.ru> |
powerpc/powernv: Fix wrong IOMMU table in pnv_ioda_setup_bus_dma()
When pnv_pci_ioda_fixup() is called during PHB fixup time, each PE in the sorted list of PEs (phb::pe_dma_list) is iterated to setu
powerpc/powernv: Fix wrong IOMMU table in pnv_ioda_setup_bus_dma()
When pnv_pci_ioda_fixup() is called during PHB fixup time, each PE in the sorted list of PEs (phb::pe_dma_list) is iterated to setup the PE's DMA32 space by pnv_ioda_setup_bus_dma() if the PE's DMA32 weight is bigger than zero. The function also assigns all the subordinate PCI devices of the PE's primary bus with the PE's DMA32 IOMMU table. It causes the PCI devicess in the child PEs, which don't have DMA weight, receives wrong IOMMU table and then IOMMU group.
The patch fixes above issue by more check on the PE's coverage and don't assign IOMMU table to those PCI devices, which belong to the child PEs. The problem was found on Firestone platform initially.
Suggested-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
show more ...
|
#
b5926430 |
| 15-Jun-2015 |
Alexey Kardashevskiy <aik@ozlabs.ru> |
powerpc/iommu/ioda2: Enable compile with IOV=on and IOMMU_API=off
The pnv_pci_ioda2_unset_window() function is used to do the final cleanup of a DMA window being released: - via VFIO ioctl by the gu
powerpc/iommu/ioda2: Enable compile with IOV=on and IOMMU_API=off
The pnv_pci_ioda2_unset_window() function is used to do the final cleanup of a DMA window being released: - via VFIO ioctl by the guest request; - via unplugging a virtual PCI function. However the function was under #ifdef CONFIG_IOMMU_API and was missing.
This moves the helper outside of IOMMU_API block and enables it for either or both IOMMU_API and PCI_IOV.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
show more ...
|
Revision tags: v4.1-rc8, v4.1-rc7 |
|
#
46d3e1e1 |
| 05-Jun-2015 |
Alexey Kardashevskiy <aik@ozlabs.ru> |
vfio: powerpc/spapr: powerpc/powernv/ioda2: Use DMA windows API in ownership control
Before the IOMMU user (VFIO) would take control over the IOMMU table belonging to a specific IOMMU group. This ap
vfio: powerpc/spapr: powerpc/powernv/ioda2: Use DMA windows API in ownership control
Before the IOMMU user (VFIO) would take control over the IOMMU table belonging to a specific IOMMU group. This approach did not allow sharing tables between IOMMU groups attached to the same container.
This introduces a new IOMMU ownership flavour when the user can not just control the existing IOMMU table but remove/create tables on demand. If an IOMMU implements take/release_ownership() callbacks, this lets the user have full control over the IOMMU group. When the ownership is taken, the platform code removes all the windows so the caller must create them. Before returning the ownership back to the platform code, VFIO unprograms and removes all the tables it created.
This changes IODA2's onwership handler to remove the existing table rather than manipulating with the existing one. From now on, iommu_take_ownership() and iommu_release_ownership() are only called from the vfio_iommu_spapr_tce driver.
Old-style ownership is still supported allowing VFIO to run on older P5IOC2 and IODA IO controllers.
No change in userspace-visible behaviour is expected. Since it recreates TCE tables on each ownership change, related kernel traces will appear more often.
This adds a pnv_pci_ioda2_setup_default_config() which is called when PE is being configured at boot time and when the ownership is passed from VFIO to the platform code.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> [aw: for the vfio related changes] Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
show more ...
|
#
00547193 |
| 05-Jun-2015 |
Alexey Kardashevskiy <aik@ozlabs.ru> |
powerpc/iommu/ioda2: Add get_table_size() to calculate the size of future table
This adds a way for the IOMMU user to know how much a new table will use so it can be accounted in the locked_vm limit
powerpc/iommu/ioda2: Add get_table_size() to calculate the size of future table
This adds a way for the IOMMU user to know how much a new table will use so it can be accounted in the locked_vm limit before allocation happens.
This stores the allocated table size in pnv_pci_ioda2_get_table_size() so the locked_vm counter can be updated correctly when a table is being disposed.
This defines an iommu_table_group_ops callback to let VFIO know how much memory will be locked if a table is created.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
show more ...
|
#
c035e37b |
| 05-Jun-2015 |
Alexey Kardashevskiy <aik@ozlabs.ru> |
powerpc/powernv/ioda2: Use new helpers to do proper cleanup on PE release
The existing code programmed TVT#0 with some address and then immediately released that memory.
This makes use of pnv_pci_i
powerpc/powernv/ioda2: Use new helpers to do proper cleanup on PE release
The existing code programmed TVT#0 with some address and then immediately released that memory.
This makes use of pnv_pci_ioda2_unset_window() and pnv_pci_ioda2_set_bypass() which do correct resource release and TVT update.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
show more ...
|
#
4793d65d |
| 05-Jun-2015 |
Alexey Kardashevskiy <aik@ozlabs.ru> |
vfio: powerpc/spapr: powerpc/powernv/ioda: Define and implement DMA windows API
This extends iommu_table_group_ops by a set of callbacks to support dynamic DMA windows management.
create_table() cr
vfio: powerpc/spapr: powerpc/powernv/ioda: Define and implement DMA windows API
This extends iommu_table_group_ops by a set of callbacks to support dynamic DMA windows management.
create_table() creates a TCE table with specific parameters. it receives iommu_table_group to know nodeid in order to allocate TCE table memory closer to the PHB. The exact format of allocated multi-level table might be also specific to the PHB model (not the case now though). This callback calculated the DMA window offset on a PCI bus from @num and stores it in a just created table.
set_window() sets the window at specified TVT index + @num on PHB.
unset_window() unsets the window from specified TVT.
This adds a free() callback to iommu_table_ops to free the memory (potentially a tree of tables) allocated for the TCE table.
create_table() and free() are supposed to be called once per VFIO container and set_window()/unset_window() are supposed to be called for every group in a container.
This adds IOMMU capabilities to iommu_table_group such as default 32bit window parameters and others. This makes use of new values in vfio_iommu_spapr_tce. IODA1/P5IOC2 do not support DDW so they do not advertise pagemasks to the userspace.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
show more ...
|
#
bbb845c4 |
| 05-Jun-2015 |
Alexey Kardashevskiy <aik@ozlabs.ru> |
powerpc/powernv: Implement multilevel TCE tables
TCE tables might get too big in case of 4K IOMMU pages and DDW enabled on huge guests (hundreds of GB of RAM) so the kernel might be unable to alloca
powerpc/powernv: Implement multilevel TCE tables
TCE tables might get too big in case of 4K IOMMU pages and DDW enabled on huge guests (hundreds of GB of RAM) so the kernel might be unable to allocate contiguous chunk of physical memory to store the TCE table.
To address this, POWER8 CPU (actually, IODA2) supports multi-level TCE tables, up to 5 levels which splits the table into a tree of smaller subtables.
This adds multi-level TCE tables support to pnv_pci_ioda2_table_alloc_pages() and pnv_pci_ioda2_table_free_pages() helpers.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
show more ...
|
#
43cb60ab |
| 05-Jun-2015 |
Alexey Kardashevskiy <aik@ozlabs.ru> |
powerpc/powernv/ioda2: Introduce pnv_pci_ioda2_set_window
This is a part of moving DMA window programming to an iommu_ops callback. pnv_pci_ioda2_set_window() takes an iommu_table_group as a first p
powerpc/powernv/ioda2: Introduce pnv_pci_ioda2_set_window
This is a part of moving DMA window programming to an iommu_ops callback. pnv_pci_ioda2_set_window() takes an iommu_table_group as a first parameter (not pnv_ioda_pe) as it is going to be used as a callback for VFIO DDW code.
This should cause no behavioural change.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
show more ...
|
#
aca6913f |
| 05-Jun-2015 |
Alexey Kardashevskiy <aik@ozlabs.ru> |
powerpc/powernv/ioda2: Introduce helpers to allocate TCE pages
This is a part of moving TCE table allocation into an iommu_ops callback to support multiple IOMMU groups per one VFIO container.
This
powerpc/powernv/ioda2: Introduce helpers to allocate TCE pages
This is a part of moving TCE table allocation into an iommu_ops callback to support multiple IOMMU groups per one VFIO container.
This moves the code which allocates the actual TCE tables to helpers: pnv_pci_ioda2_table_alloc_pages() and pnv_pci_ioda2_table_free_pages(). These do not allocate/free the iommu_table struct.
This enforces window size to be a power of two.
This should cause no behavioural change.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
show more ...
|
#
e5aad1e6 |
| 05-Jun-2015 |
Alexey Kardashevskiy <aik@ozlabs.ru> |
powerpc/powernv/ioda2: Rework iommu_table creation
This moves iommu_table creation to the beginning to make following changes easier to review. This starts using table parameters from the iommu_tabl
powerpc/powernv/ioda2: Rework iommu_table creation
This moves iommu_table creation to the beginning to make following changes easier to review. This starts using table parameters from the iommu_table struct.
This should cause no behavioural change.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
show more ...
|
#
05c6cfb9 |
| 05-Jun-2015 |
Alexey Kardashevskiy <aik@ozlabs.ru> |
powerpc/iommu/powernv: Release replaced TCE
At the moment writing new TCE value to the IOMMU table fails with EBUSY if there is a valid entry already. However PAPR specification allows the guest to
powerpc/iommu/powernv: Release replaced TCE
At the moment writing new TCE value to the IOMMU table fails with EBUSY if there is a valid entry already. However PAPR specification allows the guest to write new TCE value without clearing it first.
Another problem this patch is addressing is the use of pool locks for external IOMMU users such as VFIO. The pool locks are to protect DMA page allocator rather than entries and since the host kernel does not control what pages are in use, there is no point in pool locks and exchange()+put_page(oldtce) is sufficient to avoid possible races.
This adds an exchange() callback to iommu_table_ops which does the same thing as set() plus it returns replaced TCE and DMA direction so the caller can release the pages afterwards. The exchange() receives a physical address unlike set() which receives linear mapping address; and returns a physical address as the clear() does.
This implements exchange() for P5IOC2/IODA/IODA2. This adds a requirement for a platform to have exchange() implemented in order to support VFIO.
This replaces iommu_tce_build() and iommu_clear_tce() with a single iommu_tce_xchg().
This makes sure that TCE permission bits are not set in TCE passed to IOMMU API as those are to be calculated by platform code from DMA direction.
This moves SetPageDirty() to the IOMMU code to make it work for both VFIO ioctl interface in in-kernel TCE acceleration (when it becomes available later).
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> [aw: for the vfio related changes] Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
show more ...
|
#
e57080f1 |
| 05-Jun-2015 |
Alexey Kardashevskiy <aik@ozlabs.ru> |
powerpc/powernv/ioda2: Add TCE invalidation for all attached groups
The iommu_table struct keeps a list of IOMMU groups it is used for. At the moment there is just a single group attached but furthe
powerpc/powernv/ioda2: Add TCE invalidation for all attached groups
The iommu_table struct keeps a list of IOMMU groups it is used for. At the moment there is just a single group attached but further patches will add TCE table sharing. When sharing is enabled, TCE cache in each PE needs to be invalidated so does the patch.
This does not change pnv_pci_ioda1_tce_invalidate() as there is no plan to enable TCE table sharing on PHBs older than IODA2.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
show more ...
|
#
5780fb04 |
| 05-Jun-2015 |
Alexey Kardashevskiy <aik@ozlabs.ru> |
powerpc/powernv/ioda2: Move TCE kill register address to PE
At the moment the DMA setup code looks for the "ibm,opal-tce-kill" property which contains the TCE kill register address. Writing to this
powerpc/powernv/ioda2: Move TCE kill register address to PE
At the moment the DMA setup code looks for the "ibm,opal-tce-kill" property which contains the TCE kill register address. Writing to this register invalidates TCE cache on IODA/IODA2 hub.
This moves the register address from iommu_table to pnv_pnb as this register belongs to PHB and invalidates TCE cache for all tables of all attached PEs.
This moves the property reading/remapping code to a helper which is called when DMA is being configured for PE and which does DMA setup for both IODA1 and IODA2.
This adds a new pnv_pci_ioda2_tce_invalidate_entire() helper which invalidates cache for the entire table. It should be called after every call to opal_pci_map_pe_dma_window(). It was not required before because there was just a single TCE table and 64bit DMA was handled via bypass window (which has no table so no cache was used) but this is going to change with Dynamic DMA windows (DDW).
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
show more ...
|
#
f87a8864 |
| 05-Jun-2015 |
Alexey Kardashevskiy <aik@ozlabs.ru> |
vfio: powerpc/spapr/iommu/powernv/ioda2: Rework IOMMU ownership control
This adds tce_iommu_take_ownership() and tce_iommu_release_ownership which call in a loop iommu_take_ownership()/iommu_release
vfio: powerpc/spapr/iommu/powernv/ioda2: Rework IOMMU ownership control
This adds tce_iommu_take_ownership() and tce_iommu_release_ownership which call in a loop iommu_take_ownership()/iommu_release_ownership() for every table on the group. As there is just one now, no change in behaviour is expected.
At the moment the iommu_table struct has a set_bypass() which enables/ disables DMA bypass on IODA2 PHB. This is exposed to POWERPC IOMMU code which calls this callback when external IOMMU users such as VFIO are about to get over a PHB.
The set_bypass() callback is not really an iommu_table function but IOMMU/PE function. This introduces a iommu_table_group_ops struct and adds take_ownership()/release_ownership() callbacks to it which are called when an external user takes/releases control over the IOMMU.
This replaces set_bypass() with ownership callbacks as it is not necessarily just bypass enabling, it can be something else/more so let's give it more generic name.
The callbacks is implemented for IODA2 only. Other platforms (P5IOC2, IODA1) will use the old iommu_take_ownership/iommu_release_ownership API. The following patches will replace iommu_take_ownership/ iommu_release_ownership calls in IODA2 with full IOMMU table release/ create.
As we here and touching bypass control, this removes pnv_pci_ioda2_setup_bypass_pe() as it does not do much more compared to pnv_pci_ioda2_set_bypass. This moves tce_bypass_base initialization to pnv_pci_ioda2_setup_dma_pe.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> [aw: for the vfio related changes] Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
show more ...
|
#
0eaf4def |
| 05-Jun-2015 |
Alexey Kardashevskiy <aik@ozlabs.ru> |
powerpc/spapr: vfio: Switch from iommu_table to new iommu_table_group
So far one TCE table could only be used by one IOMMU group. However IODA2 hardware allows programming the same TCE table address
powerpc/spapr: vfio: Switch from iommu_table to new iommu_table_group
So far one TCE table could only be used by one IOMMU group. However IODA2 hardware allows programming the same TCE table address to multiple PE allowing sharing tables.
This replaces a single pointer to a group in a iommu_table struct with a linked list of groups which provides the way of invalidating TCE cache for every PE when an actual TCE table is updated. This adds pnv_pci_link_table_and_group() and pnv_pci_unlink_table_and_group() helpers to manage the list. However without VFIO, it is still going to be a single IOMMU group per iommu_table.
This changes iommu_add_device() to add a device to a first group from the group list of a table as it is only called from the platform init code or PCI bus notifier and at these moments there is only one group per table.
This does not change TCE invalidation code to loop through all attached groups in order to simplify this patch and because it is not really needed in most cases. IODA2 is fixed in a later patch.
This should cause no behavioural change.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> [aw: for the vfio related changes] Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
show more ...
|
#
b348aa65 |
| 05-Jun-2015 |
Alexey Kardashevskiy <aik@ozlabs.ru> |
powerpc/spapr: vfio: Replace iommu_table with iommu_table_group
Modern IBM POWERPC systems support multiple (currently two) TCE tables per IOMMU group (a.k.a. PE). This adds a iommu_table_group cont
powerpc/spapr: vfio: Replace iommu_table with iommu_table_group
Modern IBM POWERPC systems support multiple (currently two) TCE tables per IOMMU group (a.k.a. PE). This adds a iommu_table_group container for TCE tables. Right now just one table is supported.
This defines iommu_table_group struct which stores pointers to iommu_group and iommu_table(s). This replaces iommu_table with iommu_table_group where iommu_table was used to identify a group: - iommu_register_group(); - iommudata of generic iommu_group;
This removes @data from iommu_table as it_table_group provides same access to pnv_ioda_pe.
For IODA, instead of embedding iommu_table, the new iommu_table_group keeps pointers to those. The iommu_table structs are allocated dynamically.
For P5IOC2, both iommu_table_group and iommu_table are embedded into PE struct. As there is no EEH and SRIOV support for P5IOC2, iommu_free_table() should not be called on iommu_table struct pointers so we can keep it embedded in pnv_phb::p5ioc2.
For pSeries, this replaces multiple calls of kzalloc_node() with a new iommu_pseries_alloc_group() helper and stores the table group struct pointer into the pci_dn struct. For release, a iommu_table_free_group() helper is added.
This moves iommu_table struct allocation from SR-IOV code to the generic DMA initialization code in pnv_pci_ioda_setup_dma_pe and pnv_pci_ioda2_setup_dma_pe as this is where DMA is actually initialized. This change is here because those lines had to be changed anyway.
This should cause no behavioural change.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> [aw: for the vfio related changes] Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
show more ...
|
#
decbda25 |
| 05-Jun-2015 |
Alexey Kardashevskiy <aik@ozlabs.ru> |
powerpc/powernv/ioda/ioda2: Rework TCE invalidation in tce_build()/tce_free()
The pnv_pci_ioda_tce_invalidate() helper invalidates TCE cache. It is supposed to be called on IODA1/2 and not called on
powerpc/powernv/ioda/ioda2: Rework TCE invalidation in tce_build()/tce_free()
The pnv_pci_ioda_tce_invalidate() helper invalidates TCE cache. It is supposed to be called on IODA1/2 and not called on p5ioc2. It receives start and end host addresses of TCE table.
IODA2 actually needs PCI addresses to invalidate the cache. Those can be calculated from host addresses but since we are going to implement multi-level TCE tables, calculating PCI address from a host address might get either tricky or ugly as TCE table remains flat on PCI bus but not in RAM.
This moves pnv_pci_ioda_tce_invalidate() from generic pnv_tce_build/ pnt_tce_free and defines IODA1/2-specific callbacks which call generic ones and do PHB-model-specific TCE cache invalidation. P5IOC2 keeps using generic callbacks as before.
This changes pnv_pci_ioda2_tce_invalidate() to receives TCE index and number of pages which are PCI addresses shifted by IOMMU page shift.
No change in behaviour is expected.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
show more ...
|
#
da004c36 |
| 05-Jun-2015 |
Alexey Kardashevskiy <aik@ozlabs.ru> |
powerpc/iommu: Move tce_xxx callbacks from ppc_md to iommu_table
This adds a iommu_table_ops struct and puts pointer to it into the iommu_table struct. This moves tce_build/tce_free/tce_get/tce_flus
powerpc/iommu: Move tce_xxx callbacks from ppc_md to iommu_table
This adds a iommu_table_ops struct and puts pointer to it into the iommu_table struct. This moves tce_build/tce_free/tce_get/tce_flush callbacks from ppc_md to the new struct where they really belong to.
This adds the requirement for @it_ops to be initialized before calling iommu_init_table() to make sure that we do not leave any IOMMU table with iommu_table_ops uninitialized. This is not a parameter of iommu_init_table() though as there will be cases when iommu_init_table() will not be called on TCE tables, for example - VFIO.
This does s/tce_build/set/, s/tce_free/clear/ and removes "tce_" redundant prefixes.
This removes tce_xxx_rm handlers from ppc_md but does not add them to iommu_table_ops as this will be done later if we decide to support TCE hypercalls in real mode. This removes _vm callbacks as only virtual mode is supported by now so this also removes @rm parameter.
For pSeries, this always uses tce_buildmulti_pSeriesLP/ tce_buildmulti_pSeriesLP. This changes multi callback to fall back to tce_build_pSeriesLP/tce_free_pSeriesLP if FW_FEATURE_MULTITCE is not present. The reason for this is we still have to support "multitce=off" boot parameter in disable_multitce() and we do not want to walk through all IOMMU tables in the system and replace "multi" callbacks with single ones.
For powernv, this defines _ops per PHB type which are P5IOC2/IODA1/IODA2. This makes the callbacks for them public. Later patches will extend callbacks for IODA1/2.
No change in behaviour is expected.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
show more ...
|
#
ac9a5889 |
| 05-Jun-2015 |
Alexey Kardashevskiy <aik@ozlabs.ru> |
powerpc/iommu: Put IOMMU group explicitly
So far an iommu_table lifetime was the same as PE. Dynamic DMA windows will change this and iommu_free_table() will not always require the group to be relea
powerpc/iommu: Put IOMMU group explicitly
So far an iommu_table lifetime was the same as PE. Dynamic DMA windows will change this and iommu_free_table() will not always require the group to be released.
This moves iommu_group_put() out of iommu_free_table().
This adds a iommu_pseries_free_table() helper which does iommu_group_put() and iommu_free_table(). Later it will be changed to receive a table_group and we will have to change less lines then.
This should cause no behavioural change.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
show more ...
|
#
c5773822 |
| 05-Jun-2015 |
Alexey Kardashevskiy <aik@ozlabs.ru> |
powerpc/powernv/ioda: Clean up IOMMU group registration
The existing code has 3 calls to iommu_register_group() and all 3 branches actually cover all possible cases.
This replaces 3 calls with one
powerpc/powernv/ioda: Clean up IOMMU group registration
The existing code has 3 calls to iommu_register_group() and all 3 branches actually cover all possible cases.
This replaces 3 calls with one and moves the registration earlier; the latter will make more sense when we add TCE table sharing.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
show more ...
|
#
4617082e |
| 05-Jun-2015 |
Alexey Kardashevskiy <aik@ozlabs.ru> |
powerpc/iommu/powernv: Get rid of set_iommu_table_base_and_group
The set_iommu_table_base_and_group() name suggests that the function sets table base and add a device to an IOMMU group.
The actual
powerpc/iommu/powernv: Get rid of set_iommu_table_base_and_group
The set_iommu_table_base_and_group() name suggests that the function sets table base and add a device to an IOMMU group.
The actual purpose for table base setting is to put some reference into a device so later iommu_add_device() can get the IOMMU group reference and the device to the group.
At the moment a group cannot be explicitly passed to iommu_add_device() as we want it to work from the bus notifier, we can fix it later and remove confusing calls of set_iommu_table_base().
This replaces set_iommu_table_base_and_group() with a couple of set_iommu_table_base() + iommu_add_device() which makes reading the code easier.
This adds few comments why set_iommu_table_base() and iommu_add_device() are called where they are called.
For IODA1/2, this essentially removes iommu_add_device() call from the pnv_pci_ioda_dma_dev_setup() as it will always fail at this particular place: - for physical PE, the device is already attached by iommu_add_device() in pnv_pci_ioda_setup_dma_pe(); - for virtual PE, the sysfs entries are not ready to create all symlinks so actual adding is happening in tce_iommu_bus_notifier.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
show more ...
|