Re: [PATCH v5 01/23] iommu: introduce bind_pasid_table API function

From: Jacob Pan
Date: Tue Aug 28 2018 - 13:02:12 EST


On Fri, 24 Aug 2018 15:20:08 +0200
Auger Eric <eric.auger@xxxxxxxxxx> wrote:

> Hi Yi Liu,
>
> On 08/24/2018 02:47 PM, Liu, Yi L wrote:
> > Hi Eric,
> >
> >> From: iommu-bounces@xxxxxxxxxxxxxxxxxxxxxxxxxx [mailto:iommu-
> >> bounces@xxxxxxxxxxxxxxxxxxxxxxxxxx] On Behalf Of Auger Eric
> >> Sent: Friday, August 24, 2018 12:35 AM
> >>
> >> Hi Jacob,
> >>
> >> On 05/11/2018 10:53 PM, Jacob Pan wrote:
> >>> Virtual IOMMU was proposed to support Shared Virtual Memory (SVM)
> >>> use in the guest:
> >>> https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg05311.html
> >>>
> >>> As part of the proposed architecture, when an SVM capable PCI
> >>> device is assigned to a guest, nested mode is turned on. Guest
> >>> owns the first level page tables (request with PASID) which
> >>> performs GVA->GPA translation. Second level page tables are owned
> >>> by the host for GPA->HPA translation for both request with and
> >>> without PASID.
> >>>
> >>> A new IOMMU driver interface is therefore needed to perform tasks
> >>> as follows:
> >>> * Enable nested translation and appropriate translation type
> >>> * Assign guest PASID table pointer (in GPA) and size to host IOMMU
> >>>
> >>> This patch introduces new API functions to perform bind/unbind
> >>> guest PASID tables. Based on common data, model specific IOMMU
> >>> drivers can be extended to perform the specific steps for binding
> >>> pasid table of assigned devices.
> >>>
> >>> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@xxxxxxx>
> >>> Signed-off-by: Liu, Yi L <yi.l.liu@xxxxxxxxxxxxxxx>
> >>> Signed-off-by: Ashok Raj <ashok.raj@xxxxxxxxx>
> >>> Signed-off-by: Jacob Pan <jacob.jun.pan@xxxxxxxxxxxxxxx>
> >>> ---
> >
> > [...]
> >
> >>> +#ifndef _UAPI_IOMMU_H
> >>> +#define _UAPI_IOMMU_H
> >>> +
> >>> +#include <linux/types.h>
> >>> +
> >>> +/**
> >>> + * PASID table data used to bind guest PASID table to the host IOMMU. This will
> >>> + * enable guest managed first level page tables.
> >>> + * @version: for future extensions and identification of the data format
> >>> + * @bytes: size of this structure
> >>> + * @base_ptr: PASID table pointer
> >>> + * @pasid_bits: number of bits supported in the guest PASID table, must be less
> >>> + * or equal than the host supported PASID size.
> >>> + */
> >>> +struct pasid_table_config {
> >>> + __u32 version;
> >>> +#define PASID_TABLE_CFG_VERSION_1 1
> >>> + __u32 bytes;
> >>> + __u64 base_ptr;
> >>> + __u8 pasid_bits;
> >>
> >> As reported in "[RFC 00/13] SMMUv3 Nested Stage Setup" thread,
> >> this API could be used for ARM SMMUv3 nested stage enablement
> >> without many changes. Assuming SMMUv3 nested stage is confirmed to
> >> be interesting for vendors and maintainers, we could try to unify
> >> the APIs.
> >
> > Just a quick question on nested stage on SMMUv3. If the virtualizer
> > wants to enable nested stage on SMMUv3, does it link the whole
> > guest CD table to the host, or does it do it in some other manner?
> Yes that's correct. On ARM SMMUv3 you have Stream Table Entries (STEs,
> indexed by ReqID=streamid). If stage 1 is used, the STE points to 1 or
> more contiguous Context Descriptors (CDs).
> So STE looks like the VTD Context-Entry and CD table looks like the
> VTD PASID table as far as I understand.
> >
> >> As far as I understand the VTD PASID table is equivalent to the ARM
> >> SMMUv3 context descriptor table (CD). This corresponds to the
> >> stage 1 context table with one or more entries, each corresponding
> >> to one PASID.
> >
> > The PASID table is indexed by PASID and has multiple entries. A PASID
> > table would have 2^PASID_BITS entries.
> On ARM SMMUv3 the number of CDs is 2^STE.S1CDMax.
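
To make the sizing rule concrete, here is a rough sketch of the check
implied by the @pasid_bits comment in the patch: the table has
2^pasid_bits entries, and the guest width must not exceed the host
width. The function names are hypothetical and are not part of this
series.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of entries in a table indexed by a
 * PASID that is pasid_bits wide. Returns 0 for widths that cannot
 * be represented in 64 bits.
 */
uint64_t pasid_table_entries(uint8_t pasid_bits)
{
	if (pasid_bits >= 64)
		return 0;
	return (uint64_t)1 << pasid_bits;
}

/*
 * Hypothetical helper: the guest PASID width must be less than or
 * equal to the width supported by the host.
 */
bool guest_pasid_bits_ok(uint8_t guest_bits, uint8_t host_bits)
{
	return guest_bits <= host_bits;
}
```
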
> >
> >> maybe using the s1ctx_table_config terminology instead of
> >> pasid_table_config would be more generic, the pasid table being
> >> Intel naming.
> >>
> >> on top of pasid_bits, I think an "asid_bits" field may be needed
> >> too. The guest IOMMU might support a different number of asid bits
> >> from the host one.
> >
> > Maybe needed for SMMUv3. I've noticed you've placed it in
> > struct iommu_smmu_s1_config.
> >
> >>
> >> Although without having skimmed through the whole series yet, I
> >> wonder how you handle the case where stage1 is bypassed or
> >> disabled? The guest may define the S1 context entries but bypass
> >> or abort stage 1 translations globally. Looks like something is
> >> missing to me at first sight.
> >
> > Sorry, I didn't quite follow here. What use case is this for, where
> > stage 1 is bypassed or disabled? IOVA or SVA?
> Each STE entry has a config field which tells how S1 and S2 behave
>
> Options are no traffic at all or any combination of the following:
>
> S1      S2
> bypass  bypass
> transl  bypass
> bypass  transl
> transl  transl
>
> The host manages S2 info; the guest sets the S1-related fields.
>
> To me the guest STE.Config should be passed to the host so that the
> latter writes the correct global Config field value in the STE,
> including S1 + S2 info.
>
The global config (VT-d global command register) is IOMMU-wide, so we
cannot let a guest config change directly modify global settings. I
think it is up to the vIOMMU emulation code to unbind the guest PASID
table, and thus disable S1, if the guest sets S1 to bypass/disabled.
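
Roughly, the decision I have in mind on the emulation side could look
like the sketch below. All of the enum and function names are made up
for illustration; none of them exist in this series or in QEMU. It only
captures the idea that bind/unbind follows the guest's S1 setting, with
bypass and disabled/abort both treated as "unbind".

```c
#include <stdbool.h>

/* Hypothetical guest stage-1 states as seen by the vIOMMU emulation. */
enum guest_s1_mode {
	GUEST_S1_TRANSLATE,
	GUEST_S1_BYPASS,
	GUEST_S1_ABORT,
};

/* Hypothetical action the emulation would request from the host. */
enum viommu_action {
	VIOMMU_NO_CHANGE,
	VIOMMU_BIND_PASID_TABLE,
	VIOMMU_UNBIND_PASID_TABLE,
};

/*
 * Pick the host-side action for a guest S1 setting. "bound" says
 * whether the guest PASID table is currently bound on the host.
 * Assumption: bypass and abort are both handled by unbinding, which
 * leaves only the host-owned second level in effect.
 */
enum viommu_action viommu_s1_action(enum guest_s1_mode mode, bool bound)
{
	if (mode == GUEST_S1_TRANSLATE)
		return bound ? VIOMMU_NO_CHANGE : VIOMMU_BIND_PASID_TABLE;

	return bound ? VIOMMU_UNBIND_PASID_TABLE : VIOMMU_NO_CHANGE;
}
```

This keeps the global setting out of the guest's hands: the host only
ever sees a bind or an unbind request for the assigned device.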

I am still perplexed about valid use cases for S1 bypass. To me it means
neither SVA nor guest IOVA, which means no need for a vIOMMU.

> Thanks
>
> Eric
> >
> > Thanks,
> > Yi Liu
> >

[Jacob Pan]