RE: [PATCH v2 02/15] iommu: Report domain nesting info

From: Tian, Kevin
Date: Sun Jun 14 2020 - 21:22:44 EST


> From: Liu, Yi L <yi.l.liu@xxxxxxxxx>
> Sent: Friday, June 12, 2020 5:05 PM
>
> Hi Alex,
>
> > From: Alex Williamson <alex.williamson@xxxxxxxxxx>
> > Sent: Friday, June 12, 2020 3:30 AM
> >
> > On Thu, 11 Jun 2020 05:15:21 -0700
> > Liu Yi L <yi.l.liu@xxxxxxxxx> wrote:
> >
> > > > IOMMUs that support nesting translation need to report the capability
> > > > info to userspace, e.g. the format of first level/stage paging structures.
> > >
> > > Cc: Kevin Tian <kevin.tian@xxxxxxxxx>
> > > CC: Jacob Pan <jacob.jun.pan@xxxxxxxxxxxxxxx>
> > > Cc: Alex Williamson <alex.williamson@xxxxxxxxxx>
> > > Cc: Eric Auger <eric.auger@xxxxxxxxxx>
> > > Cc: Jean-Philippe Brucker <jean-philippe@xxxxxxxxxx>
> > > Cc: Joerg Roedel <joro@xxxxxxxxxx>
> > > Cc: Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx>
> > > Signed-off-by: Liu Yi L <yi.l.liu@xxxxxxxxx>
> > > Signed-off-by: Jacob Pan <jacob.jun.pan@xxxxxxxxxxxxxxx>
> > > ---
> > > @Jean, Eric: as nesting was introduced for ARM, but looks like no
> > > actual user of it. right? So I'm wondering if we can reuse
> > > DOMAIN_ATTR_NESTING to retrieve nesting info? how about your
> > > opinions?
> > >
> > > include/linux/iommu.h | 1 +
> > > include/uapi/linux/iommu.h | 34 ++++++++++++++++++++++++++++++++++
> > > 2 files changed, 35 insertions(+)
> > >
> > > diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> > > index 78a26ae..f6e4b49 100644
> > > --- a/include/linux/iommu.h
> > > +++ b/include/linux/iommu.h
> > > @@ -126,6 +126,7 @@ enum iommu_attr {
> > > DOMAIN_ATTR_FSL_PAMUV1,
> > > DOMAIN_ATTR_NESTING, /* two stages of translation */
> > > DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
> > > + DOMAIN_ATTR_NESTING_INFO,
> > > DOMAIN_ATTR_MAX,
> > > };
> > >
> > > diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
> > > index 303f148..02eac73 100644
> > > --- a/include/uapi/linux/iommu.h
> > > +++ b/include/uapi/linux/iommu.h
> > > @@ -332,4 +332,38 @@ struct iommu_gpasid_bind_data {
> > > };
> > > };
> > >
> > > +struct iommu_nesting_info {
> > > + __u32 size;
> > > + __u32 format;
> > > + __u32 features;
> > > +#define IOMMU_NESTING_FEAT_SYSWIDE_PASID (1 << 0)
> > > +#define IOMMU_NESTING_FEAT_BIND_PGTBL (1 << 1)
> > > +#define IOMMU_NESTING_FEAT_CACHE_INVLD (1 << 2)
> > > + __u32 flags;
> > > + __u8 data[];
> > > +};
> > > +
> > > +/*
> > > + * @flags: VT-d specific flags. Currently reserved for future
> > > + * extension.
> > > + * @addr_width: The output addr width of first level/stage translation
> > > + * @pasid_bits: Maximum supported PASID bits, 0 represents no PASID
> > > + * support.
> > > + * @cap_reg: Describe basic capabilities as defined in VT-d capability
> > > + * register.
> > > + * @cap_mask: Mark valid capability bits in @cap_reg.
> > > + * @ecap_reg: Describe the extended capabilities as defined in VT-d
> > > + * extended capability register.
> > > + * @ecap_mask: Mark the valid capability bits in @ecap_reg.
> >
> > Please explain this a little further, why do we need to tell userspace about
> > cap/ecap register bits that aren't valid through this interface?
> > Thanks,
>
> we only want to tell userspace about the bits marked in the cap/ecap_mask.
> cap/ecap_mask is kind of white-list of the cap/ecap register. userspace
> should only care about the bits in the white-list, for other bits, it
> should ignore.
>
> Regards,
> Yi Liu

If the kernel simply clears the invalid bits, do we still need separate
mask bits to mark them explicitly? I guess that is the point Alex was
asking about...
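
As a rough sketch of the two options (the helper names below are
hypothetical and not part of the patch): with the v2 layout userspace has
to apply the mask itself, whereas if the kernel clears the bits it does
not want to expose, the reported register value alone carries the same
set of enabled capabilities.

```c
#include <stdint.h>

/*
 * Hypothetical userspace helper for the v2 layout: only bits that are
 * set in the mask may be interpreted, everything else is ignored.
 */
static uint64_t user_visible_caps(uint64_t reg, uint64_t mask)
{
	return reg & mask;
}

/*
 * Hypothetical kernel-side alternative: clear every bit that is not
 * meant to be exposed before the value is copied to userspace, so no
 * separate mask field has to be exported.
 */
static uint64_t kernel_sanitized_caps(uint64_t hw_reg, uint64_t exposed)
{
	return hw_reg & exposed;
}
```

Both helpers yield the same value for the same inputs. The one thing a
separate mask adds is that userspace can tell "reported as 0" apart from
"not reported"; whether that distinction matters here is the open
question.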

>
> > Alex
> >
> >
> > > + */
> > > +struct iommu_nesting_info_vtd {
> > > + __u32 flags;
> > > + __u16 addr_width;
> > > + __u16 pasid_bits;
> > > + __u64 cap_reg;
> > > + __u64 cap_mask;
> > > + __u64 ecap_reg;
> > > + __u64 ecap_mask;
> > > +};
> > > +
> > > #endif /* _UAPI_IOMMU_H */
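
For reference, a minimal sketch of how userspace might consume the
proposed layout. The struct definitions mirror the patch but use local
names and stdint types; EXAMPLE_FORMAT_VTD and read_vtd_info() are
hypothetical, and the sketch assumes that @size covers the common header
plus the vendor payload in data[], which the patch does not spell out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the proposed struct iommu_nesting_info. */
struct nesting_info {
	uint32_t size;
	uint32_t format;
	uint32_t features;
	uint32_t flags;
	uint8_t  data[];
};

/* Local mirror of the proposed struct iommu_nesting_info_vtd. */
struct nesting_info_vtd {
	uint32_t flags;
	uint16_t addr_width;
	uint16_t pasid_bits;
	uint64_t cap_reg;
	uint64_t cap_mask;
	uint64_t ecap_reg;
	uint64_t ecap_mask;
};

/* Hypothetical format value; the patch does not define one. */
#define EXAMPLE_FORMAT_VTD 1

/*
 * Copy the VT-d payload out of data[] after checking the format and
 * that the reported size is large enough (assumption: size includes
 * the header). Returns 0 on success, -1 otherwise.
 */
static int read_vtd_info(const struct nesting_info *info,
			 struct nesting_info_vtd *out)
{
	if (info->format != EXAMPLE_FORMAT_VTD)
		return -1;
	if (info->size < sizeof(*info) + sizeof(*out))
		return -1;
	/* memcpy avoids alignment and aliasing issues with the u8 array. */
	memcpy(out, info->data, sizeof(*out));
	return 0;
}
```

A caller would then look only at (out.cap_reg & out.cap_mask) and
(out.ecap_reg & out.ecap_mask), per the white-list semantics described
above.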