Re: [patch V2 09/21] genirq/msi: Make MSI descriptor iterators device domain aware

From: Marc Zyngier
Date: Thu Nov 24 2022 - 10:46:14 EST


On Mon, 21 Nov 2022 14:36:29 +0000,
Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> To support multiple MSI interrupt domains per device it is necessary to
> segment the xarray MSI descriptor storage. Each domain gets up to
> MSI_MAX_INDEX entries.
>
> Change the iterators so they operate with domain ids and take the domain
> offsets into account.
>
> The publicly available iterators which are mostly used in legacy
> implementations and the PCI/MSI core default to MSI_DEFAULT_DOMAIN (0)
> which is the id for the existing "global" domains.
>
> No functional change.
>
> Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> ---
> V2: Fix the off by one so the index space is including MSI_MAX_INDEX (Kevin)
> ---
> include/linux/msi.h | 45 +++++++++++++++++++++++++++++++++++++++++----
> kernel/irq/msi.c | 43 +++++++++++++++++++++++++++++++++++--------
> 2 files changed, 76 insertions(+), 12 deletions(-)
>
> --- a/include/linux/msi.h
> +++ b/include/linux/msi.h
> @@ -181,6 +181,7 @@ enum msi_desc_filter {
> * @mutex: Mutex protecting the MSI descriptor store
> * @__store: Xarray for storing MSI descriptor pointers
> * @__iter_idx: Index to search the next entry for iterators
> + * @__iter_max: Index to limit the search
> * @__irqdomains: Per device interrupt domains
> */
> struct msi_device_data {
> @@ -189,6 +190,7 @@ struct msi_device_data {
> struct mutex mutex;
> struct xarray __store;
> unsigned long __iter_idx;
> + unsigned long __iter_max;
> struct irq_domain *__irqdomains[MSI_MAX_DEVICE_IRQDOMAINS];
> };
>
> @@ -197,14 +199,34 @@ int msi_setup_device_data(struct device
> void msi_lock_descs(struct device *dev);
> void msi_unlock_descs(struct device *dev);
>
> -struct msi_desc *msi_first_desc(struct device *dev, enum msi_desc_filter filter);
> +struct msi_desc *msi_domain_first_desc(struct device *dev, unsigned int domid,
> + enum msi_desc_filter filter);
> +
> +/**
> + * msi_first_desc - Get the first MSI descriptor of the default irqdomain
> + * @dev: Device to operate on
> + * @filter: Descriptor state filter
> + *
> + * Must be called with the MSI descriptor mutex held, i.e. msi_lock_descs()
> + * must be invoked before the call.
> + *
> + * Return: Pointer to the first MSI descriptor matching the search
> + * criteria, NULL if none found.
> + */
> +static inline struct msi_desc *msi_first_desc(struct device *dev,
> + enum msi_desc_filter filter)
> +{
> + return msi_domain_first_desc(dev, MSI_DEFAULT_DOMAIN, filter);
> +}
> +
> struct msi_desc *msi_next_desc(struct device *dev, enum msi_desc_filter filter);
>
> /**
> - * msi_for_each_desc - Iterate the MSI descriptors
> + * msi_domain_for_each_desc - Iterate the MSI descriptors in a specific domain
> *
> * @desc: struct msi_desc pointer used as iterator
> * @dev: struct device pointer - device to iterate
> + * @domid: The id of the interrupt domain which should be walked.
> * @filter: Filter for descriptor selection
> *
> * Notes:
> @@ -212,10 +234,25 @@ struct msi_desc *msi_next_desc(struct de
> * pair.
> * - It is safe to remove a retrieved MSI descriptor in the loop.
> */
> -#define msi_for_each_desc(desc, dev, filter) \
> - for ((desc) = msi_first_desc((dev), (filter)); (desc); \
> +#define msi_domain_for_each_desc(desc, dev, domid, filter) \
> + for ((desc) = msi_domain_first_desc((dev), (domid), (filter)); (desc); \
> (desc) = msi_next_desc((dev), (filter)))
>
> +/**
> + * msi_for_each_desc - Iterate the MSI descriptors in the default irqdomain
> + *
> + * @desc: struct msi_desc pointer used as iterator
> + * @dev: struct device pointer - device to iterate
> + * @filter: Filter for descriptor selection
> + *
> + * Notes:
> + * - The loop must be protected with a msi_lock_descs()/msi_unlock_descs()
> + * pair.
> + * - It is safe to remove a retrieved MSI descriptor in the loop.
> + */
> +#define msi_for_each_desc(desc, dev, filter) \
> + msi_domain_for_each_desc((desc), (dev), MSI_DEFAULT_DOMAIN, (filter))
> +
> #define msi_desc_to_dev(desc) ((desc)->dev)
>
> #ifdef CONFIG_IRQ_MSI_IOMMU
> --- a/kernel/irq/msi.c
> +++ b/kernel/irq/msi.c
> @@ -21,6 +21,10 @@
>
> static inline int msi_sysfs_create_group(struct device *dev);
>
> +/* Invalid XA index which is outside of any searchable range */
> +#define MSI_XA_MAX_INDEX (ULONG_MAX - 1)
> +#define MSI_XA_DOMAIN_SIZE (MSI_MAX_INDEX + 1)
> +
> static inline void msi_setup_default_irqdomain(struct device *dev, struct msi_device_data *md)
> {
> if (!dev->msi.domain)
> @@ -33,6 +37,20 @@ static inline void msi_setup_default_irq
> md->__irqdomains[MSI_DEFAULT_DOMAIN] = dev->msi.domain;
> }
>
> +static int msi_get_domain_base_index(struct device *dev, unsigned int domid)
> +{
> +	lockdep_assert_held(&dev->msi.data->mutex);
> +
> +	if (WARN_ON_ONCE(domid >= MSI_MAX_DEVICE_IRQDOMAINS))
> +		return -ENODEV;
> +
> +	if (WARN_ON_ONCE(!dev->msi.data->__irqdomains[domid]))
> +		return -ENODEV;
> +
> +	return domid * MSI_XA_DOMAIN_SIZE;
> +}

So, as I understand it, this splits the xarray index space into
segments, one per msi_domain_ids value, each MSI_XA_DOMAIN_SIZE apart.
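
To check my reading of the layout, here is a rough userspace sketch of
the index arithmetic. It is not kernel code: the demo_* names are made
up, and the constant values (USHRT_MAX as the per-domain maximum index,
two domains per device) are assumptions for illustration only. It also
leaves out the lockdep and "is the domain populated" checks that the
patch performs.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

/* Assumed values, chosen only to make the arithmetic concrete. */
#define DEMO_MAX_INDEX		((unsigned long)USHRT_MAX)
#define DEMO_DOMAIN_SIZE	(DEMO_MAX_INDEX + 1)
#define DEMO_MAX_DOMAINS	2u

/* First flat index of the segment owned by @domid, -ENODEV if invalid. */
static long demo_domain_base(unsigned int domid)
{
	if (domid >= DEMO_MAX_DOMAINS)
		return -ENODEV;

	return (long)(domid * DEMO_DOMAIN_SIZE);
}

/* Last flat index (inclusive) of the segment owned by @domid. */
static long demo_domain_last(unsigned int domid)
{
	long base = demo_domain_base(domid);

	return base < 0 ? base : base + (long)DEMO_MAX_INDEX;
}

/* Map a (domain id, per-domain index) pair to a flat store index. */
static long demo_flat_index(unsigned int domid, unsigned long index)
{
	long base = demo_domain_base(domid);

	/* Per-domain indices run from 0 to DEMO_MAX_INDEX inclusive. */
	assert(index <= DEMO_MAX_INDEX);

	return base < 0 ? base : base + (long)index;
}
```

With those assumed values, domain 0 covers flat indices 0 through
DEMO_MAX_INDEX, domain 1 starts right after at DEMO_DOMAIN_SIZE, and an
iterator restricted to one domain only has to search between
demo_domain_base() and demo_domain_last() for that id.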

Why didn't you decide to go all the way and have one xarray per
irqdomain? It's not that big a structure, and it would make the whole
thing a bit more straightforward.

Or do you anticipate cases where you'd walk the __store xarray across
irqdomains?

Thanks,

M.

--
Without deviation from the norm, progress is not possible.