Re: [REGRESSION 5.19.x] AMD HD-audio devices missing on 5.19

From: Takashi Iwai
Date: Wed Sep 07 2022 - 09:28:38 EST


On Tue, 23 Aug 2022 22:28:24 +0200,
Jason Gunthorpe wrote:
>
> On Tue, Aug 23, 2022 at 01:46:36PM +0200, Takashi Iwai wrote:
> > It was tested now and confirmed that the call path is via AMDGPU, as
> > expected:
> > amdgpu_pci_probe ->
> > amdgpu_driver_load_kms ->
> > amdgpu_device_init ->
> > amdgpu_amdkfd_device_init ->
> > kgd2kfd_device_init ->
> > kgd2kfd_resume_iommu ->
> > kfd_iommu_resume ->
> > amd_iommu_init_device ->
> > iommu_attach_group ->
> > __iommu_attach_group
>
> Oh, when you said sound intel I thought this was an Intel CPU..
>
> Yes, there is this hacky private path from the amdgpu to
> the amd iommu driver that makes a mess of it here. We discussed it in
> this thread:
>
> https://lore.kernel.org/linux-iommu/YgtuJQhY8SNlv9%2F6@xxxxxxxxxx/
>
> But nobody put it together that it would be a problem with this.
>
> Something like this, perhaps, but I didn't check if overriding the
> type would cause other problems.
>
> diff --git a/drivers/iommu/amd/iommu_v2.c b/drivers/iommu/amd/iommu_v2.c
> index 696d5555be5794..6a1f02c62dffcc 100644
> --- a/drivers/iommu/amd/iommu_v2.c
> +++ b/drivers/iommu/amd/iommu_v2.c
> @@ -777,6 +777,8 @@ int amd_iommu_init_device(struct pci_dev *pdev, int pasids)
>  	if (dev_state->domain == NULL)
>  		goto out_free_states;
>
> +	/* See iommu_is_default_domain() */
> +	dev_state->domain->type = IOMMU_DOMAIN_IDENTITY;
>  	amd_iommu_domain_direct_map(dev_state->domain);
>
>  	ret = amd_iommu_domain_enable_v2(dev_state->domain, pasids);
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 780fb70715770d..fe8bd17f52314b 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -3076,6 +3076,24 @@ static ssize_t iommu_group_store_type(struct iommu_group *group,
>  	return ret;
>  }
>
> +static bool iommu_is_default_domain(struct iommu_group *group)
> +{
> +	if (group->domain == group->default_domain)
> +		return true;
> +
> +	/*
> +	 * If the default domain was set to identity and it is still an identity
> +	 * domain then we consider this a pass. This happens because of
> +	 * amd_iommu_init_device() replacing the default identity domain with an
> +	 * identity domain that has a different configuration for AMDGPU.
> +	 */
> +	if (group->default_domain &&
> +	    group->default_domain->type == IOMMU_DOMAIN_IDENTITY &&
> +	    group->domain && group->domain->type == IOMMU_DOMAIN_IDENTITY)
> +		return true;
> +	return false;
> +}
> +
>  /**
>   * iommu_device_use_default_domain() - Device driver wants to handle device
>   * DMA through the kernel DMA API.
> @@ -3094,8 +3112,7 @@ int iommu_device_use_default_domain(struct device *dev)
>
>  	mutex_lock(&group->mutex);
>  	if (group->owner_cnt) {
> -		if (group->domain != group->default_domain ||
> -		    group->owner) {
> +		if (group->owner || iommu_is_default_domain(group)) {

Isn't this rather
		if (group->owner || !iommu_is_default_domain(group)) {
?
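
To spell out the reasoning, here is a minimal userspace sketch (not
kernel code) comparing the three conditions.  It assumes the branch
guarded by this condition is the bail-out path, as the removed lines
suggest.  The struct layouts, the busy_*() names and check_states()
are hypothetical stand-ins made up for illustration; only the body of
is_default_domain() mirrors the helper from the patch above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumption: only
 * the fields used by the check are modelled). */
enum domain_type { DOMAIN_DMA, DOMAIN_IDENTITY };

struct domain { enum domain_type type; };

struct group {
	struct domain *domain;
	struct domain *default_domain;
	void *owner;
};

/* Same logic as iommu_is_default_domain() in the quoted patch. */
static bool is_default_domain(const struct group *g)
{
	if (g->domain == g->default_domain)
		return true;
	if (g->default_domain && g->default_domain->type == DOMAIN_IDENTITY &&
	    g->domain && g->domain->type == DOMAIN_IDENTITY)
		return true;
	return false;
}

/* Condition before the patch; true means "take the bail-out branch". */
static bool busy_old(const struct group *g)
{
	return g->domain != g->default_domain || g->owner;
}

/* Condition as posted in the patch, without the negation. */
static bool busy_posted(const struct group *g)
{
	return g->owner || is_default_domain(g);
}

/* Condition with the negation suggested in this reply. */
static bool busy_negated(const struct group *g)
{
	return g->owner || !is_default_domain(g);
}

/* Walk through the three interesting group states. */
void check_states(void)
{
	struct domain def_id = { DOMAIN_IDENTITY };
	struct domain amdgpu_id = { DOMAIN_IDENTITY };
	struct domain other_dma = { DOMAIN_DMA };

	/* Still attached to the default domain. */
	struct group normal = { &def_id, &def_id, NULL };
	/* Default identity domain replaced by another identity domain. */
	struct group amdgpu = { &amdgpu_id, &def_id, NULL };
	/* Attached to an unrelated non-identity domain. */
	struct group foreign = { &other_dma, &def_id, NULL };

	assert(!busy_old(&normal) && !busy_negated(&normal));
	assert(busy_posted(&normal));      /* posted form rejects the normal case */

	assert(busy_old(&amdgpu));         /* the regression being chased here */
	assert(!busy_negated(&amdgpu));    /* negated form lets it through */

	assert(busy_old(&foreign) && busy_negated(&foreign));
	assert(!busy_posted(&foreign));    /* posted form would let this pass */
}
```

In this model the condition as posted refuses a group that is still on
its default domain and accepts one that was moved to an unrelated
domain, while the negated form keeps the old behaviour and only
additionally accepts the identity-over-identity case.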

I'll rebuild the kernel with this change and ask reporters again.


Takashi