Re: [PART2 PATCH v4 07/11] iommu/amd: Introduce amd_iommu_update_ga()

From: Radim KrÄmÃÅ
Date: Wed Jul 13 2016 - 10:15:14 EST


[I pasted v3 reviews prefixed with a pipe where I think they still apply.]

2016-07-13 08:20-0500, Suravee Suthikulpanit:
> From: Suravee Suthikulpanit <Suravee.Suthikulpanit@xxxxxxx>
>
> Introduces a new IOMMU API, amd_iommu_update_ga(), which allows
> KVM (SVM) to update existing posted interrupt IOMMU IRTE when
> load/unload vcpu.
>
> Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@xxxxxxx>
> ---
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> @@ -4461,4 +4461,69 @@ int amd_iommu_create_irq_domain(struct amd_iommu *iommu)
> +int amd_iommu_update_ga(u32 vcpu_id, u32 cpu, u32 vm_id,
> + u64 base, bool is_run)

|2016-07-13 15:49+0700, Suravee Suthikulpanit:
|> On 07/12/2016 01:59 AM, Radim KrÄmÃÅ wrote:
|>> Not just in this function does the interface between svm and iommu split
|>> ga_tag into its two components (vcpu_id and ga_tag), but it seems that
|>> the combined value could always be used instead ...
|>> Is there an advantage to passing two values?
|>
|> Here, the amd_iommu_update_ga() takes the two separate value for input
|> parameters. Mainly the ga_tag (which is really the vm_id) and vcpu_id. This
|> allow IOMMU driver to decide how to encode the GATAG to be programmed into
|> the IRTE. Currently, the actual GATAG is a 16-bit value, <vm_id><vcpu_id>.
|> This keeps the interface independent from how we encode the GATAG.

I was thinking about making the IOMMU unaware how SVM or other Linux
hypervisors use the ga_tag, i.e. passing the final u32 ga_tag.

For example 32 bit hypervisor doesn't need to use lookup, because any
pointer can used as the ga_tag directly. And there are other viable
algoritms for assigning the ga_tag -- why isn't the vm_id 24 bits?

> + unsigned long flags;
> + struct amd_iommu *iommu;
> +
> + if (!AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir))
> + return 0;
> +
> + for_each_iommu(iommu) {
> + struct amd_ir_data *ir_data;
> +
> + spin_lock_irqsave(&iommu->gatag_ir_hash_lock, flags);
> +
> + /* Note:
> + * We need to update all interrupt remapping table entries
> + * for targeting the specified vcpu. Here, we use gatag
> + * as a hash key and iterate through all entries in the bucket.
> + */
> + hash_for_each_possible(iommu->gatag_ir_hash, ir_data, hnode,
> + AMD_IOMMU_GATAG(vm_id, vcpu_id)) {
> + struct irte_ga *irte = (struct irte_ga *) ir_data->entry;

|>> (The ga_tag check is missing here too.)
|>
|> Here, the intention is to update all interrupt remapping entries in the
|> bucket w/ the same GATAG (i.e. vm_id + vcpu_id), where GATAG =
|> AMD_IOMMU_GATAG(vm_id, vcpu_id).

Which is why you need to check that
AMD_IOMMU_GATAG(vm_id, vcpu_id) == entry->fields_vapic.ga_tag

The hashing function can map two different vm_id + vcpu_id to the same
bucket and hash_for_each_possible() would return both of them, but only
one belongs to the VCPU that we want to update.

(And shouldn't there be only one match?)

> +
> + if (!irte->lo.fields_vapic.guest_mode)
> + continue;
> +
> + update_irte_ga((struct irte_ga *)ir_data->ref,
> + ir_data->irq_2_irte.devid,
> + base, cpu, is_run);

|>> (The lookup leading up to here is avoidable -- svm, the caller, has the
|>> ability to map ga_tag into irte/ir_data directly with a pointer.
|>> I'm not sure if the lookup is slow enough to pardon optimization, but
|>> it might make the code simpler as well.)
|>
|> I might have mislead you up to this point. Not sure if the assumption here
|> still hold with my explanation above. Sorry for confusion.

SVM configures IOMMU with ga_tag, so IOMMU could return the pointer to
ir_data/irte that was just configured. SVM would couple it with a VCPU
(and hence a ga_tag) and when amd_iommu_update_ga() was needed, SVM
would pass the ir_data/irte pointer directly, instead of looking it up
though a ga_tag.