RE: [EXTERNAL] Re: [PATCH v3 2/2] arm64: PCI: hv: Add support for Hyper-V vPCI

From: Sunil Muthuswamy
Date: Tue Nov 09 2021 - 14:59:46 EST


On Sunday, October 24, 2021 5:55 AM,
Marc Zyngier <maz@xxxxxxxxxx> wrote:

> > From: Sunil Muthuswamy <sunilmut@xxxxxxxxxxxxx>
> >
> > Add support for Hyper-V vPCI for ARM64 by implementing the arch specific
> > interfaces. Introduce an IRQ domain and chip specific to Hyper-V vPCI that
> > is based on SPIs. The IRQ domain parents itself to the arch GIC IRQ domain
> > for basic vector management.
> >
> > Signed-off-by: Sunil Muthuswamy <sunilmut@xxxxxxxxxxxxx>
> > ---
> > In v2 & v3:
> > Changes are described in the cover letter.
> >
> > +unsigned int hv_msi_get_int_vector(struct irq_data *irqd)
> > +{
> > + irqd = irq_domain_get_irq_data(hv_msi_gic_irq_domain, irqd->irq);
> > +
> > + return irqd->hwirq;
>
> Really??? Why isn't this just:
>
> return irqd->parent_data->hwirq;
>
> instead of reparsing the whole hierarchy?

Thanks, getting addressed in v4.

> > +static int hv_pci_vec_irq_domain_activate(struct irq_domain *domain,
> > + struct irq_data *irqd, bool reserve)
> > +{
> > + /* All available online CPUs are available for targeting */
> > + irq_data_update_effective_affinity(irqd, cpu_online_mask);
>
> This looks odd. Linux doesn't use 1:N distribution with the GIC, so
> the effective affinity of the interrupt never targets all CPUs.
> Especially considering that the first irq_set_affinity() call is going
> to reset it to something more realistic.
>
> I don't think you should have this at all, but I also suspect that you
> are playing all sort of games behind the scenes.

Thanks for the '1:N' comment. The reason for having this is that the Hyper-V
vPCI MSI compose code (i.e. 'hv_compose_msi_msg') needs some IRQ affinity
to pass to the hypervisor. On x86, the 'x86_vector_domain' takes care of
that in the 'x86_vector_activate' callback. But GICv3 doesn't implement an
'.activate' callback, so at the time of MSI composition there is no
affinity associated with the IRQ, which causes the Hyper-V MSI compose
message to fail. The idea behind the above was to have a temporary affinity
in place to satisfy the MSI compose message until the GIC resets the
affinity to something real. When the GIC resets the affinity, the 'unmask'
callback will cause the Hyper-V vPCI code to retarget the interrupt to the
'real' CPU.

In v4, I am changing the 'hv_pci_vec_irq_domain_activate' callback to
pick a CPU for the affinity in a round-robin fashion. That will stay in
effect until the GIC sets the real affinity and the vector gets retargeted.

> > +
> > + return 0;
> > +}
> > +
> > +static const struct irq_domain_ops hv_pci_domain_ops = {
> > + .alloc = hv_pci_vec_irq_domain_alloc,
> > + .free = hv_pci_vec_irq_domain_free,
> > + .activate = hv_pci_vec_irq_domain_activate,
> > +};
> > +
> > +int hv_pci_irqchip_init(struct irq_domain **parent_domain,
> > + bool *fasteoi_handler,
> > + u8 *delivery_mode)
> > +{
> > + static struct hv_pci_chip_data *chip_data;
> > + struct fwnode_handle *fn = NULL;
> > + int ret = -ENOMEM;
> > +
> > + chip_data = kzalloc(sizeof(*chip_data), GFP_KERNEL);
> > + if (!chip_data)
> > + return ret;
> > +
> > + mutex_init(&chip_data->map_lock);
> > + fn = irq_domain_alloc_named_fwnode("Hyper-V ARM64 vPCI");
> > + if (!fn)
> > + goto free_chip;
> > +
> > + hv_msi_gic_irq_domain = acpi_irq_create_hierarchy(0, HV_PCI_MSI_SPI_NR,
> > +                                                   fn, &hv_pci_domain_ops,
> > +                                                   chip_data);
> > +
> > + if (!hv_msi_gic_irq_domain) {
> > + pr_err("Failed to create Hyper-V ARM64 vPCI MSI IRQ domain\n");
> > + goto free_chip;
> > + }
> > +
> > + *parent_domain = hv_msi_gic_irq_domain;
> > + *fasteoi_handler = true;
> > +
> > + /* Delivery mode: Fixed */
> > + *delivery_mode = 0;
>
> I discussed this to death in the previous patch.

Thanks, getting fixed in v4 as part of the move to pci-hyperv.c

> > +
> > + return 0;
> > +
> > +free_chip:
> > + kfree(chip_data);
> > + if (fn)
> > + irq_domain_free_fwnode(fn);
> > +
> > + return ret;
> > +}
> > +EXPORT_SYMBOL(hv_pci_irqchip_init);
> > +
> > +void hv_pci_irqchip_free(void)
> > +{
> > + static struct hv_pci_chip_data *chip_data;
> > +
> > + if (!hv_msi_gic_irq_domain)
> > + return;
> > +
> > + /* Host data cannot be null if the domain was created successfully */
> > + chip_data = hv_msi_gic_irq_domain->host_data;
> > + irq_domain_remove(hv_msi_gic_irq_domain);
>
> No. Once an interrupt controller is enabled, it should never go away,
> because we have no way to ensure that all the corresponding interrupts
> are actually gone. Unless you can prove that at this stage, all
> devices are gone and cannot possibly generate any interrupt, this is
> actively harmful.

Thanks for the comment. Getting fixed in v4.

> >
> > @@ -1597,6 +1602,7 @@ static struct irq_chip hv_msi_irq_chip = {
> > .irq_compose_msi_msg = hv_compose_msi_msg,
> > .irq_set_affinity = hv_set_affinity,
>
> This really is irq_chip_set_affinity_parent.

Yes, but I didn't touch this because it is existing code. I am updating it
in v4 now.

> > .irq_ack = irq_chip_ack_parent,
> > + .irq_eoi = irq_chip_eoi_parent,
> > .irq_mask = hv_irq_mask,
> > .irq_unmask = hv_irq_unmask,
> > };
>
> Overall, please kill this extra module, move everything into
> pci-hyperv.c and drop the useless abstractions. Once you do that, the
> code will be far easier to reason about.
>

Thanks, yes, this is getting addressed in v4.

- Sunil