Re: IRQ affinity problem from virtio_blk

From: Ming Lei
Date: Wed Nov 16 2022 - 06:50:18 EST


On Wed, Nov 16, 2022 at 11:43:24AM +0100, Thomas Gleixner wrote:
> On Tue, Nov 15 2022 at 18:36, Michael S. Tsirkin wrote:
> > On Wed, Nov 16, 2022 at 12:24:24AM +0100, Thomas Gleixner wrote:
> >> I just checked on a random VM. The PCI device as advertised to the guest
> >> does not expose that many vectors. One has 2 and the other 4.
> >>
> >> But as the interrupts are requested 'managed' the core ends up setting
> >> the vectors aside. That's a fundamental property of managed interrupts.
> >>
> >> Assume you have fewer queues than CPUs, which is the case with 2 vectors
> >> and tons of CPUs, i.e. one vector is used for config and the other for the
> >> actual queue. So the affinity spreading code will end up having the full
> >> cpumask for the queue vector, which is marked managed. And managed means
> >> that it's guaranteed e.g. in the CPU hotplug case that the interrupt can
> >> be migrated to a still online CPU.
> >>
> >> So we end up setting 79 vectors aside (one per CPU) in the case that the
> >> virtio device only provides two vectors.
> >>
> >> But that's not the end of the world as you really would need ~200 such
> >> devices to exhaust the vector space...
> >
> > Let's say we have 20 queues - then just 10 devices will exhaust the
> > vector space right?
>
> No.
>
> If you have 20 queues then the queues are spread out over the
> CPUs. Assume 80 CPUs:
>
> Then each queue is associated to 80/20 = 4 CPUs and the resulting
> affinity mask of each queue contains exactly 4 CPUs:
>
> q0: 0 - 3
> q1: 4 - 7
> ...
> q19: 76 - 79
>
> So this puts exactly 80 vectors aside, one per CPU.
>
> As long as at least one CPU of a queue mask is online the queue is
> enabled. If the last CPU of a queue mask goes offline then the queue is
> shut down, which means the interrupt associated with the queue is shut
> down too. That's all handled by the block MQ and the interrupt core. If a CPU
> of a queue mask comes back online then the guaranteed vector is
> allocated again.
>
> So it does not matter how many queues per device you have; it will
> reserve exactly ONE interrupt per CPU.
>
> Ergo you need 200 devices to exhaust the vector space.

Hi Thomas,

I am wondering why one interrupt needs to be reserved for each CPU. As I
understand it, in theory one queue needs only one irq, so would you mind
explaining the story a bit?
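
To check that I read your description correctly, here is a minimal sketch
of the accounting as I understand it. It assumes an even, contiguous split
of CPUs across queues and that each managed queue interrupt sets one vector
aside on every CPU in its mask. spread_queues() and reserved_vectors() are
hypothetical helpers written for this mail, not the kernel's spreading code,
which is also NUMA-aware and handles non-divisible counts.

```python
# Sketch only: an even, contiguous split of CPUs across queues.
# Both helpers are hypothetical and exist just to illustrate the counting.

def spread_queues(nr_cpus, nr_queues):
    """Return one list of CPU ids (the affinity mask) per queue."""
    if nr_queues <= 0 or nr_cpus % nr_queues != 0:
        raise ValueError("sketch assumes nr_cpus is a multiple of nr_queues")
    per_queue = nr_cpus // nr_queues
    return [list(range(q * per_queue, (q + 1) * per_queue))
            for q in range(nr_queues)]


def reserved_vectors(masks):
    """Assumed rule: one vector is set aside per CPU in each queue's mask."""
    return sum(len(mask) for mask in masks)


masks = spread_queues(80, 20)
for q in (0, 1, 19):
    print(f"q{q}: {masks[q][0]} - {masks[q][-1]}")
print("vectors set aside:", reserved_vectors(masks))
```

With 80 CPUs and 20 queues this reproduces the q0/q1/q19 masks quoted above
and counts 80 vectors. With a single queue, spread_queues(80, 1) gives one
mask covering all 80 CPUs, and the count is still 80, which is the part I
would like to understand better.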


Thanks,
Ming