Re: [RFC PATCH 03/11] Drivers: hv: vmbus: Replace the per-CPU channel lists with a global array of channels

From: Vitaly Kuznetsov
Date: Mon Mar 30 2020 - 08:46:01 EST


Andrea Parri <parri.andrea@xxxxxxxxx> writes:

>> Correct me if I'm wrong, but currently vmbus_chan_sched() accesses
>> per-cpu list of channels on the same CPU so we don't need a spinlock to
>> guarantee that during an interrupt we'll be able to see the update if it
>> happened before the interrupt (in chronological order). With a global
>> list of relids, who guarantees that an interrupt handler on another CPU
>> will actually see the modified list?
>
> Thanks for pointing this out!
>
> The offer/resume path presents implicit full memory barriers, program
> -order after the array store which should guarantee the visibility of
> the store to *all* CPUs before the offer/resume can complete (c.f.,
>
> tools/memory-model/Documentation/explanation.txt, Sect. #13
>
> and assuming that the offer/resume for a channel must complete before
> the corresponding handler, which seems to be the case considered that
> some essential channel fields are initialized only later...)
>
> IIUC, the spin lock approach you suggested will work and be "simpler";
> an obvious side effect would be, well, a global synchronization point
> in vmbus_chan_sched()...
>
> Thoughts?

This is, of course, very theoretical as if we're seeing an interrupt for
a channel at the same time we're writing its relid we're already in
trouble. I can, however, try to suggest one tiny improvement:

vmbus_chan_sched() now clean the bit in the event page and then searches
for a channel with this relid; in case we allow the search to
(temporary) fail we can reverse the logic: search for the channel and
clean the bit only if we succeed. In case we fail, next time (next IRQ)
we'll try again and likely succeed. The only purpose is to make sure no
interrupts are ever lost. This may be an overkill, we may want to try
to count how many times (if ever) this happens.

Just a thought though.

--
Vitaly