Re: [PATCH 3/5] iommu/s390: Use RCU to allow concurrent domain_list iteration

From: Niklas Schnelle
Date: Fri Oct 21 2022 - 11:02:01 EST


On Fri, 2022-10-21 at 10:36 -0300, Jason Gunthorpe wrote:
> On Fri, Oct 21, 2022 at 02:08:02PM +0200, Niklas Schnelle wrote:
> > On Thu, 2022-10-20 at 08:05 -0300, Jason Gunthorpe wrote:
> > > On Thu, Oct 20, 2022 at 10:51:10AM +0200, Niklas Schnelle wrote:
> > >
> > > > Ok that makes sense thanks for the explanation. So yes my assessment is
> > > > still that in this situation the IOTLB flush is architected to return
> > > > an error that we can ignore. Not the most elegant I admit but at least
> > > > it's simple. Alternatively I guess we could use call_rcu() to do the
> > > > zpci_unregister_ioat() but I'm not sure how to then make sure that a
> > > > subsequent zpci_register_ioat() only happens after that without adding
> > > > too much more logic.
> > >
> > > This won't work either as the domain could have been freed before the
> > > call_rcu() happens, the domain needs to be detached synchronously
> > >
> > > Jason
> >
> > Yeah right, that is basically the same issue I was thinking of for a
> > subsequent zpci_register_ioat(). What about the obvious one. Just call
> > synchronize_rcu() before zpci_unregister_ioat()?
>
> Ah, it can be done, but be prepared to wait >> 1s for synchronize_rcu
> to complete in some cases.
>
> What you have seems like it could be OK, just deal with the ugly racy
> failure
>
> Jason

I'd tend to go with synchronize_rcu(). It won't leave us with spurious
error logs for the failed IOTLB flushes and as you said one expects
detach to be synchronous. I don't think waiting in it will be a
problem. But this is definitely something you're more of an expert on
so I'll trust your judgement. Looking at other callers of
synchronize_rcu(), quite a few of them seem to be in similar
detach/release situations, though I'm not sure how frequent and
performance-critical IOMMU domain detaching is in comparison.
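
To make the ordering concrete, here is a rough userspace sketch of the
detach sequence I have in mind. It is not the actual patch: every
function below is a stub standing in for the kernel call of similar
name, detach_dev_sketch() is a hypothetical name, and the three flags
only model state for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins; none of these are the real kernel implementations. */
static bool on_domain_list = true;     /* device still visible to list walkers */
static bool grace_period_done = false; /* pre-existing RCU readers have finished */
static bool ioat_registered = true;    /* translation tables still registered */

static void list_del_rcu_stub(void)
{
	/* New readers can no longer find the device on the domain list. */
	on_domain_list = false;
}

static void synchronize_rcu_stub(void)
{
	/* The real call blocks until all pre-existing RCU readers are done. */
	grace_period_done = true;
}

static void zpci_unregister_ioat_stub(void)
{
	/* No IOTLB flush may still be iterating over this device here. */
	assert(!on_domain_list && grace_period_done);
	ioat_registered = false;
}

/* Hypothetical detach path showing only the ordering of the three steps. */
static void detach_dev_sketch(void)
{
	list_del_rcu_stub();
	synchronize_rcu_stub();
	zpci_unregister_ioat_stub();
}
```

With this ordering a concurrent flush either finishes before the
unregister or never sees the device, so it should not hit the error
case, and a later zpci_register_ioat() cannot overtake the unregister
because detach only returns once it is done.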

Thanks,
Niklas