Re: [PATCH] bdi: Fix another oops in wb_workfn()

From: Jan Kara
Date: Fri Jun 15 2018 - 08:06:30 EST


On Wed 13-06-18 07:33:15, Tejun Heo wrote:
> Hello, Jan.
>
> On Tue, Jun 12, 2018 at 05:57:54PM +0200, Jan Kara wrote:
> > > Yeah, right, so the root cause is that we're walking the wb_list while
> > > holding lock and expecting the object to stay there even after lock is
> > > released. Hmm... we can use a mutex to synchronize the two
> > > destruction paths. It's not like they're hot paths anyway.
> >
> > Hmm, do you mean like having a per-bdi or even a global mutex that would
> > protect whole wb_shutdown()? Yes, that should work and we could get rid of
> > WB_shutting_down bit as well with that. Just it seems a bit strange to
>
> Yeap.
>
> > introduce a mutex only to synchronize these two shutdown paths - usually
> > locks protect data structures and in this case we have cgwb_lock for
> > that so it looks like a duplication from a first look.
>
> Yeah, I feel a bit reluctant too but I think that's the right thing to
> do here. This is an inherently weird case where there are two ways
> that an object can go away with the immediate drain requirement from
> one side. It's not a hot path and the dumber the synchronization the
> better, right?

Yeah, fair enough. Something like the attached patch? It is indeed
considerably simpler than fixing the synchronization using WB_shutting_down.
This one has even had some testing with scsi_debug; next week I want to do
more testing with more cgroup writeback involved.
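
To illustrate the idea discussed above (this is NOT the attached patch, only
a minimal userspace sketch with pthreads), two teardown paths can be
serialized with one "dumb" mutex. All names here (fake_wb, shutdown_mutex,
fake_wb_shutdown, release_path, unregister_path, run_demo) are hypothetical
and assumed for illustration; none of them are kernel identifiers.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a writeback object; not kernel code. */
struct fake_wb {
	bool registered;     /* cleared by whichever path shuts the object down */
	int  shutdown_count; /* how many times the teardown body actually ran */
};

/* One mutex whose only job is to serialize the two shutdown paths. */
static pthread_mutex_t shutdown_mutex = PTHREAD_MUTEX_INITIALIZER;

static void fake_wb_shutdown(struct fake_wb *wb)
{
	pthread_mutex_lock(&shutdown_mutex);
	if (wb->registered) {
		wb->registered = false;
		/* Draining of pending work would happen here, under the mutex. */
		wb->shutdown_count++;
	}
	/*
	 * A caller that lost the race blocks on the mutex above, so by the
	 * time it returns the winner has completely finished the teardown.
	 */
	pthread_mutex_unlock(&shutdown_mutex);
}

/* Two independent ways the object can go away (assumed for illustration). */
static void *release_path(void *arg)
{
	fake_wb_shutdown(arg);
	return NULL;
}

static void *unregister_path(void *arg)
{
	fake_wb_shutdown(arg);
	return NULL;
}

/* Race both paths against one object; return how often teardown ran. */
int run_demo(void)
{
	struct fake_wb wb = { .registered = true, .shutdown_count = 0 };
	pthread_t a, b;

	pthread_create(&a, NULL, release_path, &wb);
	pthread_create(&b, NULL, unregister_path, &wb);
	pthread_join(a, NULL);
	pthread_join(b, NULL);

	assert(!wb.registered);
	return wb.shutdown_count;
}
```

Whichever thread takes the mutex first runs the teardown; the other waits and
then finds nothing left to do, so no separate "shutting down" state bit is
needed in this model. Build with -pthread and call run_demo() to try it.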

Honza
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR