Re: [PATCH v2 1/2] seqlock: Do the lockdep annotation before locking in do_write_seqcount_begin_nested()

From: Michal Hocko
Date: Mon Jun 26 2023 - 09:16:16 EST


On Mon 26-06-23 21:27:05, Tetsuo Handa wrote:
> On 2023/06/26 20:35, Michal Hocko wrote:
> > On Mon 26-06-23 20:26:02, Tetsuo Handa wrote:
> >> On 2023/06/26 19:48, Peter Zijlstra wrote:
> >>> On Mon, Jun 26, 2023 at 06:25:56PM +0900, Tetsuo Handa wrote:
> >>>> On 2023/06/26 17:12, Sebastian Andrzej Siewior wrote:
> >>>>> On 2023-06-24 15:54:12 [+0900], Tetsuo Handa wrote:
> >>>>>> Why not do the same on the end side?
> >>>>>>
> >>>>>> static inline void do_write_seqcount_end(seqcount_t *s)
> >>>>>> {
> >>>>>> - seqcount_release(&s->dep_map, _RET_IP_);
> >>>>>> do_raw_write_seqcount_end(s);
> >>>>>> + seqcount_release(&s->dep_map, _RET_IP_);
> >>>>>> }
> >>>>>
> >>>>> I don't have a compelling argument for doing it. It is probably better
> >>>>> to release the lock from lockdep's point of view and then really release
> >>>>> it (so it can't be acquired before it is released).
> >>>>
> >>>> We must do it because this is a source of possible printk() deadlock.
> >>>> Otherwise, I will nack on PATCH 2/2.
> >>>
> >>> Don't be like that... just hate on printk like the rest of us. In fact,
> >>> I've been patching out the actual printk code for years because it's
> >>> unusable garbage.
> >>>
> >>> Will this actually still be a problem once all the fancy printk stuff
> >>> lands? That shouldn't do synchronous prints except to 'atomic' consoles
> >>> by default IIRC.
> >>
> >> Commit 1007843a9190 ("mm/page_alloc: fix potential deadlock on zonelist_update_seq
> >> seqlock") was applied to 4.14-stable trees, and CONFIG_PREEMPT_RT is available
> >> since 5.3. Thus, we want a fix which can be applied to 5.4-stable and later.
> >> This means that we can't count on all the fancy printk stuff being available.
> >
> > Is there any reason to backport RT specific fixup to stable trees? I
> > mean seriously, is there any actual memory hotplug user using
> > PREEMPT_RT? I would be more than curious to hear the usecase.
>
> Even if we don't backport RT specific fixup to stable trees, [PATCH 2/2] requires
> that [PATCH 1/2] guarantees that synchronous printk() never happens (for whatever
> reasons) between write_seqlock_irqsave(&zonelist_update_seq, flags) and
> write_sequnlock_irqrestore(&zonelist_update_seq, flags).

I suspect you are overcomplicating this. I do understand that you want
to have this 100% airtight but I would argue that this is actually not
really necessary. I would be perfectly fine living in the world where
this particular path could trigger an unintended printk. IIUC we are
mostly talking about the lockup detector only, right? AFAIK there is no
such issue _now_ so we are talking about a potential _risk_ only.

> If [PATCH 1/2] cannot guarantee it, [PATCH 2/2] will be automatically rejected.
>
> If [PATCH 2/2] cannot be applied, we have several alternatives.
>
> Alternative 1:
>
> Revert both commit 3d36424b3b58 ("mm/page_alloc: fix race condition between build_all_zonelists and page allocation")
> and commit 1007843a9190 ("mm/page_alloc: fix potential deadlock on zonelist_update_seq seqlock").
> I don't think this will happen, for nobody will be happy.
>
> Alternative 2:
>
> Revert commit 1007843a9190 ("mm/page_alloc: fix potential deadlock on zonelist_update_seq seqlock")
> and apply "mm/page_alloc: don't check zonelist_update_seq from atomic allocations" at
> https://lkml.kernel.org/r/dfdb9da6-ca8f-7a81-bfdd-d74b4c401f11@xxxxxxxxxxxxxxxxxxx .
> I think this is reasonable, for this reduces locking dependency. But Michal Hocko did not like it.
>
> Alternative 3:
>
> Somehow preserve printk_deferred_enter() => write_seqlock(&zonelist_update_seq) and
> write_sequnlock(&zonelist_update_seq) => printk_deferred_exit() pattern. Something like below?
>
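
For readers following the thread: the ordering that Alternative 3 names
can be modelled in userspace as below. This is only an illustrative
sketch and is not the code that followed in the original mail. Every
model_* function, the counters, and both update_* helpers are
hypothetical stand-ins; only the names printk_deferred_enter(),
printk_deferred_exit(), write_seqlock() and write_sequnlock() come from
the discussion, and none of the kernel's real implementations are
reproduced here.

```c
#include <assert.h>
#include <stdio.h>

/* Userspace model only: these are NOT the kernel's implementations. */
static int deferred_depth;        /* nesting level of the modelled deferral */
static unsigned int seq;          /* modelled sequence counter, odd = write in progress */
static int sync_prints_in_write;  /* synchronous prints seen while seq was odd */
static int deferred_msgs;         /* messages queued instead of printed */

/* Stand-in for printk_deferred_enter(): start deferring output. */
static void model_printk_deferred_enter(void)
{
    deferred_depth++;
}

/* Stand-in for printk_deferred_exit(): stop deferring output. */
static void model_printk_deferred_exit(void)
{
    assert(deferred_depth > 0);
    deferred_depth--;
}

/* Stand-in for printk(): queue when deferred, otherwise print at once. */
static void model_printk(const char *msg)
{
    if (deferred_depth > 0) {
        deferred_msgs++;
        return;
    }
    if (seq & 1u)
        sync_prints_in_write++; /* the case the thread is worried about */
    fputs(msg, stdout);
}

/* Stand-ins for write_seqlock()/write_sequnlock(): flip the counter parity. */
static void model_write_seqlock(void)   { seq++; }
static void model_write_sequnlock(void) { seq++; }

/* Deferral wraps the whole write section, as in Alternative 3. */
static void update_with_deferral(void)
{
    model_printk_deferred_enter();
    model_write_seqlock();
    model_printk("message raised inside the write section\n");
    model_write_sequnlock();
    model_printk_deferred_exit();
}

/* No deferral: a print inside the write section happens synchronously. */
static void update_without_deferral(void)
{
    model_write_seqlock();
    model_printk("message raised inside the write section\n");
    model_write_sequnlock();
}
```

In this model, update_with_deferral() never records a synchronous print
while the counter is odd, whereas update_without_deferral() does. It
says nothing about how likely such a print is in the real kernel, which
is the actual point of disagreement in this thread.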

Alternative 4:
Stop chasing shadows and deal with the fact that this code won't be
perfect. Seriously, you are trying to address a non-existent problem and
blocking a working RT solution which doesn't clutter the code with
RT-specific baggage.
--
Michal Hocko
SUSE Labs