Re: [GIT PULL] Ceph updates for 5.20-rc1

From: Jeff Layton
Date: Thu Aug 11 2022 - 17:09:32 EST


On Thu, 2022-08-11 at 22:55 +0200, Ilya Dryomov wrote:
> On Thu, Aug 11, 2022 at 10:04 PM Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > On Thu, Aug 11, 2022 at 8:25 AM Ilya Dryomov <idryomov@xxxxxxxxx> wrote:
> > >
> > > [..] Several patches
> > > touch files outside of our normal purview to set the stage for bringing
> > > in Jeff's long awaited ceph+fscrypt series in the near future. All of
> > > them have appropriate acks and sat in linux-next for a while.
> >
> > What? No.
> >
> > I'm looking at the fs/dcache.c change, for example, and don't see the
> > relevant maintainers having acked it at all.
> >
> > The mmdebug.h file similarly seems to not have the actual maintainer
> > acks, and seems just plain stupid (why does it add that new folio
> > warning macro, when the existing folio warning macro
> > VM_WARN_ON_ONCE_FOLIO() is *better*?
> >
> > Those are some very core files, and while the changes seem harmless,
> > they sure don't seem obviously ok.
> >
> > What's the point of warning about bogus folios more than once? That's
> > a debug warning - if it hits even once, that's already "uhhuh,
> > something is bad". Showing the warning more than once is likely just
> > going to cause more problems, not give you more information.
>
> Xiubo and Jeff used it to track down some issues between netfs library
> and folio code that have been randomly plaguing our automated tests for
> a couple of releases. We already knew that there were issues in that
> area and the actual occurrences mattered. This was done in cooperation
> with Willy and, since he was involved and this is a no-impact change,
> I didn't think twice.
>
> >
> > And did somebody verify that d_same_name() is still inlined in the
> > place that truly *matters*? Because from my quick test, that patch
> > broke it. Now __d_lookup() does a function call.
> >
> > And I _suspect_ it's all ok, because it turns out that
> > __d_lookup_rcu() is the *really* hot case, and that one has inlined it
> > all manually.
> >
> > But this kind of "we touch some *truly* core functionality, without
> > the acks from the maintainers, and then we *claim* to have relevant
> > acks" is really not even remotely ok.
>
> I raised the lack of a formal Acked-by from Al on the dcache change
> with Jeff a while ago and my understanding was that he reached out to
> Al and got the ack (after some ghosting on Al's behalf). I apologize
> if I got that wrong -- all this happened in the middle of Jeff
> transitioning his maintainership duties.
>

Actually, I never got a formal ack from Al. I did send it repeatedly,
but I assume he has been too busy to respond. We've had it sitting in
linux-next for a couple of months, and he did suggest that approach in
the first place, but I too would also prefer to see his official ack on
it.


> >
> > I've pulled this because I suspect that d_same_name() thing is fine,
> > and I think the VM_WARN_ON_FOLIO() addition is completely wrong but
> > not horrendous.
> >
> > But you're on my tentative shit-list just for having claimed to have
> > appropriate acks and having been found wanting.
> >
> > Just for your information: fs/dcache.c is some of the most optimized
> > code in the kernel, and some of the subtlest. That RCU pathname lookup
> > is serious business. You don't make changes to pathname lookup just
> > willy nilly. There's a reason I start looking at individual patches
> > when I see it in the diffstat.
>
> Understood.
>
> Thanks,
>
> Ilya

--
Jeff Layton <jlayton@xxxxxxxxxx>