Re: [PATCH 0/4] mm: Fix pmd_trans_unstable() call sites on retry

From: Yang Shi
Date: Wed Jun 07 2023 - 12:40:02 EST


On Wed, Jun 7, 2023 at 9:21 AM Peter Xu <peterx@xxxxxxxxxx> wrote:
>
> On Wed, Jun 07, 2023 at 05:45:28PM +0200, David Hildenbrand wrote:
> > On 07.06.23 15:49, Peter Xu wrote:
> > > On Fri, Jun 02, 2023 at 07:05:48PM -0400, Peter Xu wrote:
> > > > Please have a look, thanks.
> > >
> > > Hello, all,
> > >
> > > This one seems to have more or less conflict with Hugh's rework on pmd
> > > collapse. Please hold off review or merging until I prepare another one
> > > (probably based on Hugh's, after I have a closer read).
> > >
> > > Sorry for the noise.
> > >
> >
> > [did not have time to look yet]
> >
> > Are there any fixes buried in there that we'd like to have in earlier? I
> > skimmed over the patches and all read like "cleanup" + "consistency",
> > correct?
>
> There are bug fixes when unluckily hitting unstable pmd I think, these ones
> worth mentioning:
>
> - pagemap can be broken, causing read to be shifted over to the next
> (wrong data read)

Yes, it may corrupt the pagemap data. But anyway it seems like nobody
was busted by this one as you said.

>
> - memcg wrong accounting, e.g., moving one task from memcg1 to memcg2, we
> can skip an unstable pmd while it could quickly contain something that
> can belong to memcg1, I think. This one needs some eyes from memcg
> developers.

I don't think this is an important thing. There are plenty of other
conditions that could make the accounting inaccurate, for example,
isolating page from LRU fails, force charge, etc. And it seems like
nobody was bothered by this either.

>
> I don't rush on having them because these are all theoretical and no bug
> report I saw, no reproducer I wrote, only observed by my eyes.
>
> At least the pagemap issue should have been there for 10+ years without
> being noticed even if rightfully spot this time. Meanwhile this seems to
> have conflict with Hugh's series which should have been posted earlier - I
> still need to check on how that will affect this series, but not yet.
>
> Said that, let me know if any of you hit any (potential) issue with above
> or think that we should to move this in earlier.
>
> Thanks,
>
> --
> Peter Xu
>
>