Re: [PATCH] arm64/mm: adds soft dirty page tracking

From: Shivansh Vij
Date: Tue Mar 12 2024 - 18:32:39 EST


Hi David,

On Tue, Mar 12, 2024 at 09:22:25AM +0100, David Hildenbrand wrote:
> On 12.03.24 02:16, Shivansh Vij wrote:
>
> Hi,
>
> > Checkpoint-Restore in Userspace (CRIU) needs to be able
> > to track a memory page's changes if we want to enable
> > pre-dumping, which is important for live migrations.
> >
> > The PTE_DIRTY bit (defined in pgtable-prot.h) is already
> > used to track software dirty pages, and the PTE_WRITE and
> > PTE_READ bits are used to track hardware dirty pages.
> >
> > This patch enables full soft dirty page tracking
> > (including swap PTE support) for arm64 systems, and is
> > based very closely on the x86 implementation.
> >
> > It is based on an unfinished patch by
> > Bin Lu (bin.lu@xxxxxxx) from 2017
> > (https://patchwork.kernel.org/project/linux-arm-kernel/patch/1512029649-61312-1-git-send-email-bin.lu@xxxxxxx/),
> > but has been updated for newer 6.x kernels as well as
> > tested on various 5.x kernels.
>
> There has also been more recently:
>
> https://lore.kernel.org/lkml/20230703135526.930004-1-npache@xxxxxxxxxx/#r
>
> I recall that we are short on SW PTE bits:
>
> "
> So if you need software dirty, it can only be done with another software
> PTE bit. The problem is that we are short of such bits (only one left if
> we move PTE_PROT_NONE to a different location). The userfaultfd people
> also want such bit.
>
> Personally I'd reuse the four PBHA bits but I keep hearing that they may
> be used with some out of tree patches.
> "
>
> https://lore.kernel.org/lkml/ZLQIaSMI74KpqsQQ@xxxxxxx/

If I'm understanding the previous discussion
(https://patchwork.kernel.org/project/linux-arm-kernel/patch/20230703135526.930004-1-npache@xxxxxxxxxx/)
correctly, the core issue is that soft-dirty tracking does need a
dedicated SW PTE bit (like the PTE_SOFT_DIRTY in this patch), but free
PTE bits are heavily contended. So rather than claiming a new bit, it
would be better to reuse an existing one, possibly one of the PBHA bits
as suggested in the message you quoted.

Is my understanding correct?
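
For context, here is a minimal sketch of what the userspace side (e.g.
a CRIU pre-dump pass) reads back after writing 4 to
/proc/PID/clear_refs, whichever PTE bit ends up backing it. The bit
positions are the ones documented for /proc/PID/pagemap; the helper
names (pm_soft_dirty, pm_needs_redump) are made up for illustration and
are not taken from CRIU or from this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits of a 64-bit /proc/PID/pagemap entry, as documented in
 * Documentation/admin-guide/mm/pagemap.rst. */
#define PM_SOFT_DIRTY (UINT64_C(1) << 55) /* PTE is soft-dirty */
#define PM_SWAPPED    (UINT64_C(1) << 62) /* page is in swap */
#define PM_PRESENT    (UINT64_C(1) << 63) /* page is in RAM */

/* Hypothetical helper: was the page written since the last
 * "echo 4 > /proc/PID/clear_refs"? */
static bool pm_soft_dirty(uint64_t entry)
{
	return (entry & PM_SOFT_DIRTY) != 0;
}

/* Hypothetical helper: an incremental dump only has to copy pages that
 * are soft-dirty and actually backed by RAM or by a swap entry. The
 * swapped case is why the soft-dirty state must survive in swap PTEs. */
static bool pm_needs_redump(uint64_t entry)
{
	bool backed = (entry & (PM_PRESENT | PM_SWAPPED)) != 0;

	return backed && pm_soft_dirty(entry);
}
```

A real tool would pread() one such 8-byte entry per virtual page from
/proc/PID/pagemap and pass it to helpers like these.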

Thanks,
Shivansh