Re: [PATCH 0/4] kdump: crashkernel reservation from CMA

From: Pingfan Liu
Date: Mon Nov 27 2023 - 21:10:15 EST


On Sun, Nov 26, 2023 at 5:24 AM Jiri Bohac <jbohac@xxxxxxx> wrote:
>
> Hi Tao,
>
> On Sat, Nov 25, 2023 at 09:51:54AM +0800, Tao Liu wrote:
> > Thanks for the idea of using CMA as part of memory for the 2nd kernel.
> > However I have a question:
> >
> > What if there is on-going DMA/RDMA access on the CMA range when 1st
> > kernel crash? There might be data corruption when 2nd kernel and
> > DMA/RDMA write to the same place, how to address such an issue?
>
> The crash kernel CMA area(s) registered via
> cma_declare_contiguous() are distinct from the
> dma_contiguous_default_area or device-specific CMA areas that
> dma_alloc_contiguous() would use to reserve memory for DMA.
>
> Kernel pages will not be allocated from the crash kernel CMA
> area(s), because they are not GFP_MOVABLE. The CMA area will only
> be used for user pages.
>
> User pages for RDMA, should be pinned with FOLL_LONGTERM and that
> would migrate them away from the CMA area.
>
> But you're right that DMA to user pages pinned without
> FOLL_LONGTERM would still be possible. Would this be a problem in
> practice? Do you see any way around it?
>

I do not have a real case in mind, but this problem has kept us from
using the CMA area in kdump for years. Most importantly, this method
would introduce bugs that are hard to track down.

As a workaround, maybe you could introduce a dedicated zone and, for
any GUP, migrate the pages out of it. I doubt that this approach is
worthwhile, though, considering the trade-off between benefits and
complexity.
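
To make the idea concrete, here is a toy userspace model in C. It is
not kernel code and not a proposed patch: the zone names, struct
toy_page, and every toy_* function are hypothetical, and "migration"
is reduced to relabeling the page's zone. It only illustrates the
invariant the workaround would have to maintain, namely that no pinned
page may stay in the crash kernel zone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: all names here are hypothetical, none are kernel APIs. */
enum toy_zone { ZONE_NORMAL_TOY, ZONE_CRASH_TOY };

struct toy_page {
    enum toy_zone zone; /* zone the page currently lives in */
    int pin_count;      /* number of outstanding pins (possible DMA targets) */
};

/*
 * Stand-in for page migration. A real kernel would allocate a new page
 * outside the zone, copy the contents, and update the mappings; this
 * model only relabels the zone.
 */
static void toy_migrate_out_of_crash_zone(struct toy_page *p)
{
    p->zone = ZONE_NORMAL_TOY;
}

/* Models "for any GUP, migrate the pages away": migrate first, then pin. */
static void toy_pin_page(struct toy_page *p)
{
    if (p->zone == ZONE_CRASH_TOY)
        toy_migrate_out_of_crash_zone(p);
    p->pin_count++;
}

/* Drops one pin; unpinning a page that is not pinned is a bug. */
static void toy_unpin_page(struct toy_page *p)
{
    assert(p->pin_count > 0);
    p->pin_count--;
}

/* The invariant: no page in the crash zone has an outstanding pin. */
static bool toy_crash_zone_is_dma_free(const struct toy_page *pages, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (pages[i].zone == ZONE_CRASH_TOY && pages[i].pin_count > 0)
            return false;
    return true;
}
```

The model hides exactly the part that worries me: in a real kernel
every pinning path would have to go through the migrate-first step,
and a single path that misses it breaks the invariant silently.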

Thanks,

Pingfan