Re: [PATCH v1 1/5] mm/memory_hotplug: check for fatal signals only in offline_pages()

From: Michal Hocko
Date: Tue Jun 27 2023 - 10:08:06 EST


On Tue 27-06-23 15:28:29, David Hildenbrand wrote:
> On 27.06.23 14:34, Michal Hocko wrote:
> > On Tue 27-06-23 13:22:16, David Hildenbrand wrote:
> > > Let's check for fatal signals only. That looks cleaner and still keeps
> > > the documented use case for manual user-space triggered memory offlining
> > > working. From Documentation/admin-guide/mm/memory-hotplug.rst:
> > >
> > > % timeout $TIMEOUT offline_block | failure_handling
> > >
> > > In fact, we even document there: "the offlining context can be terminated
> > > by sending a fatal signal".
> >
> > We should be fixing documentation instead. This could break users who do
> > have a SIGALRM signal handler installed.
>
> You mean because timeout will send a SIGALRM, which is not considered fatal
> in case a signal handler is installed?

Correct.

> At least the "traditional" tools I am aware of don't set a timeout at all
> (crossing fingers that they never end up stuck):
> * chmem
> * QEMU guest agent
> * powerpc-utils
>
> libdaxctl also doesn't seem to implement an easy-to-spot timeout for memory
> offlining, but it also doesn't configure SIGALRM.
>
>
> Of course, that doesn't mean that there isn't somewhere a program that does
> that; I merely assume that it would be pretty unlikely to find such a
> program.
>
> But no strong opinion: we can also keep it like that, update the doc and add
> a comment why this one here is different than most other signal backoff
> checks.

Well, the existing signal handling approach has been there for way too long
to be sure. I personally would prefer fatal_signal_pending() as that better
reflects what we do elsewhere, but here we are. Historical baggage...
--
Michal Hocko
SUSE Labs