Re: kvm splat in mmu_spte_clear_track_bits

From: Michal Hocko
Date: Wed Aug 30 2017 - 04:19:52 EST


On Tue 29-08-17 16:09:24, Andrea Arcangeli wrote:
[...]
> The other bug where you can reproduce the same corruption with OOM is
> unrelated and caused by the OOM reaper. OOM reaper was even corrupting
> data if a task was writing to disk and stuck in OOM in write() syscall
> or async io write.
>
> To fix the KVM corruption in the OOM reaper, it needs to call
> mmu_notifier_invalidate_start/end around
> oom_kill.c:unmap_page_range. This additional
> mmu_notifier_invalidate_start will not be good for the OOM reaper
> because it's yet another case (like the mmap_sem for writing) that
> will prevent the OOM reaper to run, so hindering its ability to hide
> XFS OOM deadlocks, and making those resurface. Not in KVM case because
> we use a spinlock to serialize against the secondary MMU activity and
> the KVM critical section under spinlock isn't going to allocate
> memory, but range_start can schedule or block on slow hardware where
> the secondary MMU is accessed through PCI (not KVM case).

I am not really familiar with mmu notifiers and what they can actually
do. But from what you wrote above it is indeed not very safe to call
them from the oom reaper. So I will prepare and post a patch to disable
the reaper when mm_has_notifiers().

Thanks for pointing this out.

--
Michal Hocko
SUSE Labs