Re: 32-bit PTI with THP = userspace corruption

From: Joerg Roedel
Date: Tue Sep 11 2018 - 07:49:33 EST


Hi,

[
Andrea, maybe you can have a quick look here too, please? Maybe I am
overlooking a simple way to fix the issue. Problem description is
below.
]

On Sat, Sep 08, 2018 at 12:24:10PM +0200, Thomas Gleixner wrote:
> > I'll try to reproduce and work on a fix.
>
> Any progress on this?

Yes, but slower than I hoped because an infection sent me to bed for a
couple of days :/

So I can reproduce the issue, and the core problem is that with 32-bit
legacy paging the PGD level is also the huge-page level. With PTI this
means that we have two huge PTEs for every mapping (one in the kernel
page-table and one in the user page-table) and also two places where we
have to look for A/D bits. The problem now is that the kernel only looks
at the huge PTE in the kernel page-table when it evaluates A/D bits.
This causes data corruption when it misses an A/D bit that was set only
in the other copy.
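
To make the failure mode concrete, here is a minimal sketch in plain C.
It is a toy model and not kernel code: struct pti_huge_entry and both
helpers are hypothetical names invented for illustration, and the model
assumes that a userspace write sets the A/D bits only in the user
page-table copy of the huge entry. Only the bit positions are the
architectural x86 values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86 page-table entry bits. */
#define X86_PTE_ACCESSED (1u << 5)
#define X86_PTE_DIRTY    (1u << 6)
#define X86_PTE_PSE      (1u << 7)  /* 4 MiB page with 2-level paging */

/*
 * Hypothetical model: with PTI and 2-level paging, one huge mapping is
 * described by two top-level entries, one per page-table.
 */
struct pti_huge_entry {
	uint32_t kernel_copy;  /* entry in the kernel page-table */
	uint32_t user_copy;    /* entry in the user page-table */
};

/*
 * Assumption: userspace runs on the user page-table, so the hardware
 * sets A/D bits in that copy only.
 */
static inline void user_write(struct pti_huge_entry *e)
{
	assert(e);
	e->user_copy |= X86_PTE_ACCESSED | X86_PTE_DIRTY;
}

/* Models the check described above: only the kernel copy is consulted. */
static inline bool dirty_kernel_copy_only(const struct pti_huge_entry *e)
{
	assert(e);
	return (e->kernel_copy & X86_PTE_DIRTY) != 0;
}
```

In this model, calling user_write() on an entry and then
dirty_kernel_copy_only() reports the page as clean although it was
written, which is the kind of missed dirty bit that can lead to the
page contents being discarded.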

I had a look at the THP and the HugeTLBfs code, and this is not
really easy to fix there. As far as I can see, there are a few options
to fix it, but most of them are ugly:

1) Use Software A/D bits for 2-level legacy paging (ugly because
we need separate PAGE_* macros for that paging mode then)

2) Update all the places in THP and HugeTLBfs code that
evaluate A/D bits to take both PTEs into account (ugly too
for obvious reasons)

3) Disable THP and HugeTLBfs on 2-level paging kernels when PTI
is enabled (ugly because it breaks userspace expectations)

4) Disable PTI support on 2-level paging by making it dependent
on CONFIG_X86_PAE. This is, imho, the least ugly option
because the machines that do not support PAE are most likely
too old to be affected by Meltdown anyway. We might also
consider switching i386_defconfig to PAE?
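
For comparison, option 2 would roughly mean that every A/D evaluation
and every A/D update has to handle both copies. The following is a
minimal sketch of that idea as a toy model in plain C, not a proposed
patch: struct pti_huge_entry and all helper names are hypothetical, and
real code would also need the proper locking and TLB flushing that is
omitted here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86 page-table entry bits. */
#define X86_PTE_ACCESSED (1u << 5)
#define X86_PTE_DIRTY    (1u << 6)

/* Hypothetical model: one huge mapping, two top-level entries. */
struct pti_huge_entry {
	uint32_t kernel_copy;  /* entry in the kernel page-table */
	uint32_t user_copy;    /* entry in the user page-table */
};

/* Combine the A/D bits of both copies into one view. */
static inline uint32_t merged_ad_bits(const struct pti_huge_entry *e)
{
	return (e->kernel_copy | e->user_copy) &
	       (X86_PTE_ACCESSED | X86_PTE_DIRTY);
}

static inline bool huge_entry_dirty(const struct pti_huge_entry *e)
{
	return (merged_ad_bits(e) & X86_PTE_DIRTY) != 0;
}

static inline bool huge_entry_young(const struct pti_huge_entry *e)
{
	return (merged_ad_bits(e) & X86_PTE_ACCESSED) != 0;
}

/* Clearing a bit has to touch both copies as well. */
static inline void huge_entry_mkclean(struct pti_huge_entry *e)
{
	e->kernel_copy &= ~X86_PTE_DIRTY;
	e->user_copy &= ~X86_PTE_DIRTY;
}
```

The sketch also hints at why this option is ugly: each place that
reads or clears these bits would need such a two-copy variant.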

I am not a THP or HugeTLBfs expert and maybe I am overlooking a simpler
way to fix this issue. But as it stands now I am in favour of option
number 4.

Any other thoughts?

Thanks,

Joerg