Re: [PATCH v2 33/39] x86/cpufeatures: Limit shadow stack to Intel CPUs

From: H. Peter Anvin
Date: Tue Oct 04 2022 - 17:18:26 EST


On October 4, 2022 1:50:20 PM PDT, Nathan Chancellor <nathan@xxxxxxxxxx> wrote:
>On Tue, Oct 04, 2022 at 08:34:54PM +0000, Edgecombe, Rick P wrote:
>> On Tue, 2022-10-04 at 14:43 -0500, John Allen wrote:
>> > On 10/4/22 10:47 AM, Nathan Chancellor wrote:
>> > > Hi Kees,
>> > >
>> > > On Mon, Oct 03, 2022 at 09:54:26PM -0700, Kees Cook wrote:
>> > > > On Mon, Oct 03, 2022 at 05:09:04PM -0700, Dave Hansen wrote:
>> > > > > On 10/3/22 16:57, Kees Cook wrote:
>> > > > > > On Thu, Sep 29, 2022 at 03:29:30PM -0700, Rick Edgecombe
>> > > > > > wrote:
>> > > > > > > Shadow stack is supported on newer AMD processors, but the
>> > > > > > > kernel
>> > > > > > > implementation has not been tested on them. Prevent basic
>> > > > > > > issues from
>> > > > > > > showing up for normal users by disabling shadow stack on
>> > > > > > > all CPUs except
>> > > > > > > Intel until it has been tested. At which point the
>> > > > > > > limitation should be
>> > > > > > > removed.
>> > > > > > >
>> > > > > > > Signed-off-by: Rick Edgecombe <rick.p.edgecombe@xxxxxxxxx>
>> > > > > >
>> > > > > > So running the selftests on an AMD system is sufficient to
>> > > > > > drop this
>> > > > > > patch?
>> > > > >
>> > > > > Yes, that's enough.
>> > > > >
>> > > > > I _thought_ the AMD folks provided some tested-by's at some
>> > > > > point in the
>> > > > > past. But, maybe I'm confusing this for one of the other
>> > > > > shared
>> > > > > features. Either way, I'm sure no tested-by's were dropped on
>> > > > > purpose.
>> > > > >
>> > > > > I'm sure Rick is eager to trim down his series and this would
>> > > > > be a great
>> > > > > patch to drop. Does anyone want to make that easy for Rick?
>> > > > >
>> > > > > <hint> <hint>
>> > > >
>> > > > Hey Gustavo, Nathan, or Nick! I know y'all have some fancy AMD
>> > > > testing
>> > > > rigs. Got a moment to spin up this series and run the selftests?
>> > > > :)
>> > >
>> > > I do have access to a system with an EPYC 7513, which does have
>> > > Shadow
>> > > Stack support (I can see 'shstk' in the "Flags" section of lscpu
>> > > with
>> > > this series). As far as I understand it, AMD only added Shadow
>> > > Stack
>> > > with Zen 3; my regular AMD test system is Zen 2 (probably should
>> > > look at
>> > > procurring a Zen 3 or Zen 4 one at some point).
>> > >
>> > > I applied this series on top of 6.0 and reverted this change then
>> > > booted
>> > > it on that system. After building the selftest (which did require
>> > > 'make headers_install' and a small addition to make it build beyond
>> > > that, see below), I ran it and this was the result. I am not sure
>> > > if
>> > > that is expected or not but the other results seem promising for
>> > > dropping this patch.
>> > >
>> > > $ ./test_shadow_stack_64
>> > > [INFO] new_ssp = 7f8a36c9fff8, *new_ssp = 7f8a36ca0001
>> > > [INFO] changing ssp from 7f8a374a0ff0 to 7f8a36c9fff8
>> > > [INFO] ssp is now 7f8a36ca0000
>> > > [OK] Shadow stack pivot
>> > > [OK] Shadow stack faults
>> > > [INFO] Corrupting shadow stack
>> > > [INFO] Generated shadow stack violation successfully
>> > > [OK] Shadow stack violation test
>> > > [INFO] Gup read -> shstk access success
>> > > [INFO] Gup write -> shstk access success
>> > > [INFO] Violation from normal write
>> > > [INFO] Gup read -> write access success
>> > > [INFO] Violation from normal write
>> > > [INFO] Gup write -> write access success
>> > > [INFO] Cow gup write -> write access success
>> > > [OK] Shadow gup test
>> > > [INFO] Violation from shstk access
>> > > [OK] mprotect() test
>> > > [OK] Userfaultfd test
>> > > [FAIL] Alt shadow stack test
>> >
>> > The selftest is looking OK on my system (Dell PowerEdge R6515 w/ EPYC
>> > 7713). I also just pulled a fresh 6.0 kernel and applied the series
>> > including the fix Nathan mentions below.
>> >
>> > $ tools/testing/selftests/x86/test_shadow_stack_64
>> > [INFO] new_ssp = 7f30cccc5ff8, *new_ssp = 7f30cccc6001
>> > [INFO] changing ssp from 7f30cd4c6ff0 to 7f30cccc5ff8
>> > [INFO] ssp is now 7f30cccc6000
>> > [OK] Shadow stack pivot
>> > [OK] Shadow stack faults
>> > [INFO] Corrupting shadow stack
>> > [INFO] Generated shadow stack violation successfully
>> > [OK] Shadow stack violation test
>> > [INFO] Gup read -> shstk access success
>> > [INFO] Gup write -> shstk access success
>> > [INFO] Violation from normal write
>> > [INFO] Gup read -> write access success
>> > [INFO] Violation from normal write
>> > [INFO] Gup write -> write access success
>> > [INFO] Cow gup write -> write access success
>> > [OK] Shadow gup test
>> > [INFO] Violation from shstk access
>> > [OK] mprotect() test
>> > [OK] Userfaultfd test
>> > [OK] Alt shadow stack test.
>>
>> Thanks for the testing. Based on the test, I wonder if this could be a
>> SW bug. Nathan, could I send you a tweaked test with some more debug
>> information?
>
>Yes, more than happy to help you look into this further!
>
>> John, are we sure AMD and Intel systems behave the same with respect to
>> CPUs not creating Dirty=1,Write=0 PTEs in rare situations? Or any other
>> CET related differences we should hash out? Otherwise I'll drop the
>> patch for the next version. (and assuming the issue Nathan hit doesn't
>> turn up anything HW related).

I have to admit to being a bit confused here... in general, we trust CPUID bits unless they are *known* to be wrong, in which case we blacklist them.

If AMD advertises the feature but it doesn't work or they didn't validate it, that would be a (serious!) bug on their part that we can address by blacklisting, but they should also fix with a microcode/BIOS patch.

What am I missing?