RE: [PATCH V2 1/1] x86/sgx: Add code to inject hwpoison into SGX memory

From: Thomas Tai
Date: Tue Oct 11 2022 - 08:53:23 EST


> -----Original Message-----
> From: HORIGUCHI NAOYA(堀口 直也) <naoya.horiguchi@xxxxxxx>
> Sent: October 7, 2022 2:34 AM
> To: Thomas Tai <thomas.tai@xxxxxxxxxx>
> Cc: tony.luck@xxxxxxxxx; dave.hansen@xxxxxxxxxxxxxxx; jarkko@xxxxxxxxxx;
> reinette.chatre@xxxxxxxx; linmiaohe@xxxxxxxxxx; akpm@linux-foundation.org;
> linux-mm@xxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH V2 1/1] x86/sgx: Add code to inject hwpoison into SGX
> memory
>
> On Wed, Sep 28, 2022 at 11:38:32AM -0400, Thomas Tai wrote:
> > Inspired by commit c6acb1e7bf46 ("x86/sgx: Add hook to error injection
> > address validation"), add similar code to the hwpoison_inject() function to
> > check whether the address is located in SGX memory. The error will then be
> > handled by the arch_memory_failure() function in the SGX driver.
> >
> > Signed-off-by: Thomas Tai <thomas.tai@xxxxxxxxxx>
>
> Thank you for sending the patch.
>
> > ---
> > Documentation/mm/hwpoison.rst | 44 +++++++++++++++++++++++++++++++++++
> > mm/hwpoison-inject.c | 4 ++++
> > 2 files changed, 48 insertions(+)
> >
> > diff --git a/Documentation/mm/hwpoison.rst b/Documentation/mm/hwpoison.rst
> > index b9d5253c1305..8a542aca4744 100644
> > --- a/Documentation/mm/hwpoison.rst
> > +++ b/Documentation/mm/hwpoison.rst
> > @@ -162,6 +162,50 @@ Testing
> >
> > Some portable hwpoison test programs in mce-test, see below.
> >
> > +* Special notes for injection into SGX enclaves
> > +
> > + 1) Determine physical address of enclave page
> > +
> > + dmesg | grep "sgx: EPC"
> > +
> > + sgx: EPC section 0x8000c00000-0x807f7fffff
> > + sgx: EPC section 0x10000c00000-0x1007fffffff
> > +
> > + 2) Convert the EPC address to page frame number.
> > +
> > + For 4K page size, the page frame number for 0x8000c00000 is
> > + 0x8000c00000 / 0x1000 = 0x8000c00.
> > +
> > + 3) Trace memory_failure
> > +
> > + echo nop > /sys/kernel/tracing/current_tracer
> > + echo *memory_failure > /sys/kernel/tracing/set_ftrace_filter
> > + echo function > /sys/kernel/tracing/current_tracer
> > +
> > + 4) Inject a memory error
> > +
> > + modprobe hwpoison-inject
> > + echo "0x8000c00" > /sys/kernel/debug/hwpoison/corrupt-pfn
> > +
> > + 5) Check the trace output
> > +
> > + cat /sys/kernel/tracing/trace
> > +
> > + # tracer: function
> > + #
> > + # entries-in-buffer/entries-written: 2/2 #P:128
> > + #
> > + # _-----=> irqs-off
> > + # / _----=> need-resched
> > + # | / _---=> hardirq/softirq
> > + # || / _--=> preempt-depth
> > + # ||| / _-=> migrate-disable
> > + # |||| / delay
> > + # TASK-PID CPU# ||||| TIMESTAMP FUNCTION
> > + # | | | ||||| | |
> > + bash-12167 [002] ..... 113.136808: memory_failure<-simple_attr_write
> > + bash-12167 [002] ..... 113.136810: arch_memory_failure<-memory_failure
>
> For other page types, memory_failure() leaves a kernel message such as
> "Memory failure: 0x10cf09: recovery action for free buddy page: Recovered",
> which is printed by action_result(). I think it would be better to follow
> this convention for SGX pages as well. Then you would not have to use ftrace
> to confirm the result of the error injection.

Hi Naoya,
Thanks for your suggestion. I will look into it.

Thomas

>
> Thanks,
> Naoya Horiguchi
>
> > +
> > References
> > ==========
> >
> > diff --git a/mm/hwpoison-inject.c b/mm/hwpoison-inject.c
> > index 65e242b5a432..bf83111c1d9b 100644
> > --- a/mm/hwpoison-inject.c
> > +++ b/mm/hwpoison-inject.c
> > @@ -21,6 +21,10 @@ static int hwpoison_inject(void *data, u64 val)
> > if (!capable(CAP_SYS_ADMIN))
> > return -EPERM;
> >
> > + /* Inject the error if the page is part of the processor reserved memory */
> > + if (arch_is_platform_page(pfn << PAGE_SHIFT))
> > + goto inject;
> > +
> > if (!pfn_valid(pfn))
> > return -ENXIO;
> >
> > --
> > 2.31.1
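
For reference, the address-to-PFN arithmetic in step 2 of the quoted
documentation can be sketched in shell. This is only an illustration and
not part of the patch: the helper name epc_to_pfn is hypothetical, and it
assumes a 4K page size (a page shift of 12), as the documentation text does.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): convert a physical address
# to a page frame number by shifting out the page offset bits.
# $1 = physical address, $2 = page shift (assumed 12, i.e. 4K pages).
epc_to_pfn() {
    printf '0x%x\n' "$(( $1 >> ${2:-12} ))"
}

# Start addresses of the two EPC sections from the dmesg output quoted above.
epc_to_pfn 0x8000c00000     # prints 0x8000c00
epc_to_pfn 0x10000c00000    # prints 0x10000c00
```

The first result matches the value written to corrupt-pfn in step 4 of the
quoted documentation.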