Re: [PATCH v1 1/1] lkdtm/stackleak: Make the stack erasing test more verbose

From: Kees Cook
Date: Mon Dec 30 2019 - 17:46:57 EST


On Tue, Dec 31, 2019 at 01:20:24AM +0300, Alexander Popov wrote:
> Hello Kees!
>
> On 30.12.2019 21:37, Kees Cook wrote:
> > On Thu, Dec 19, 2019 at 05:54:16PM +0300, Alexander Popov wrote:
> >> Make the stack erasing test more verbose about the errors that it
> >> can detect. BUG() in case of test failure is useful when the test
> >> is running in a loop.
> >
> > Hi! I try to keep the "success" conditions for LKDTM tests to be a
> > system exception, so doing "BUG" on a failure is actually against the
> > design. So, really, a test harness needs to know to check dmesg for the
> > results here. It almost looks like this check shouldn't live in LKDTM,
>
> Hm, I see...
>
> Let me explain why I've decided to use BUG() in case of a failure.
>
> Once upon a time I noticed that the stack erasing test failed on a kernel with
> KASAN enabled. It happened only once, and all my numerous efforts to reproduce
> it failed. That's why I came up with this patch. These changes provide additional
> information and allow easy detection of a failure when you run the test in a loop.
>
> Is the stackleak test the only exception of this kind in LKDTM?

Some of the refcount_t tests don't trigger a WARN(), and there are
related benchmarking tests that don't either.

> > but since it feels like other LKDTM tests, I'm happy to keep it there
> > for now.
>
> Do you mean that you will apply this patch?

Sorry for my confusing reply! I meant that I don't want to apply the
patch, but I'm fine with leaving the stackleak check in LKDTM.

However, if you want to split it out into its own test, I think that
should be fine; similar to lib/test_user_copy.c if you want it to stand
alone and have its own semantics, etc.

> > I'll resend my selftests series that adds a real test harness for all
> > the LKDTM tests and CC you.
>
> Ok!
>
> Maybe you also see how to improve the LKDTM infrastructure and remove this
> inconsistency. Could you share your ideas?

I don't, unfortunately. The real "difficulty" is that some of the
crashes are architecture-specific (e.g. how MMU traps are reported
across different architectures), so it's not too easy to consolidate
the reporting. As a result, I've taken to trying to do best-effort on
the test running side. I'll send what I've got...
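
To illustrate what "best-effort on the test running side" could look
like, here is a rough sketch, not the actual selftests code. It assumes
the harness has already captured the dmesg output as a string and only
checks whether any of several per-architecture report patterns shows up.
The function name and the pattern list are hypothetical and would need
adjusting per test and per architecture.

```python
import re

# Hypothetical list of report patterns; illustrative only. A real harness
# would likely keep a separate expectation for each LKDTM test.
EXPECTED_PATTERNS = [
    r"kernel BUG at",            # BUG() report
    r"Unable to handle kernel",  # fault wording seen on some architectures
    r"BUG: unable to handle",    # fault wording seen on others
]


def crash_reported(dmesg_text, patterns=EXPECTED_PATTERNS):
    """Return True if any expected pattern appears in the captured log."""
    return any(re.search(p, dmesg_text) for p in patterns)
```

Accepting any one of several patterns avoids having to consolidate the
reporting in the kernel itself, at the cost of a looser check.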

--
Kees Cook