Re: seccomp ptrace selftest failures with 4.4-stable [Was: Re: LTS testing with latest kselftests - some failures]

From: Sumit Semwal
Date: Fri Jun 23 2017 - 00:03:27 EST


Hi Shuah,

On 23 June 2017 at 01:53, Shuah Khan <shuah@xxxxxxxxxx> wrote:
> Hi Tom,
>
> On 06/22/2017 01:48 PM, Tom Gall wrote:
>> Hi
>>
>> On Thu, Jun 22, 2017 at 2:06 PM, Shuah Khan <shuah@xxxxxxxxxx> wrote:
>>> On 06/22/2017 11:50 AM, Kees Cook wrote:
>>>> On Thu, Jun 22, 2017 at 10:49 AM, Andy Lutomirski <luto@xxxxxxxxxx> wrote:
>>>>> On Thu, Jun 22, 2017 at 10:09 AM, Shuah Khan <shuah@xxxxxxxxxx> wrote:
>>>>>> On 06/22/2017 10:53 AM, Kees Cook wrote:
>>>>>>> On Thu, Jun 22, 2017 at 9:18 AM, Sumit Semwal <sumit.semwal@xxxxxxxxxx> wrote:
>>>>>>>> Hi Kees, Andy,
>>>>>>>>
>>>>>>>> On 15 June 2017 at 23:26, Sumit Semwal <sumit.semwal@xxxxxxxxxx> wrote:
>>>>>>>>> 3. 'seccomp ptrace hole closure' patches got added in 4.7 [3] -
>>>>>>>>> feature and test together.
>>>>>>>>> - This one also seems like a security hole being closed, and the
>>>>>>>>> 'feature' could be a candidate for stable backports, but Arnd tried
>>>>>>>>> that, and it was quite non-trivial. So perhaps we'll need some help
>>>>>>>>> from the subsystem developers here.
>>>>>>>>
>>>>>>>> Could you please help us sort this out? Our goal is to help Greg with
>>>>>>>> testing stable kernels, and currently the seccomp tests fail due to
>>>>>>>> missing feature (seccomp ptrace hole closure) getting tested via
>>>>>>>> latest kselftest.
>>>>>>>>
>>>>>>>> If you feel the feature isn't a stable candidate, then could you
>>>>>>>> please help make the test degrade gracefully in its absence?
>>>
>>> In some cases, it is not easy to degrade gracefully or to check for a
>>> feature. Several security features probably fall into this bucket.
>>>
>>>>>>>
>>>>>>> I don't really want to have that change be a backport -- it's quite
>>>>>>> invasive across multiple architectures.
>>>
>>> Agreed. The same rule that applies to the kernel applies to the tests
>>> as well: if a kernel feature can't be backported, the test for that
>>> feature falls into the same bucket and shouldn't be backported either.
>>>
>>>>>>>
>>>>>>> I would say just add a kernel version check to the test. This is
>>>>>>> probably not the only selftest that will need such things. :)
>>>>>>
>>>>>> Adding release checks to selftests is going to be problematic for
>>>>>> maintenance. Tests should fail gracefully if a feature isn't supported
>>>>>> in older kernels.
>>>>>>
>>>>>> Several tests do that now, so please find a way to check for dependencies
>>>>>> and feature availability and fail the test gracefully. If there is a test
>>>>>> that can't do that for some reason, we can discuss it, but as a general
>>>>>> rule I don't want to see kselftest patches that check the release.
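
As an aside: for tests that can probe the feature at runtime, the
graceful-skip pattern could look something like the sketch below, rather
than a uname()-based release check. This is only an illustration; it
assumes the kselftest convention of exit code 4 (KSFT_SKIP) for skipped
tests, and the helper name is made up:

  #include <stdio.h>
  #include <stdlib.h>

  #define KSFT_SKIP 4  /* kselftest exit code for "test skipped" */

  /* Illustrative helper: skip when a probe says the running kernel
   * doesn't support the feature under test. */
  static void require_feature(int supported, const char *name)
  {
          if (!supported) {
                  printf("%s not supported by this kernel, skipping\n", name);
                  exit(KSFT_SKIP);
          }
  }

  int main(void)
  {
          require_feature(0, "example feature");  /* pretend the probe failed */
          return 0;
  }
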
>>>>>
>>>>> If a future kernel inadvertently loses the new feature and degrades to
>>>>> the behavior of old kernels, that would be a serious bug and should be
>>>>> caught.
>>>
>>> Agreed. If I understand you correctly, by not testing stable kernels
>>> with their own selftests, some serious bugs could go undetected.
>>
>> Personally I'm a bit skeptical. I think the reasoning is more that the
>> latest selftests provide more coverage, and therefore should be better
>> tests, even on older kernels.
>
> The assumption that "the latest selftests provide more coverage, and
> therefore should be better tests, even on older kernels." is incorrect.
>
> Selftests in general track kernel features. In some cases, new tests
> could be added that provide better coverage on older kernels; however,
> it is more likely that new tests are added to test new kernel features
> and enhancements to existing features. Given the second case,
> "enhancements to existing features", it is more important to test
> newer kernels with older selftests than the reverse. This does happen
> during kernel integration cycles in development.
>
> As a general rule, testing stable kernels with their own selftests will
> yield the best results.
>
I would have agreed totally if the selftests and the kernel had been in
sync since forever. But since kselftests are a comparatively recent
addition, the number of tests available for features that exist in LTS
kernels is really quite small. Just as a comparison, 4.4-LTS is missing
tests for bpf, cpufreq, gpio, media_tests, networking, and prctl, to
name a few.

Also, while running kselftests from later kernels against 4.4, we saw
only a few failures for existing features, while most other tests ran
fine. Just another data point.

>>
>>>>
>>>> Right. I really think stable kernels should be tested with their own
>>>> selftests. If some test is needed in a stable kernel it should be
>>>> backported to that stable kernel.
>>>
>>> Correct. This is always a safe option. There are even cases that can
>>> prevent tests from being built, especially when a new feature adds new
>>> fields to an existing structure.
>>>
>>> It appears that in some cases users want to run newer tests on older
>>> kernels. Some tests can clearly detect feature support from module
>>> presence and/or whether a Kconfig option is enabled. These conditions
>>> apply even on a kernel that supports the new module or config option:
>>> the kernel the test is running on might not have the feature enabled,
>>> or the module might not be present. In these cases it is easy to
>>> detect the situation and skip the test.
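
For the module/config-option cases, the detection really can be that
simple. A rough sketch (the module name is just a placeholder), checking
/sys/module, which exists for loaded modules and for built-in ones that
expose parameters:

  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>

  #define KSFT_SKIP 4

  /* Skip the test unless the named module is present. */
  static void require_module(const char *mod)
  {
          char path[256];

          snprintf(path, sizeof(path), "/sys/module/%s", mod);
          if (access(path, F_OK) != 0) {
                  printf("module %s not present, skipping\n", mod);
                  exit(KSFT_SKIP);
          }
  }

  int main(void)
  {
          require_module("x_tables");  /* placeholder dependency */
          printf("dependency present, running test\n");
          return 0;
  }
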
>>>
>>> However, some features aren't so easy to detect. For example:
>>>
>>> - a new flag is added to a syscall and a new test is added for it; it
>>> might not be easy to detect whether the kernel supports the flag
>>> (though see the probe sketch below).
>>> - we might have some tests that simply can't detect the feature and skip.
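
Some of the flag cases can be probed, for what it's worth: the
seccomp(2) man page documents that calling SECCOMP_SET_MODE_FILTER with
a NULL filter pointer fails with EFAULT when the flag is recognized and
EINVAL when it isn't. A sketch (the ptrace hole closure itself has no
such flag, which is exactly why it is hard to detect):

  #include <errno.h>
  #include <stdio.h>
  #include <unistd.h>
  #include <sys/syscall.h>
  #include <linux/seccomp.h>

  /* Returns 1 if the kernel recognizes this SECCOMP_SET_MODE_FILTER
   * flag: with a NULL filter a known flag gets past flag validation
   * and fails with EFAULT, while an unknown flag fails with EINVAL. */
  static int have_seccomp_flag(unsigned long flag)
  {
          return syscall(__NR_seccomp, SECCOMP_SET_MODE_FILTER,
                         flag, NULL) < 0 && errno == EFAULT;
  }

  int main(void)
  {
          printf("SECCOMP_FILTER_FLAG_TSYNC %s\n",
                 have_seccomp_flag(SECCOMP_FILTER_FLAG_TSYNC) ?
                 "supported" : "not supported");
          return 0;
  }
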
>>>
>>> Based on this discussion, it is probably accurate to say:
>>>
>>> 1. It is recommended that a kernel be run with selftests from the same
>>> release.
>>> 2. Selftests from newer kernels will run on older kernels, but users
>>> should understand the risks: some tests might fail, and they might not
>>> detect feature-degradation bugs.
>>> 3. Selftests will fail gracefully on older releases wherever possible.
>>
>> How about gracefully skipping instead of failing?
>
> Yes. That is the goal, and that is what tests do: they detect
> dependencies on features, modules, and config options and decide to
> skip the test. If a test doesn't do that, it gets fixed.
>
>>
>> The latter suggests the test case in some situations can detect that
>> it's pointless to run something and say as much, instead of emitting a
>> failure that would be a waste of time to look into.
>
> Right. Please see above. However, correctly detecting dependencies
> isn't possible in all cases; in some cases, failing is all a test can do.
>
>>
>> As another example, take tools/testing/selftests/net/psock_fanout.c.
>> On 4.9 it'll fail to compile (using master's selftests) because
>> PACKET_FANOUT_FLAG_UNIQUEID isn't defined. Add a simple #ifdef for
>> that symbol and the psock_fanout test will compile and run just fine.
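
For compile-time cases like that, the guard could look roughly like this
minimal sketch (not the actual psock_fanout patch; the flag comes from
the linux/if_packet.h uapi header):

  #include <stdio.h>
  #include <linux/if_packet.h>

  int main(void)
  {
  #ifdef PACKET_FANOUT_FLAG_UNIQUEID
          printf("PACKET_FANOUT_FLAG_UNIQUEID available, running test\n");
          /* ... exercise the unique-id fanout behaviour here ... */
  #else
          printf("PACKET_FANOUT_FLAG_UNIQUEID not defined, skipping\n");
  #endif
          return 0;
  }
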
>>
>>> Sumit!
>>>
>>> 1. What are the reasons for testing older kernels with selftests from
>>> newer kernels? What benefits do you see in doing so?
>>
>> I think the presumption is that the latest, greatest collection of
>> selftests is the best and most complete.
>
> Not necessarily the case.
>
>>
>>> I am looking to understand the need/reasons for this use-case. In our
>>> previous discussion on this subject, I did say you should be able to
>>> do so, with some exceptions.
>>>
>>> 2. Do you test kernels with the selftests from the same release?
>>
>> We have the ability to do either. The new shiny .... it calls.
>
> If the only reason is "shiny", I would say you might not be getting
> the best results possible.
>
>>
>>> 3. Do you find testing with newer selftests to be useful?
>>
>> I think it comes down to coverage and, again, the current perception
>> that latest-greatest is better. Quantitatively we haven't collected
>> data to support that position, though it would be interesting to
>> compare, say, a 4.4-LTS tree and its selftests directory to mainline,
>> see how much is new, and then find out how many of those new selftests
>> actually work on the older 4.4-LTS.
>>
>
> As I explained above, the assumption/perception that "the latest
> selftests provide more coverage, and therefore should be better tests,
> even on older kernels" is incorrect.
>
> As for collecting data to see whether newer selftests provide better
> coverage: that might or might not be a worthwhile exercise. Some
> releases might include tests for existing features and some might not;
> the mix differs. As a general rule, "selftests are intended to track,
> and do track, the features of their release" is a good assumption.
>
> Fixing tests from newer releases so they "never fail" on older releases
> might not give us the best ROI as a whole. These need to be evaluated
> on a case-by-case basis.
>
> Based on this discussion, and now that we understand that an incorrect
> assumption and/or misperception is the basis for choosing to test
> stable kernels with selftests from newer releases, I would recommend
> the following approach:
>
> 1. Testing stable kernels with their own selftests will yield the best
> results.
> 2. Testing stable kernels with newer selftests could be done if the
> user finds that it provides better coverage, knowing that there is no
> guarantee that it will.
>
> thanks,
> -- Shuah

Best,
Sumit.