Re: i386 single-step vs int $0x80 issues

From: Jason Wessel
Date: Mon Apr 21 2008 - 14:02:00 EST


Roland McGrath wrote:
> Jason made a change, 1e2e99f0e4aa6363e8515ed17011c210c8f1b52a on 2007-7-6:
>
> i386: fix regression, endless loop in ptrace singlestep over an int80
>
> I'm trying to figure out what the full story behind that was. The
> log message includes source for a test program. I cannot reproduce
> anything like the problem described. I tried it when building the
> kernel sources from the state just before that commit, as well as
> the current kernel with that commit's patch reverted.
>
> The list traffic I found about this did not seem to say it was an
> intermittent problem. I really cannot understand how the failure
> mode described could have been happening (except in one racy way on
> SMP only, that I don't know how to provoke). The logic of the
> change is wrong IMHO, and it broke some cases that worked before it
> (stepping into sigreturn).


Certainly I am interested in making all the cases work correctly. The
failure behavior was observed on an SMP system. I re-tested to
confirm the problem was still there.
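
For anyone who wants to re-test, here is a minimal sketch of a harness that would notice an endless single-step loop. It is not the test program from the commit log. The helper name count_single_steps and the step limit are assumptions made for illustration, and the child uses syscall(SYS_getpid), which libc may or may not implement with int $0x80. On i386 an explicit int $0x80 asm statement could be substituted in the child.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original report): fork a child,
 * single-step it with ptrace until it exits, and return the number of
 * steps taken. Returns -1 on error or if max_steps is exceeded, which
 * is how an endless single-step loop would show up.
 */
long count_single_steps(long max_steps)
{
    int status;
    long steps = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: ask to be traced, then stop so the parent takes over. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(2);
        raise(SIGSTOP);
        /* One system call for the tracer to step across. */
        syscall(SYS_getpid);
        _exit(0);
    }

    /* Parent: wait for the child's initial SIGSTOP. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFSTOPPED(status))
        return -1;              /* child exited before stopping */

    for (;;) {
        /* Data argument 0: do not deliver the stop signal to the child. */
        if (ptrace(PTRACE_SINGLESTEP, pid, NULL, NULL) < 0)
            break;
        if (waitpid(pid, &status, 0) < 0)
            return -1;
        if (WIFEXITED(status))
            return WEXITSTATUS(status) == 0 ? steps : -1;
        if (WIFSIGNALED(status))
            return -1;
        if (++steps > max_steps)
            break;              /* looks like an endless loop */
    }

    /* Clean up a child that is still alive. */
    kill(pid, SIGKILL);
    waitpid(pid, &status, 0);
    return -1;
}
```

A caller would treat a -1 result with a generous limit (for example count_single_steps(1000000)) as a sign that stepping never got past the system call.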

>
> The description of the behavior of the test suggests it assumed
> that libc calls like write would use an int $0x80 syscall, which
> is not something you can rely on. I replaced the "write" call in
> the test with:
>
> asm volatile ("push %%ebx; mov %1,%%ebx; int $0x80; pop %%ebx"
> : "=a" (ret)
> : "g" (1), "a" (4), "c" (str), "d" (sizeof str - 1)
> : "ebx");
>
> But still I could not find any way to reproduce the failure mode
> that Jason's report described.
>
> The patch below and the comments it includes describe what's going
> on, why the 1e2e99f0... change was wrong, and revert it while fixing
> the one thing I saw wrong with Chuck's 635cf99a... change.
>
> But I'm not submitting this change now. Firstly, I really want to
> understand what it was that Jason saw and if there is some scenario
> here I have overlooked. Secondly, while doing this I realized there
> are some 32/64 differences in how all this handling works, and I
> think I'll rejigger it all some more to clean it up.
>
>

Certainly I'll sign off on a "tested-by" or "acked-by" header. I
tested your changes with the tip of the kernel tree on the same system
where I first saw the problem, and the problem no longer occurs.

Ideally the 32-bit and 64-bit handling can share more of the same logic.

Thanks,
Jason.
