Ingo Molnar <email@example.com> writes:
> actually the opposite is true, on a 2.2 GHz P4:
> $ ./lat_sig catch
> Signal handler overhead: 3.091 microseconds
> $ ./lat_ctx -s 0 2
> 2 0.90
> ie. *process to process* context switches are 3.4 times faster than signal
> delivery. Ie. we can switch to a helper thread and back, and still be
> faster than a *single* signal.

This is because the signal save/restore path does a lot of unnecessary work.

One optimization I implemented at one time was adding an SA_NOFP signal
bit that told the kernel that the signal handler did not intend
to modify floating point state (few signal handlers need FP). The kernel
would then skip saving the FPU state, which gave quite a speedup in signal
delivery.

Linux got a lot slower in signal delivery when the SSE2 support was
added. SA_NOFP got that speed back.

The targets were certain applications that use signal handlers for async

If there is interest I can dig up the old patches. They were really simple.
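
For anyone who wants to reproduce the kind of number quoted above, here is a rough sketch of a signal delivery micro-benchmark, with a marker showing where an application would opt in to the flag. It is not lat_sig itself and not the old patch: SA_NOFP does not exist in mainline headers, so it is defined as 0 (a no-op) purely for illustration, and the names signal_overhead_us and on_usr1 are made up for this example. The timing also includes the raise() call, so absolute numbers will differ from lat_sig.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <time.h>

/* ASSUMPTION: SA_NOFP is the flag proposed in this mail. It is not part
 * of mainline kernel headers, so it is defined as 0 (no effect) here. */
#ifndef SA_NOFP
#define SA_NOFP 0
#endif

static volatile sig_atomic_t caught;

/* Integer-only handler: it never touches floating point state, which is
 * the case the proposed flag was aimed at. */
static void on_usr1(int signo)
{
    (void)signo;
    caught++;
}

/* Average cost in microseconds of one raise() plus handler round trip,
 * or -1.0 if the handler could not be installed. */
double signal_overhead_us(int iterations)
{
    struct sigaction sa;
    struct timespec t0, t1;

    assert(iterations > 0);

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_NOFP;   /* hypothetical opt-in; a no-op as defined above */
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1.0;

    caught = 0;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < iterations; i++)
        raise(SIGUSR1);      /* single-threaded: handler runs before raise() returns */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    return ns / 1e3 / iterations;
}
```

Calling signal_overhead_us(100000) from main() and printing the result gives a per-signal figure that can be set against a context switch measurement such as lat_ctx.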

x86-64 also does it faster, by FXSAVE'ing directly into the user space
signal frame with exception handling instead of copying manually. But
that's not possible on i386 because it still has to use the baroque iBCS
FP context format on the stack.
This archive was generated by hypermail 2b29 : Wed Aug 07 2002 - 22:00:25 EST