[RFC PATCH] irq_fpu_usable returns false negatives with eagerfpu

From: Pekka Riikonen
Date: Tue May 07 2013 - 06:41:52 EST


Hi,

With the addition of eagerfpu the irq_fpu_usable() now returns false negatives especially in the case of ksoftirqd and interrupted idle task, two common cases for FPU use for example in networking. This is because of the eagerfpu check in interrupted_kernel_fpu_idle():

...
* For now, with eagerfpu we will return interrupted kernel FPU
* state as not-idle. TBD: Ideally we can change the return value
* to something like __thread_has_fpu(current). But we need to
* be careful of doing __thread_clear_has_fpu() before saving
* the FPU etc for supporting nested uses etc. For now, take
* the simple route!
...
if (use_eager_fpu())
return 0;

This should be fixed as eagerfpu is commonly "on" by default on those CPUs that also have the features like AES-NI.

The enclosed patch changes the eagerfpu check to return 1 in case the kernel_fpu_begin() has not been said yet. Once it has been the __thread_has_fpu() will start returning 0. The state is restored in kernel_fpu_end(). The state is thus always saved no matter what task is under us, unless it has been saved already with kernel_fpu_begin().

This does cause the penalty of unnecessary FPU saving in the cases of ksoftirqd and interrupted idle task but presumably it is cheaper than clts/stts. Still, there could also be separate checks for ksoftirqd (new check like in_ksoftirqd()) and idle task and then not save the state at all. As far as I can see it would always be safe. This is not in the patch.

I've tested this lightly, but it seems to work. Have I missed something?

Signed-off-by: Pekka Riikonen <priikone@xxxxxx>
---
diff --git a/arch/x86/kernel/i387.c b/arch/x86/kernel/i387.c
index 245a71d..cb33909 100644
--- a/arch/x86/kernel/i387.c
+++ b/arch/x86/kernel/i387.c
@@ -22,23 +22,19 @@
/*
* Were we in an interrupt that interrupted kernel mode?
*
- * For now, with eagerfpu we will return interrupted kernel FPU
- * state as not-idle. TBD: Ideally we can change the return value
- * to something like __thread_has_fpu(current). But we need to
- * be careful of doing __thread_clear_has_fpu() before saving
- * the FPU etc for supporting nested uses etc. For now, take
- * the simple route!
- *
* On others, we can do a kernel_fpu_begin/end() pair *ONLY* if that
* pair does nothing at all: the thread must not have fpu (so
* that we don't try to save the FPU state), and TS must
* be set (so that the clts/stts pair does nothing that is
* visible in the interrupted kernel thread).
+ *
+ * Except for the eagerfpu case when we return 1 unless we've already
+ * been eager and saved the state in kernel_fpu_begin().
*/
static inline bool interrupted_kernel_fpu_idle(void)
{
if (use_eager_fpu())
- return 0;
+ return __thread_has_fpu(current);

return !__thread_has_fpu(current) &&
(read_cr0() & X86_CR0_TS);
@@ -78,8 +74,8 @@ void __kernel_fpu_begin(void)
struct task_struct *me = current;

if (__thread_has_fpu(me)) {
- __save_init_fpu(me);
__thread_clear_has_fpu(me);
+ __save_init_fpu(me);
/* We do 'stts()' in __kernel_fpu_end() */
} else if (!use_eager_fpu()) {
this_cpu_write(fpu_owner_task, NULL);
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/