Re: Performance regression in write() syscall

From: Andi Kleen
Date: Tue Feb 24 2009 - 04:51:27 EST


On Mon, Feb 23, 2009 at 06:03:04PM -0800, Salman Qazi wrote:
> While the introduction of __copy_from_user_nocache (see commit:
> 0812a579c92fefa57506821fa08e90f47cb6dbdd) may have been an improvement
> for sufficiently large writes, there is evidence to show that it is
> detrimental for small writes. Unixbench's fstime test gives the
> following results for 256 byte writes with MAX_BLOCK of 2000:

Do you have some data on where the cycles are spent?

In theory it should be neutral on small writes.

> @@ -192,14 +192,20 @@ static inline int __copy_from_user_nocache(void *dst, const void __user *src,
>  					   unsigned size)
>  {
>  	might_sleep();
> -	return __copy_user_nocache(dst, src, size, 1);
> +	if (likely(size >= PAGE_SIZE))
> +		return __copy_user_nocache(dst, src, size, 1);
> +	else
> +		return __copy_from_user(dst, src, size);

I think you disabled it completely: the kernel never really does
any copies larger than a page, because all its internal objects
are page sized only.

That check would need to be higher up the VFS stack (above filemap.c code)
before the copies are split up.
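
To illustrate the point, here is a rough userspace sketch, not kernel code.
It assumes a 4096-byte page and a write path that splits each write at page
boundaries before copying. All names in it (SKETCH_PAGE_SIZE, copy_stats,
split_check_per_chunk, split_check_per_write) are made up for the example.
It only counts which copy variant each chunk would get when the size test
sits at the copy, compared with a test on the full write length before the
split.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size, for illustration only. */
#define SKETCH_PAGE_SIZE 4096u

/* How many chunks would take each copy variant. */
struct copy_stats {
	unsigned cached;	/* ordinary copy */
	unsigned nocache;	/* non-temporal copy */
};

/* Length of the next chunk: up to the end of the current page. */
static size_t next_chunk(size_t pos, size_t len)
{
	size_t chunk = SKETCH_PAGE_SIZE - (pos % SKETCH_PAGE_SIZE);

	if (chunk > len)
		chunk = len;
	assert(chunk > 0 && chunk <= SKETCH_PAGE_SIZE);
	return chunk;
}

/* Size test at the copy itself: it only ever sees one chunk. */
static struct copy_stats split_check_per_chunk(size_t pos, size_t len)
{
	struct copy_stats st = { 0, 0 };

	while (len > 0) {
		size_t chunk = next_chunk(pos, len);

		if (chunk >= SKETCH_PAGE_SIZE)
			st.nocache++;
		else
			st.cached++;
		pos += chunk;
		len -= chunk;
	}
	return st;
}

/* Size test on the whole write, made once before the split. */
static struct copy_stats split_check_per_write(size_t pos, size_t len,
					       size_t threshold)
{
	struct copy_stats st = { 0, 0 };
	int use_nocache = len >= threshold;

	while (len > 0) {
		size_t chunk = next_chunk(pos, len);

		if (use_nocache)
			st.nocache++;
		else
			st.cached++;
		pos += chunk;
		len -= chunk;
	}
	return st;
}
```

In this sketch a 10000-byte write starting at offset 100 is split into
chunks of 3996, 4096 and 1908 bytes. The per-chunk test sends two of the
three through the ordinary copy even though the write is well above a page,
while the per-write test treats all three the same way. The threshold
parameter is a placeholder; the right cutoff would have to come from
measurements.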

-Andi


--
ak@xxxxxxxxxxxxxxx -- Speaking for myself only.