Re: performance of filesystem xattrs with Samba4

From: Hans Reiser
Date: Sat Nov 20 2004 - 21:40:31 EST


New benchmarks seem to be especially good at finding bugs.

vs, please find the bug and fix it.

Hans

tridge@xxxxxxxxx wrote:

Hans,

> mkfs.reiser4 -o extent=extent40

This lowered the performance by a small amount (from 52 MB/sec to 50
MB/sec).

It also revealed a bug. I have been doing my tests on a cleanly
formatted filesystem each time, but this time I re-ran the test a few
times in a row to determine just how consistent the results are. The
results I got were:

mkfs.reiser4 -o extent=extent40 50 MB/sec
48
43
41
37 (stuck)

the "stuck" result meant that smbd locked into a permanent D state at
the end of the fifth run. Unfortunately ps showed the wait-channel as
'-' so I don't have any more information about the bug. I needed to
power cycle the machine to recover.

To check if this is reproducable I tried it again and got the following:

reboot, mkfs again 50 MB/sec
48
44
42
40
(failed)

the "failed" on the sixth run was smbd stuck in D state again, this
time before the run completed so I didn't get a performance number.

I should note that the test completely wipes the directory tree
between runs, and the server processes restart, so the only way there
can be any state remaining that explains the slowdown between runs is
a filesystem bug. Do you think reiser4 could be "leaking" some on-disk
structures?

To determine if this problem is specific to the extent=extent40
option, I ran the same series of tests against reiser4 without the
extent option:

reboot, mkfs.reiser4 without options 52 MB/sec
52
45
41
(failed)

The failure on the fifth run showed the same symptoms as above.

To determine if the bug is specific to reiser4, I then ran the same
series of tests against ext3, using the same kernel:

reboot, mke2fs -j 70 MB/sec
70
69
70
71
70

So it looks like the gradual slowdown and eventual lockup is specific
to reiser4. What can I do to help you track this down? Would you like
me to write a "howto" for running this test, or would you prefer to
wait till I have an emulation of the test in dbench?

To give you an idea of the scales involved, each run lasts 100
seconds, and does approximately 1 million filesystem operations (the
exact number of operations completed in the 100 seconds is roughly
proportional to the performance result).



Ah, that explains a lot. For that kind of workload, the simpler the fs the better, because really all you are doing is adding overhead to copy_to_user and copy_from_user. All of reiser4's advanced features will add little or no value if you are staying in ram.


I'll do some runs with larger numbers of simulated clients and send
you those results shortly. Do you think a working set size of about
double the total machine memory would be a good size to start showing
the reiser4 features?

Cheers, Tridge





-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/