Re: Asynch I/O gets easily overloaded on 2.2.15 and 2.3.99

From: Andrea Arcangeli (andrea@suse.de)
Date: Tue Apr 11 2000 - 09:16:36 EST


On Tue, 11 Apr 2000, Andi Kleen wrote:

>I was more thinking about lots of wakeups in get_request_wait causing
>the elevator to do much work (kupdate is single threaded, so there are
>only a few wakeups). With lots of threads calling ll_rw_block in
>parallel it may look different.

With the previous elevator code in 2.3.5x you are right: if there were no
available requests, the revalidation was quite expensive. However, in
2.3.99-prex I fixed that, and now the only slowdown we have in the
wait_for_request case (except for the wait_for_request itself, of course :)
is this:

                /* revalidate elevator */
                head = &q->queue_head;
                /* skip the request the driver may be working on */
                if (q->head_active && !q->plugged)
                        head = head->next;

and that's very fast indeed, and certainly not visible in any numbers.

Also, as you said, since I made the wakeup event wake-one (after checking
that every place that releases _1_ request also does _1_ wakeup), the
scenario with a huge number of readers shouldn't cause overscheduling
anymore. So basically I don't think the I/O layer is the bottleneck.
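
To make the wake-one point concrete, here is a minimal userspace sketch of
the idea using POSIX threads. It is only an analogy of the request
allocation path, not the kernel code, and all the names in it
(free_requests, request_lock and so on) are invented for illustration:

        #include <pthread.h>
        #include <stdio.h>

        /* userspace analogy of wake-one request allocation */
        static pthread_mutex_t request_lock = PTHREAD_MUTEX_INITIALIZER;
        static pthread_cond_t  request_free = PTHREAD_COND_INITIALIZER;
        static int free_requests;

        static void *get_request_wait(void *arg)
        {
                pthread_mutex_lock(&request_lock);
                while (free_requests == 0)
                        pthread_cond_wait(&request_free, &request_lock);
                free_requests--;
                pthread_mutex_unlock(&request_lock);
                printf("reader %ld got a request\n", (long) arg);
                return NULL;
        }

        static void release_request(void)
        {
                pthread_mutex_lock(&request_lock);
                free_requests++;
                /* wake-one: _1_ released request -> _1_ wakeup;
                 * a broadcast here would wake every reader just to
                 * put all but one of them back to sleep */
                pthread_cond_signal(&request_free);
                pthread_mutex_unlock(&request_lock);
        }

        int main(void)
        {
                pthread_t readers[4];
                long i;

                for (i = 0; i < 4; i++)
                        pthread_create(&readers[i], NULL,
                                       get_request_wait, (void *) i);
                for (i = 0; i < 4; i++)
                        release_request();
                for (i = 0; i < 4; i++)
                        pthread_join(readers[i], NULL);
                return 0;
        }

With pthread_cond_broadcast() in place of pthread_cond_signal() every
sleeping reader would get scheduled just to find the request already taken
and go back to sleep: that's exactly the overscheduling the wake-one change
avoids.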

>Hmm, shouldn't a lot of the profiling hit the final sti when the interrupts
>are enabled again?

You wouldn't be able to tell anymore whether the bottleneck is the
elevator, the request-merging code or get_request. Anyway, you're right
that you'd still be able to tell the request queue is too long, agreed ;).

However, the __sti() is only a few asm instructions before the return from
ll_rw_block, so depending on the details of the architecture the profiling
hit could even land outside the I/O layer.

>> Anyway I'm fairly confident that the profiler will show the real culprit
>> (I guess Jeff is queueing an insane number of buffers into the buffer
>> hashtable, and that is causing complexity troubles due to too many
>> collisions). If that's the case you'll see a huge number in the
>> get_hash_table entry in the profiling.
>>
>> Also, last time I checked, the buffer hash had been shrunk because in
>> 2.3.x the buffer cache isn't used for the data write I/O, but the raw
>> devices can still be used to read/write without a filesystem...
>
>Good point. inode hash is too big, buffer hash is too small ...

ihash is 16k buckets, and the icache can easily grow to around 16k entries
on machines with a good amount of memory. It should be made dynamic,
though.
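
For the record, the usual way to make it dynamic is to scale the bucket
count with the amount of physical memory, so the average chain length
stays near one. A rough sketch of that idea (illustrative only, none of
these names or ratios are the actual 2.3.x code):

        #include <stdio.h>

        /* illustrative only -- not the 2.3.x kernel code */
        static unsigned long ihash_buckets(unsigned long num_physpages)
        {
                unsigned long buckets = 1;

                /* one bucket per 16 pages of RAM, rounded up
                 * to a power of two */
                while (buckets < num_physpages / 16)
                        buckets <<= 1;
                return buckets;
        }

        int main(void)
        {
                /* e.g. 64k pages (256M with 4k pages) -> 4096 buckets */
                printf("%lu buckets\n", ihash_buckets(65536));
                return 0;
        }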

Andrea

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sat Apr 15 2000 - 21:00:16 EST