Re: fio posixaio performance problem

From: Shaohua Li
Date: Thu Aug 04 2011 - 04:25:11 EST


On Thu, Aug 4, 2011 at 3:44 PM, Gui Jianfeng <guijianfeng@xxxxxxxxxxxxxx> wrote:
> On 2011-8-4 11:14, Shaohua Li wrote:
>> On Thu, Aug 4, 2011 at 10:00 AM, Gui Jianfeng <guijianfeng@xxxxxxxxxxxxxx> wrote:
>>> On 2011-8-4 8:53, Shaohua Li wrote:
>>>> 2011/8/4 Vivek Goyal <vgoyal@xxxxxxxxxx>:
>>>>> On Wed, Aug 03, 2011 at 11:45:33AM -0400, Vivek Goyal wrote:
>>>>>> On Wed, Aug 03, 2011 at 05:48:54PM +0800, Gui Jianfeng wrote:
>>>>>>> On 2011-8-3 16:22, Shaohua Li wrote:
>>>>>>>> 2011/8/3 Gui Jianfeng <guijianfeng@xxxxxxxxxxxxxx>:
>>>>>>>>> On 2011-8-3 15:38, Shaohua Li wrote:
>>>>>>>>>> 2011/8/3 Gui Jianfeng <guijianfeng@xxxxxxxxxxxxxx>:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I ran a fio test to simulate qemu-kvm io behaviour.
>>>>>>>>>>> When job number is greater than 2, IO performance is
>>>>>>>>>>> really bad.
>>>>>>>>>>>
>>>>>>>>>>> 1 thread: aggrb=15,129KB/s
>>>>>>>>>>> 4 thread: aggrb=1,049KB/s
>>>>>>>>>>>
>>>>>>>>>>> Kernel: latest upstream
>>>>>>>>>>>
>>>>>>>>>>> Any idea?
>>>>>>>>>>>
>>>>>>>>>>> ---
>>>>>>>>>>> [global]
>>>>>>>>>>> runtime=30
>>>>>>>>>>> time_based=1
>>>>>>>>>>> size=1G
>>>>>>>>>>> group_reporting=1
>>>>>>>>>>> ioengine=posixaio
>>>>>>>>>>> exec_prerun='echo 3 > /proc/sys/vm/drop_caches'
>>>>>>>>>>> thread=1
>>>>>>>>>>>
>>>>>>>>>>> [kvmio-1]
>>>>>>>>>>> description=kvmio-1
>>>>>>>>>>> numjobs=4
>>>>>>>>>>> rw=write
>>>>>>>>>>> bs=4k
>>>>>>>>>>> direct=1
>>>>>>>>>>> filename=/mnt/sda4/1G.img
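(For reference, a job file like the one above is just saved to a file and handed to fio; the name kvmio.fio below is illustrative, and root is needed so the exec_prerun drop_caches write succeeds.)

# save the two sections above as kvmio.fio, then run as root
fio kvmio.fio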
>>>>>>>>>> Hmm, the test always runs at about 15MB/s on my side, regardless of how many threads.
>>>>>>>>>
>>>>>>>>> CFQ?
>>>>>>>> yes.
>>>>>>>>
>>>>>>>>> what's the slice_idle value?
>>>>>>>> default value. I didn't change it.
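Since slice_idle comes up here: CFQ exposes its idle slice, along with the back-seek tunables discussed further down in this thread, through sysfs, so they can be checked at run time. A minimal sketch, assuming the disk under test is /dev/sda:

# confirm CFQ is the active scheduler for the device
cat /sys/block/sda/queue/scheduler
# current idle slice in milliseconds; the default is 8
cat /sys/block/sda/queue/iosched/slice_idle
# back-seek tunables referenced later in the thread
cat /sys/block/sda/queue/iosched/back_seek_max
cat /sys/block/sda/queue/iosched/back_seek_penalty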
>>>>>>>
>>>>>>> Hmm, I use a SATA disk, and can reproduce this bug every time...
>>>>>>
>>>>>> Do you have blktrace of run with 4 jobs?
>>>>>
>>>>> I can't reproduce it either. On my SATA disk a single thread gets around
>>>>> 23-24MB/s and 4 threads get around 19-20MB/s. Some of the throughput
>>>>> is lost to seeking, so that is expected.
>>>>>
>>>>> I think what you are trying to point out is an idling issue. In your
>>>>> workload every thread is doing sync-idle IO, so idling is enabled for
>>>>> each thread. On my system I see that the next thread preempts the
>>>>> currently idling thread because they are all doing IO in a nearby area
>>>>> of the file and rq_close() is true, hence preemption is allowed.
>>>>>
>>>>> On your system, I think rq_close() is somehow not true, hence preemption
>>>>> does not take place and we continue to idle on that thread. That by
>>>>> itself is not necessarily too bad, but it may be that the thread we are
>>>>> idling on is waiting for IO completion from some other thread before it
>>>>> can do more writes, due to some filesystem restriction, and that can
>>>>> lead to a sudden throughput drop. blktrace will give some idea.
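A minimal way to capture the trace Vivek is asking for while the 4-job run is in flight, assuming the test disk is /dev/sda and the blktrace/blkparse tools are installed:

# record the device for 30 seconds into kvmio-trace.blktrace.* files
blktrace -d /dev/sda -w 30 -o kvmio-trace
# decode the binary trace into human-readable per-request events
blkparse -i kvmio-trace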
>>>> With idling, the workload falls back to behaving like the one-thread
>>>> case; I wouldn't expect such a big reduction.
>>>> I saw some back seeks in the workload because we have rq_close()
>>>> preemption here. Is it possible the back seek penalty in the disk is big?
>>>
>>> Shaohua,
>>>
>>> what do you mean by "back seek penalty" here? AFAIK, the back seek penalty only
>>> applies when choosing the next request to serve. Does it have anything to do with
>>> the preemption logic?
>> Oh, not related, per your blktrace. So we have two problems here:
>> 1. fio doesn't dispatch a request within 8ms.
>> 2. no close-request preemption happens.
>
> Yes, these are the actual factors behind why performance is so bad.
>
>> Both look quite weird. Can you post a longer blktrace output, say
>> for one second? This piece is too short.
>
> Attached.
>
>> And do you have anything else running?
>
> No.
Looks like the system does one write every 8ms. This is quite wrong. Does the
posixaio engine have something wrong? Can you use a newer fio or try libaio,
please?
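A minimal variant of the job section that swaps in libaio for comparison; ioengine moves out of [global] here, and iodepth=1 is stated explicitly only to make clear that nothing but the engine changes:

[kvmio-1]
description=kvmio-1
numjobs=4
rw=write
bs=4k
direct=1
; swapped in from posixaio; depth 1 matches the original run's effective depth
ioengine=libaio
iodepth=1
filename=/mnt/sda4/1G.img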

Thanks,
Shaohua