Re: [PATCH] maximize dispatching in block throttle

From: Hillf Danton
Date: Fri Dec 03 2010 - 09:32:00 EST


On Wed, Dec 1, 2010 at 10:41 PM, Vivek Goyal <vgoyal@xxxxxxxxxx> wrote:
> On Wed, Dec 01, 2010 at 09:30:00PM +0800, Hillf Danton wrote:
>> On Tue, Nov 30, 2010 at 10:57 PM, Vivek Goyal <vgoyal@xxxxxxxxxx> wrote:
>> > On Fri, Nov 26, 2010 at 10:46:01PM +0800, Hillf Danton wrote:
>> >> When dispatching bio, the quantum is divided into read/write budgets,
>> >> and dispatching for write could not exceed the write budget even if
>> >> the read budget is not exhausted, either dispatching for read.
>> >>
>> >> It is changed to exhaust the quantum, if possible, in this work for
>> >> dispatching bio.
>> >>
>> >> Though it is hard to understand that 50/50 division is not selected,
>> >> the difference between divisions could impact little on dispatching as
>> >> much as quantum allows then.
>> >>
>> >> Signed-off-by: Hillf Danton <dhillf@xxxxxxxxx>
>> >> ---
>> >
>> > Hi Hillf,
>> >
>> > Even if there are not enough READS/WRITES to consume the quantum, I don't
>> > think that it changes anyting much. The next dispatch round will be
>>
>> Why not exhaust quantum this round directly if not throttled?
>>
>
> Yes we can do that. I am wondering what do we gain by putting extra code
> in?
>
>> > scheduled almost immediately (If there are bios which are ready to
>> > be dispatched). Look at throtl_schedule_next_dispatch().
>> >
>> > Have you noticed some issues/improvements with this patch?
>>
>> Originally I wanted to get 75/25 division parameterized through sysfs or proc,
>> but I changed mind since exhausting quantum is simpler and much applicable.
>
> But these are two different things isn't it?
>
> We can introduce some sysfs tunables (if need be) so that a user can decide
> the division of READS/WRITES.
>
>>
>> As you see, the application environments change from one user to another,
>> even though are more latency sensitive.
>>
>> It looks nicer, I think, to provide both the default division and the
>> methods to
>> change for users to play their games.
>
> Yep, tunables can help here and introducing tunables is easy. At the same

If it is too hard, please change to what is more valuable.
Hillf

> time I prefer to intoroduce tunables only on need basis otherwise it
> runs the risk of too many tunables which are not being used.
>
> So, conceptually this patch looks fine to me. Jeff Moyer also raised the
> same question of why not fill up the quantum if READ/WRITE are not using
> their quota. So I think it does not hurt to take this patch in.
>
> Acked-by: Vivek Goyal <vgoyal@xxxxxxxxxx>
>
> Thanks
> Vivek
>
>>
>> Thanks
>> Hillf
>>
>> >
>> > Generally READS are more latency sensitive as compared to WRITE hence
>> > I thought of dispatching more READS per quantum.
>> >
>> > Thanks
>> > Vivek
>> >
>> >>
>> >> --- a/block/blk-throttle.c  Â2010-11-01 19:54:12.000000000 +0800
>> >> +++ b/block/blk-throttle.c  Â2010-11-26 21:49:00.000000000 +0800
>> >> @@ -647,11 +647,16 @@ static int throtl_dispatch_tg(struct thr
>> >> Â Â Â unsigned int max_nr_reads = throtl_grp_quantum*3/4;
>> >> Â Â Â unsigned int max_nr_writes = throtl_grp_quantum - nr_reads;
>> >> Â Â Â struct bio *bio;
>> >> + Â Â int read_throttled = 0, write_throttled = 0;
>> >>
>> >> Â Â Â /* Try to dispatch 75% READS and 25% WRITES */
>> >> -
>> >> + try_read:
>> >> Â Â Â while ((bio = bio_list_peek(&tg->bio_lists[READ]))
>> >> - Â Â Â Â Â Â && tg_may_dispatch(td, tg, bio, NULL)) {
>> >> + Â Â Â Â Â Â && ! read_throttled) {
>> >> + Â Â Â Â Â Â if (! tg_may_dispatch(td, tg, bio, NULL)) {
>> >> + Â Â Â Â Â Â Â Â Â Â read_throttled = 1;
>> >> + Â Â Â Â Â Â Â Â Â Â break;
>> >> + Â Â Â Â Â Â }
>> >>
>> >> Â Â Â Â Â Â Â tg_dispatch_one_bio(td, tg, bio_data_dir(bio), bl);
>> >> Â Â Â Â Â Â Â nr_reads++;
>> >> @@ -659,9 +664,15 @@ static int throtl_dispatch_tg(struct thr
>> >> Â Â Â Â Â Â Â if (nr_reads >= max_nr_reads)
>> >> Â Â Â Â Â Â Â Â Â Â Â break;
>> >> Â Â Â }
>> >> -
>> >> + Â Â if (! bio)
>> >> + Â Â Â Â Â Â read_throttled = 1;
>> >> + try_write:
>> >> Â Â Â while ((bio = bio_list_peek(&tg->bio_lists[WRITE]))
>> >> - Â Â Â Â Â Â && tg_may_dispatch(td, tg, bio, NULL)) {
>> >> + Â Â Â Â Â Â && ! write_throttled) {
>> >> + Â Â Â Â Â Â if (! tg_may_dispatch(td, tg, bio, NULL)) {
>> >> + Â Â Â Â Â Â Â Â Â Â write_throttled = 1;
>> >> + Â Â Â Â Â Â Â Â Â Â break;
>> >> + Â Â Â Â Â Â }
>> >>
>> >> Â Â Â Â Â Â Â tg_dispatch_one_bio(td, tg, bio_data_dir(bio), bl);
>> >> Â Â Â Â Â Â Â nr_writes++;
>> >> @@ -669,7 +680,23 @@ static int throtl_dispatch_tg(struct thr
>> >> Â Â Â Â Â Â Â if (nr_writes >= max_nr_writes)
>> >> Â Â Â Â Â Â Â Â Â Â Â break;
>> >> Â Â Â }
>> >> + Â Â if (! bio)
>> >> + Â Â Â Â Â Â write_throttled = 1;
>> >> +
>> >> + Â Â if (write_throttled && read_throttled)
>> >> + Â Â Â Â Â Â goto out;
>> >>
>> >> + Â Â if (! (throtl_grp_quantum > nr_writes + nr_reads))
>> >> + Â Â Â Â Â Â goto out;
>> >> +
>> >> + Â Â if (read_throttled) {
>> >> + Â Â Â Â Â Â max_nr_writes = throtl_grp_quantum - nr_reads;
>> >> + Â Â Â Â Â Â goto try_write;
>> >> + Â Â } else {
>> >> + Â Â Â Â Â Â max_nr_reads = throtl_grp_quantum - nr_writes;
>> >> + Â Â Â Â Â Â goto try_read;
>> >> + Â Â }
>> >> + out:
>> >> Â Â Â return nr_reads + nr_writes;
>> >> Â}
>> >> --
>> >> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> >> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> >> More majordomo info at Âhttp://vger.kernel.org/majordomo-info.html
>> >> Please read the FAQ at Âhttp://www.tux.org/lkml/
>> >
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/