Re: [PATCH] maximize dispatching in block throttle

From: Hillf Danton
Date: Fri Dec 03 2010 - 09:39:49 EST


On Fri, Dec 3, 2010 at 10:32 PM, Vivek Goyal <vgoyal@xxxxxxxxxx> wrote:
> On Fri, Dec 03, 2010 at 10:26:50PM +0800, Hillf Danton wrote:
>> On Wed, Dec 1, 2010 at 10:41 PM, Vivek Goyal <vgoyal@xxxxxxxxxx> wrote:
>> > On Wed, Dec 01, 2010 at 09:30:00PM +0800, Hillf Danton wrote:
>> >> On Tue, Nov 30, 2010 at 10:57 PM, Vivek Goyal <vgoyal@xxxxxxxxxx> wrote:
>> >> > On Fri, Nov 26, 2010 at 10:46:01PM +0800, Hillf Danton wrote:
>> >> >> When dispatching bio, the quantum is divided into read/write budgets,
>> >> >> and dispatching for write could not exceed the write budget even if
>> >> >> the read budget is not exhausted, either dispatching for read.
>> >> >>
>> >> >> It is changed to exhaust the quantum, if possible, in this work for
>> >> >> dispatching bio.
>> >> >>
>> >> >> Though it is hard to understand that 50/50 division is not selected,
>> >> >> the difference between divisions could impact little on dispatching as
>> >> >> much as quantum allows then.
>> >> >>
>> >> >> Signed-off-by: Hillf Danton <dhillf@xxxxxxxxx>
>> >> >> ---
>> >> >
>> >> > Hi Hillf,
>> >> >
>> >> > Even if there are not enough READS/WRITES to consume the quantum, I don't
>> >> > think that it changes anyting much. The next dispatch round will be
>> >>
>> >> Why not exhaust quantum this round directly if not throttled?
>> >>
>> >
>> > Yes we can do that. I am wondering what do we gain by putting extra code
>> > in?
>> >
>> >> > scheduled almost immediately (If there are bios which are ready to
>> >> > be dispatched). Look at throtl_schedule_next_dispatch().
>> >> >
>> >> > Have you noticed some issues/improvements with this patch?
>> >>
>> >> Originally I wanted to get 75/25 division parameterized through sysfs or proc,
>> >> but I changed mind since exhausting quantum is simpler and much applicable.
>> >
>> > But these are two different things isn't it?
>> >
>> > We can introduce some sysfs tunables (if need be) so that a user can decide
>> > the division of READS/WRITES.
>> >
>> >>
>> >> As you see, the application environments change from one user to another,
>> >> even though are more latency sensitive.
>> >>
>> >> It looks nicer, I think, to provide both the default division and the
>> >> methods to
>> >> change for users to play their games.
>> >
>> > Yep, tunables can help here and introducing tunables is easy. At the same
>>
>> If it is too hard, please change to what is more valuable.
>> Hillf
>
> It should not be too hard. IO schedulers already create
> /sys/block/<dev>/queue/iosched/ dir and we can create <dev>/queue/throttle/
> dir and export throttle related tunables there.

I will try along this direction, thanks.

>
> What I am not clear about is that what problem are you running into.
> Tunbales and your patch to fill the quantum are two different things. So
> how can both the things solve your problem. One of these should.

I could not work it out over one week. That is all.

Thanks again for sharing so much on the tunable.

Good weekend

Hillf

>
> So if you can explain the issue you are facing with some more details, it
> will help. I am not sure how your patch of filling the quantum with
> requests of other direction is helping you?
>
> Vivek
>
>>
>> > time I prefer to intoroduce tunables only on need basis otherwise it
>> > runs the risk of too many tunables which are not being used.
>> >
>> > So, conceptually this patch looks fine to me. Jeff Moyer also raised the
>> > same question of why not fill up the quantum if READ/WRITE are not using
>> > their quota. So I think it does not hurt to take this patch in.
>> >
>> > Acked-by: Vivek Goyal <vgoyal@xxxxxxxxxx>
>> >
>> > Thanks
>> > Vivek
>> >
>> >>
>> >> Thanks
>> >> Hillf
>> >>
>> >> >
>> >> > Generally READS are more latency sensitive as compared to WRITE hence
>> >> > I thought of dispatching more READS per quantum.
>> >> >
>> >> > Thanks
>> >> > Vivek
>> >> >
>> >> >>
>> >> >> --- a/block/blk-throttle.c  Â2010-11-01 19:54:12.000000000 +0800
>> >> >> +++ b/block/blk-throttle.c  Â2010-11-26 21:49:00.000000000 +0800
>> >> >> @@ -647,11 +647,16 @@ static int throtl_dispatch_tg(struct thr
>> >> >> Â Â Â unsigned int max_nr_reads = throtl_grp_quantum*3/4;
>> >> >> Â Â Â unsigned int max_nr_writes = throtl_grp_quantum - nr_reads;
>> >> >> Â Â Â struct bio *bio;
>> >> >> + Â Â int read_throttled = 0, write_throttled = 0;
>> >> >>
>> >> >> Â Â Â /* Try to dispatch 75% READS and 25% WRITES */
>> >> >> -
>> >> >> + try_read:
>> >> >> Â Â Â while ((bio = bio_list_peek(&tg->bio_lists[READ]))
>> >> >> - Â Â Â Â Â Â && tg_may_dispatch(td, tg, bio, NULL)) {
>> >> >> + Â Â Â Â Â Â && ! read_throttled) {
>> >> >> + Â Â Â Â Â Â if (! tg_may_dispatch(td, tg, bio, NULL)) {
>> >> >> + Â Â Â Â Â Â Â Â Â Â read_throttled = 1;
>> >> >> + Â Â Â Â Â Â Â Â Â Â break;
>> >> >> + Â Â Â Â Â Â }
>> >> >>
>> >> >> Â Â Â Â Â Â Â tg_dispatch_one_bio(td, tg, bio_data_dir(bio), bl);
>> >> >> Â Â Â Â Â Â Â nr_reads++;
>> >> >> @@ -659,9 +664,15 @@ static int throtl_dispatch_tg(struct thr
>> >> >> Â Â Â Â Â Â Â if (nr_reads >= max_nr_reads)
>> >> >> Â Â Â Â Â Â Â Â Â Â Â break;
>> >> >> Â Â Â }
>> >> >> -
>> >> >> + Â Â if (! bio)
>> >> >> + Â Â Â Â Â Â read_throttled = 1;
>> >> >> + try_write:
>> >> >> Â Â Â while ((bio = bio_list_peek(&tg->bio_lists[WRITE]))
>> >> >> - Â Â Â Â Â Â && tg_may_dispatch(td, tg, bio, NULL)) {
>> >> >> + Â Â Â Â Â Â && ! write_throttled) {
>> >> >> + Â Â Â Â Â Â if (! tg_may_dispatch(td, tg, bio, NULL)) {
>> >> >> + Â Â Â Â Â Â Â Â Â Â write_throttled = 1;
>> >> >> + Â Â Â Â Â Â Â Â Â Â break;
>> >> >> + Â Â Â Â Â Â }
>> >> >>
>> >> >> Â Â Â Â Â Â Â tg_dispatch_one_bio(td, tg, bio_data_dir(bio), bl);
>> >> >> Â Â Â Â Â Â Â nr_writes++;
>> >> >> @@ -669,7 +680,23 @@ static int throtl_dispatch_tg(struct thr
>> >> >> Â Â Â Â Â Â Â if (nr_writes >= max_nr_writes)
>> >> >> Â Â Â Â Â Â Â Â Â Â Â break;
>> >> >> Â Â Â }
>> >> >> + Â Â if (! bio)
>> >> >> + Â Â Â Â Â Â write_throttled = 1;
>> >> >> +
>> >> >> + Â Â if (write_throttled && read_throttled)
>> >> >> + Â Â Â Â Â Â goto out;
>> >> >>
>> >> >> + Â Â if (! (throtl_grp_quantum > nr_writes + nr_reads))
>> >> >> + Â Â Â Â Â Â goto out;
>> >> >> +
>> >> >> + Â Â if (read_throttled) {
>> >> >> + Â Â Â Â Â Â max_nr_writes = throtl_grp_quantum - nr_reads;
>> >> >> + Â Â Â Â Â Â goto try_write;
>> >> >> + Â Â } else {
>> >> >> + Â Â Â Â Â Â max_nr_reads = throtl_grp_quantum - nr_writes;
>> >> >> + Â Â Â Â Â Â goto try_read;
>> >> >> + Â Â }
>> >> >> + out:
>> >> >> Â Â Â return nr_reads + nr_writes;
>> >> >> Â}
>> >> >> --
>> >> >> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> >> >> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> >> >> More majordomo info at Âhttp://vger.kernel.org/majordomo-info.html
>> >> >> Please read the FAQ at Âhttp://www.tux.org/lkml/
>> >> >
>> >
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/