Re: [PATCH 2/2] block: adjust CFS request expire time

From: Zhaoyang Huang
Date: Tue Feb 20 2024 - 05:38:26 EST


On Tue, Feb 20, 2024 at 5:42 PM Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
>
> On Tue, Feb 20, 2024 at 02:15:42PM +0800, zhaoyang.huang wrote:
> > From: Zhaoyang Huang <zhaoyang.huang@xxxxxxxxxx>
> >
> > According to current policy, CFS's may suffer involuntary IO-latency by
> > being preempted by RT/DL tasks or IRQ since they possess the privilege for
> > both of CPU and IO scheduler.
>
> What is 'current policy', what is CFS, what is RT/DL? What privilege
> is possessed?
CFS and RT/DL are scheduler classes; of these, CFS has the lowest
priority for CPU time.
By 'current policy' I mean two things:
1. An RT task on the same core as a CFS task is favoured over it by
both the CPU scheduler and the IO scheduler (deadline, in this case).
Could we make CFS requests' expire_time earlier than it is now to
compensate?
2. In terms of when requests get inserted, preempted CFS tasks
involuntarily lose fairness compared with non-preempted CFS tasks.
Could we reduce this impact in some way?
>
> > 1. All types of sched class's load(util) are tracked and calculated in the
> > same way(using a geometric series which known as PELT)
> > 2. Keep the legacy policy by NOT adjusting rq's position in fifo_list
> > but only make changes over expire_time.
> > 3. The fixed expire time(hundreds of ms) is in the same range of cpu
> > avg_load's account series(the utilization will be decayed to 0.5 in 32ms)
>
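For reference, the 32ms figure in point 3 is the PELT half-life: each
period (roughly 1ms) multiplies the accumulated value by a factor y with
y^32 = 0.5. A minimal userspace sketch of that decay follows.
pelt_decay() and PELT_DECAY_Y are hypothetical names, and floating point
is used only for readability; the kernel itself uses fixed-point math.

```c
#include <assert.h>

/* Approximation of y where y^32 = 0.5, i.e. 2^(-1/32). */
#define PELT_DECAY_Y 0.97857206

/*
 * Decay a utilization value over `periods` PELT periods (one period is
 * roughly 1 ms). Illustrative only: not the kernel's implementation.
 */
static double pelt_decay(double util, unsigned int periods)
{
	unsigned int i;

	assert(util >= 0.0);
	for (i = 0; i < periods; i++)
		util *= PELT_DECAY_Y;
	return util;
}
```

With this, pelt_decay(1024.0, 32) comes out at about 512 and
pelt_decay(1024.0, 64) at about 256, so a contribution is nearly gone
after a few hundred ms, which is the range of the fixed expire times.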
> What problem does this fix, i.e. what performance number are improved
> or what other effects does it have?
I have tested this commit with benchmark tools such as fio and
Androbench and found neither a regression nor an improvement. Analysing
the log below [2], I find that CFS occupies most of the CPU most of the
time. Would it make more sense to do it as in [1], where the expire time
is adjusted only when CFS is preempted beyond a threshold?

[1]
- rq->fifo_time = jiffies + dd->fifo_expire[data_dir];
+ /* adjust expire time when CFS is preempted by more than 50% */
+ fifo_expire = cfs_prop_by_util(current, 100) < 50 ?
+         dd->fifo_expire[data_dir] :
+         cfs_prop_by_util(current, dd->fifo_expire[data_dir]);
+ rq->fifo_time = jiffies + fifo_expire;
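
To make the threshold idea concrete, here is a minimal userspace sketch.
It follows the comment in [1]: the expire interval is shortened only
once CFS's share of CPU util drops below the threshold. Note that the
ternary in [1] selects its two branches the other way round, so the
direction used here is an assumption. scale_by_cfs_share() and
pick_fifo_expire() are hypothetical names, and the util values are
passed in explicitly instead of being read from the task as
cfs_prop_by_util(current, ...) does.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for cfs_prop_by_util(): scale `val` by the
 * share of CPU utilization that belongs to CFS (cfs_util / total_util).
 */
static unsigned long scale_by_cfs_share(unsigned long cfs_util,
					unsigned long total_util,
					unsigned long val)
{
	assert(total_util > 0 && cfs_util <= total_util);
	return val * cfs_util / total_util;
}

/*
 * Pick the expire interval for a request. Assumed direction: shorten
 * the interval only when CFS's share is below threshold_pct, otherwise
 * keep the configured value.
 */
static unsigned long pick_fifo_expire(unsigned long cfs_util,
				      unsigned long total_util,
				      unsigned long base_expire,
				      unsigned int threshold_pct)
{
	if (scale_by_cfs_share(cfs_util, total_util, 100) < threshold_pct)
		return scale_by_cfs_share(cfs_util, total_util, base_expire);
	return base_expire;
}
```

For example, with a 50% threshold a CFS share of 40% turns a base of
1250 into 500, while a share of 90% leaves it at 1250.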

[2]
// prop is CFS's share of util in percent; it is mostly above 90 (90%) during common benchmark tests
kworker/u16:3-73 [000] ...1. 321.140143: dd_insert_request: dir 1,cfs 513, prop 91, orig_expire 1250, expire 1149
kworker/u16:3-73 [000] ...1. 321.140414: dd_insert_request: dir 1,cfs 513, prop 91, orig_expire 1250, expire 1149
kworker/u16:3-73 [000] ...1. 321.140505: dd_insert_request: dir 1,cfs 513, prop 91, orig_expire 1250, expire 1149
kworker/u16:3-73 [000] ...1. 321.140574: dd_insert_request: dir 1,cfs 513, prop 91, orig_expire 1250, expire 1149
kworker/u16:3-73 [000] ...1. 321.140630: dd_insert_request: dir 1,cfs 513, prop 91, orig_expire 1250, expire 1149
kworker/u16:3-73 [000] ...1. 321.140682: dd_insert_request: dir 1,cfs 513, prop 91, orig_expire 1250, expire 1149
kworker/u16:3-73 [000] ...1. 321.140736: dd_insert_request: dir 1,cfs 513, prop 91, orig_expire 1250, expire 1149
dd-7296 [006] ...1. 321.143139: dd_insert_request: dir 0,cfs 610, prop 92, orig_expire 125, expire 115
dd-7296 [006] ...1. 321.143287: dd_insert_request: dir 0,cfs 610, prop 92, orig_expire 125, expire 115
dd-7296 [004] ...1. 321.156074: dd_insert_request: dir 0,cfs 691, prop 97, orig_expire 125, expire 122
dd-7296 [004] ...1. 321.156202: dd_insert_request: dir 0,cfs 691, prop 97, orig_expire 125, expire 122
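
A note on reading the numbers in [2]: the expire column is not simply
orig_expire scaled by the truncated prop value, which suggests the
scaling works on the underlying util ratio. The sketch below compares
the two. Both function names are hypothetical, and the total util values
used in the note after it (558 and 706) do not appear in the log; they
are assumptions chosen because they reproduce both the prop and the
expire columns.

```c
#include <assert.h>

/* Scale val by an already truncated percentage, as "prop" might suggest. */
static unsigned long scale_by_percent(unsigned long val, unsigned int pct)
{
	assert(pct <= 100);
	return val * pct / 100;
}

/* Scale val by the raw util ratio, before any rounding to a percentage. */
static unsigned long scale_by_util(unsigned long val,
				   unsigned long cfs_util,
				   unsigned long total_util)
{
	assert(total_util > 0 && cfs_util <= total_util);
	return val * cfs_util / total_util;
}
```

scale_by_percent(1250, 91) gives 1137, not the logged 1149, whereas
scale_by_util(1250, 513, 558) gives 1149 and scale_by_util(100, 513, 558)
gives 91. Likewise scale_by_util(125, 691, 706) gives the logged 122,
while scale_by_percent(125, 97) gives 121.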

>
> > + * The expire time is adjusted via calculating the proportion of
> > + * CFS's activation among whole cpu time during last several
> > + * dazen's ms.Whearas, this would NOT affect the rq's position in
> > + * fifo_list but only take effect when this rq is checked for its
> > + * expire time when at head.
> > */
>
> Please speel check the comment and fix the formatting to have white
> spaces after sentences and never exceed 80 characters in block comments.
ok.
>